Refactoring and bug fixing beam search generate #3135
Conversation
# stop when there is a </s> in each sentence, or if we exceed the maximum length
if unfinished_sents.max() == 0:
    break

cur_len = cur_len + 1
think it's always good to have cur_len += 1 as the last statement in the loop!
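For illustration, a toy sketch of the loop shape being suggested (variable names borrowed from the snippet above, toy values, not the library code): all per-step work and stopping checks come first, and the counter bump is the loop's final statement, so every check inside the body sees a consistent cur_len.

```python
import torch

max_length = 5
cur_len = 1
# 1 = sentence still generating, 0 = finished (toy values)
unfinished_sents = torch.ones(2, dtype=torch.long)

while cur_len < max_length:
    # ... per-step work: sample next tokens, append them to input_ids ...
    if cur_len == 3:  # pretend both sentences hit </s> at step 3
        unfinished_sents[:] = 0

    # stop when there is a </s> in each sentence
    if unfinished_sents.max() == 0:
        break

    cur_len = cur_len + 1  # last statement in the loop
```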
@@ -996,6 +997,9 @@ def _generate_beam_search(
    # Compute next scores
    next_scores = torch.gather(_scores, -1, next_tokens)  # (batch_size, num_beams * 2)

    # sort the sampled vector to make sure that the first num_beams samples are the best
I added this small sort function for the following reason: we are sampling 2 * num_beams samples per batch_idx and always take the first num_beams of those samples. I think the first num_beams samples that we take should then also correspond to the best num_beams of the 2 * num_beams sampled (e.g. the best three out of six when num_beams = 3).
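For context, a minimal sketch of the fix under discussion (shapes and names assumed to mirror the snippet above, not a verbatim excerpt): after drawing 2 * num_beams candidates per batch item, both scores and tokens are sorted so that the first num_beams entries are also the best of the draw.

```python
import torch

batch_size, num_beams, vocab_size = 2, 3, 11
# accumulated log-probs, flattened over beams: (batch_size, num_beams * vocab_size)
_scores = torch.randn(batch_size, num_beams * vocab_size).log_softmax(-1)

# sampling path: draw 2 * num_beams candidate tokens per batch item
probs = _scores.softmax(-1)
next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)  # (batch_size, num_beams * 2)
next_scores = torch.gather(_scores, -1, next_tokens)               # (batch_size, num_beams * 2)

# sort the sampled vector to make sure that the first num_beams samples are the best
next_scores, next_scores_indices = torch.sort(next_scores, descending=True, dim=1)
next_tokens = torch.gather(next_tokens, -1, next_scores_indices)
```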
I'm fine with that, it goes with the beam search philosophy indeed (a bit pushed to the extreme when you're sampling anyway)
    if eos_token_ids is not None and token_id.item() in eos_token_ids:
        generated_hyps[batch_idx].add(
-           input_ids[batch_idx * num_beams + beam_id, :cur_len].clone(), score.item(),
+           input_ids[effective_beam_id].clone(), score.item(),
cur_len parameter is useless here: input_ids already has exactly cur_len columns at this point, so the :cur_len slice selects the whole row anyway.
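A toy check of that claim (assumed setup, not the library code): during generation input_ids holds one row per (batch_idx, beam_id) pair and already has exactly cur_len columns, so the slice is a no-op.

```python
import torch

batch_size, num_beams, cur_len = 2, 3, 4
# one row per (batch_idx, beam_id) pair, cur_len tokens generated so far
input_ids = torch.arange(batch_size * num_beams * cur_len).view(
    batch_size * num_beams, cur_len
)

batch_idx, beam_id = 1, 2
effective_beam_id = batch_idx * num_beams + beam_id
assert torch.equal(
    input_ids[effective_beam_id, :cur_len],  # old indexing
    input_ids[effective_beam_id],            # new indexing
)
```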
# stop when we are done with each sentence
if all(done):
    break

# update current length
cur_len = cur_len + 1
think it's always good to have cur_len += 1 as the last statement in the loop!
generated_hyps[batch_idx].add(
    input_ids[batch_idx * num_beams + beam_id, :cur_len].clone(), score.item()
)
# test that beam scores match previously calculated scores if not eos and batch_idx not done
This refactoring was taken from bart's generate() function (@sshleifer). I think it's much cleaner: it shows clearly that at the end the current best 3 (first 3) open hypotheses are added to the generated hypotheses. Also, we take the final scores from the variable beam_scores here instead of next_scores (as before), which is the "correct" variable to take the scores from, since it holds the most recently updated accumulated scores. An assert statement verifying that the beam_scores are correctly calculated is also added, making sure that this logic will not be broken by future changes. @thomwolf @LysandreJik
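A rough sketch of that finalization step (variable names assumed from the diff above; per the code comment, the real code also skips the assert when an EOS token was just sampled, a guard omitted here for brevity): for every batch item that is not done, the first num_beams open hypotheses are added with their accumulated beam_scores, after asserting those scores agree with next_scores.

```python
import torch

def finalize_open_hypotheses(generated_hyps, input_ids, beam_scores,
                             next_scores, done, batch_size, num_beams):
    for batch_idx in range(batch_size):
        if done[batch_idx]:
            continue

        # beam_scores holds the running accumulated scores, so for still-open
        # hypotheses it must agree with the freshly gathered next_scores
        assert torch.all(
            next_scores[batch_idx, :num_beams]
            == beam_scores.view(batch_size, num_beams)[batch_idx]
        )

        # add the current best num_beams (i.e. first num_beams) open
        # hypotheses to the generated hypotheses
        for beam_id in range(num_beams):
            effective_beam_id = batch_idx * num_beams + beam_id
            generated_hyps[batch_idx].add(
                input_ids[effective_beam_id].clone(),
                beam_scores[effective_beam_id].item(),
            )
```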
Good to merge for me
Perfect!
This PR cleans the beam_search decoding part of language generation. It simplifies the code and fixes a small bug for do_sample=True (see comments in code).
It was also tested on all language generation slow tests.
Future PR