TF generate refactor - XLA sample#16713
Conversation
tests/gpt2/test_modeling_tf_gpt2.py
This test was pretty much the same as test_lm_generate_gpt2 (the only difference was the starting input_ids)
The documentation is not available anymore as the PR was closed or merged.
If I understand this correctly, if the user passes a seed tuple, the same seed is used on every sampling step for the entire generation run? That's not a problem, just making sure I got it right!
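For context, a minimal sketch of what "the same seed on every sampling step" means with TF's stateless ops (toy logits and a hypothetical loop — not the PR's actual code). With a constant seed, each draw is fully determined by the seed and the logits, so in a real generation run the outputs still vary step to step only because the logits change:

```python
import tensorflow as tf

logits = tf.math.log([[0.5, 0.3, 0.2]])  # toy next-token distribution
seed = (42, 0)  # a user-provided seed pair, reused on every step

tokens = []
for step in range(4):
    # The same `seed` is passed on each call; since the toy logits never
    # change, every step samples the exact same token.
    next_id = tf.random.stateless_categorical(logits, num_samples=1, seed=seed)
    tokens.append(int(next_id[0, 0]))
```

In real generation the logits at each step depend on the previously sampled tokens, so reusing the seed still yields varied (but reproducible) sequences.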
(nit) I'd prefer to move input_ids_length before model_kwargs -> kwargs or model_kwargs is usually the last function arg
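In other words, the suggestion is an argument ordering along these lines (hypothetical names, only illustrating the convention of keeping the catch-all dict last):

```python
import inspect

# Hypothetical signature sketch: input_ids_length moved before the
# catch-all model_kwargs, which stays as the final argument.
def sample(input_ids, seed=None, input_ids_length=None, model_kwargs=None):
    return input_ids

params = list(inspect.signature(sample).parameters)
# -> ['input_ids', 'seed', 'input_ids_length', 'model_kwargs']
```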
patrickvonplaten left a comment
Great! Awesome to see such a speed-up!
Looks good to me. The only thing that is not very intuitive to me is that seed is a list of integers - why is this? Is this normal in TF?
@patrickvonplaten it's because TF's stateless random ops take the seed as a pair of integers. If you think it will be unintuitive for users, I can change it so that our API takes a single integer instead.
I see - ok, maybe better to leave it as is then, to be aligned with TF.
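For reference, the two-integer format comes from TF itself: stateless RNG ops take the seed as a shape-`[2]` integer tensor, and a single integer is rejected. A minimal illustration:

```python
import tensorflow as tf

# Stateless ops require a pair of integers as the seed.
a = tf.random.stateless_uniform(shape=(3,), seed=[1, 2])
b = tf.random.stateless_uniform(shape=(3,), seed=[1, 2])
c = tf.random.stateless_uniform(shape=(3,), seed=[3, 4])

# Same seed pair -> identical draws; a different pair gives different draws.
```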
tests/t5/test_modeling_tf_t5.py
Small comment but "schöner" is correct here - I hope it's just sampling issues and not a sign of a bug that the XLA version gets it wrong!
Could you try, as a kind of stupid manual once-off test, asking it to translate a batch of sentences from English to Portuguese and make sure that even if they're different, the quality of the XLA ones is similar to the quality of the manual ones? Numerical bugs can be very annoying to catch, but if the quality is similar then that would make me confident that the XLA implementation is not worse.
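On the "numerical bugs are annoying to catch" point: one cheap invariant that can be checked automatically, alongside the manual quality comparison, is that the XLA-compiled sampling step is reproducible for a fixed seed. A toy sketch (standing in for the real generation loop, not the PR's code):

```python
import tensorflow as tf

@tf.function(jit_compile=True)
def xla_sample_step(logits, seed):
    # One XLA-compiled sampling step; stateless ops keep it reproducible.
    return tf.random.stateless_categorical(
        logits, num_samples=1, seed=seed, dtype=tf.int32
    )

logits = tf.math.log([[0.1, 0.2, 0.3, 0.4]])
seed = tf.constant([7, 0], dtype=tf.int32)

out1 = xla_sample_step(logits, seed)
out2 = xla_sample_step(logits, seed)
# Same seed, same sample - a silent numerical bug in the XLA path would
# degrade quality without ever crashing, which is why checks like this help.
```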
Rocketknight1 left a comment
Overall this looks great! I have a couple of very nitpicky nitpicks, mostly because I had a bug in my XLA implementation of greedy that I didn't notice for a long time because it only degraded the quality of sampling, so now I'm paranoid about catching bugs like that by checking output quality.
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
While running tests for T5 (as suggested by @Rocketknight1), I found that our XLA code is not behaving properly for T5, for both greedy_search and sample.

What does this PR do?
This PR brings XLA to sample in generate. Four important details before reviewing:
1. it is built on top of the XLA greedy_search work, and I will rebase as soon as the other PR gets merged (the changes were bundled to confirm that it passes all generate tests);
2. beam_search is not covered here;
3. sampling relies on stateless functions;
4. finally, tests have been run for the usual models (gpt2, t5, rag, speech2text, encoder_decoder, vision_encoder_decoder, bart).

I've also run a quick sanity check on GPU. Using GPT2 + sample, on an Nvidia T4:
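A generic version of that eager-vs-XLA timing harness might look like the sketch below (a toy stateless-sampling function with hypothetical sizes; the PR's actual benchmark used GPT2 via model.generate):

```python
import time
import tensorflow as tf

VOCAB = 128

def sample_tokens(logits, seed):
    # Stand-in for a full generation call: draw a batch of token samples.
    return tf.random.stateless_categorical(logits, num_samples=16, seed=seed)

# Same function, compiled with XLA.
xla_sample_tokens = tf.function(sample_tokens, jit_compile=True)

logits = tf.random.stateless_normal((8, VOCAB), seed=[0, 1])
seed = tf.constant([42, 0])

xla_sample_tokens(logits, seed)  # warm-up call triggers XLA compilation

start = time.perf_counter()
for _ in range(100):
    sample_tokens(logits, seed)
eager_s = time.perf_counter() - start

start = time.perf_counter()
for _ in range(100):
    xla_sample_tokens(logits, seed)
xla_s = time.perf_counter() - start

print(f"eager: {eager_s:.3f}s, xla: {xla_s:.3f}s")
```

Note that the warm-up call matters: the first XLA invocation pays the compilation cost, so it should be excluded from the timed loop.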