Tests: upgrade `test_eager_matches_sdpa_generate` (#34386)
gante merged 3 commits into huggingface:main
Conversation
tests/generation/test_utils.py
(this is mostly copy-paste, going to comment the sections that are changed)
tests/generation/test_utils.py
Uses `self.prepare_config_and_inputs_for_generate()` instead, which enables us to pass a dictionary of inputs to `generate` (better input control than simply using `inputs_dict[model_class.main_input_name]`)
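A toy illustration of why forwarding the full input dictionary matters (the `DummyModel` class and input names below are made up for this sketch, not actual transformers code):

```python
# Hypothetical sketch: a model whose generate() needs more than the main input.
class DummyModel:
    main_input_name = "input_ids"

    def generate(self, input_ids=None, attention_mask=None, **kwargs):
        # Record whether the auxiliary input actually arrived.
        return {"got_mask": attention_mask is not None}


inputs_dict = {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]}
model = DummyModel()

# Old style: only the main input is forwarded -> the attention mask is lost.
old = model.generate(inputs_dict[model.main_input_name])

# New style: the whole prepared dict is forwarded -> the mask arrives.
new = model.generate(**inputs_dict)
```

With the dict style, models that require extra inputs (attention masks, pixel values, etc.) are exercised the way they would be in real usage.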
tests/generation/test_utils.py
Uses dictionaries -> more compact
tests/generation/test_utils.py
flakiness handling as explained in the PR header
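The flakiness handling boils down to a retry decorator. A minimal sketch of what such a helper can look like (a hypothetical re-implementation for illustration, not the actual `transformers.testing_utils.is_flaky`):

```python
import functools


def is_flaky(max_attempts=5):
    """Rerun a flaky test up to max_attempts times; fail only if every
    attempt fails. Sketch only -- the real helper has more options."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_err = None
            for _ in range(max_attempts):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError as err:
                    last_err = err
            raise last_err
        return wrapper
    return decorator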
tests/generation/test_utils.py
With #34282 we won't have to init 2 models. Also, this is memory-hungry.
TBH we can do it one model at a time, going to change the test.
After Flex attention becomes the norm, we probably won't need this test
ydshieh
left a comment
This is sooooo great! ❤️
What does this PR do?
`test_eager_matches_sdpa_generate` has failed in our new failure reporting system (here, cc @ydshieh). Having a look at the test, the cause for flakiness was clear: we are using random models with `generate`, and tiny perturbations can result in a different sampled token, causing generation to go in a different direction and ultimately failing the check (eager generate == sdpa generate).

This PR:
- moves the test to `GenerationTesterMixin`, as it calls `generate`
- handles flakiness with `is_flaky()`

The following test commands were run:
- `RUN_SLOW=1 py.test tests/models/ -k test_eager_matches_sdpa_generate` ✅
- `RUN_SLOW=1 py.test tests/models/gpt2/test_modeling_gpt2.py::GPT2ModelTest::test_eager_matches_sdpa_generate --flake-finder --flake-runs 500` ✅
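A toy illustration of the divergence mechanism described above (pure Python, not transformers code): when two attention backends produce near-tied logits, a numerical difference on the order of 1e-6 is enough to flip the greedy token, after which the two generations diverge for good.

```python
def greedy_next(logits):
    # Pick the argmax token, as greedy decoding would.
    return max(range(len(logits)), key=lambda i: logits[i])


# Near-tied logits: the two backends agree to ~1e-6, but the argmax flips.
eager_logits = [0.500000, 0.499999, 0.1]  # e.g. eager attention output
sdpa_logits = [0.499999, 0.500000, 0.1]   # e.g. sdpa output, off by ~1e-6
```

Here `greedy_next(eager_logits)` and `greedy_next(sdpa_logits)` pick different tokens, so a strict eager-vs-sdpa generation check fails even though both backends are numerically correct — hence the retry-based flakiness handling.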