
[tests] Stricter generate + compilation test -- no recompilations allowed#37629

Merged
gante merged 5 commits into huggingface:main from gante:compilation_tests_no_recompilation
Apr 22, 2025

Conversation


@gante gante commented Apr 19, 2025

What does this PR do?

Follow-up to #37447

This PR upgrades test_generate_compile_model_forward to catch recompilation issues. This is done by (a) activating recompilation logs and (b) failing the test if recompilation messages appear in those logs. The improved test would have failed with the changes that broke gemma 2/3 + compile 🤗

In the process, a few extra things were standardized in the test, and a few skips were removed.
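The test pattern described above — enable recompilation logging, run generation, then fail if any recompile message was captured — can be sketched in a stdlib-only form. This is a minimal illustration of the pattern, not the actual test: the logger name and message text here are hypothetical stand-ins (in recent PyTorch versions, recompilation logging can be enabled via facilities such as `torch._logging.set_logs(recompiles=True)`).

```python
import logging

def run_with_recompile_check(fn, logger_name="compile.recompiles"):
    """Run `fn` while capturing log records from `logger_name`,
    then fail if any captured record mentions a recompilation."""
    records = []

    class _Capture(logging.Handler):
        def emit(self, record):
            records.append(record.getMessage())

    logger = logging.getLogger(logger_name)
    handler = _Capture()
    old_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)  # ensure recompile messages are emitted
    try:
        result = fn()
    finally:
        logger.removeHandler(handler)
        logger.setLevel(old_level)

    recompiles = [m for m in records if "Recompiling" in m]
    assert not recompiles, f"Unexpected recompilations: {recompiles}"
    return result
```

The key point of the stricter test is the final assertion: a compiled forward that silently recompiles on later calls still "passes" a plain correctness check, so the logs are the only place the regression shows up.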

@gante gante requested a review from zucchini-nlp April 19, 2025 13:24
@github-actions

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@github-actions github-actions bot marked this pull request as draft April 19, 2025 13:24
@gante gante requested a review from Cyrilvallez April 19, 2025 13:24
@gante gante marked this pull request as ready for review April 19, 2025 13:25
else:
# if the 4D causal mask exists, it should be present in the base model (XXXModel class) or in its decoder.
base_model = getattr(self, self.base_model_prefix, self)
decoder = base_model.get_decoder() if hasattr(base_model, "get_decoder") else None
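The lookup in the diff above follows a simple pattern: resolve the base model via `base_model_prefix` (falling back to the model itself), then probe it for a decoder. A toy sketch of that pattern, using hypothetical dummy classes in place of real transformers models:

```python
class DummyDecoder:
    """Stand-in for a decoder submodule."""

class DummyBaseModel:
    """Stand-in for an XXXModel class that owns the decoder."""
    def __init__(self):
        self.decoder = DummyDecoder()

    def get_decoder(self):
        return self.decoder

class DummyForCausalLM:
    """Stand-in for a head model wrapping the base model."""
    base_model_prefix = "model"

    def __init__(self):
        self.model = DummyBaseModel()

    def resolve_decoder(self):
        # same pattern as the diff above: fall back to `self` if the
        # base-model attribute is missing, then probe for `get_decoder`
        base_model = getattr(self, self.base_model_prefix, self)
        return base_model.get_decoder() if hasattr(base_model, "get_decoder") else None
```

Falling back to `self` in the `getattr` call is what makes the same code work for both head models (which wrap a base model) and bare base models.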
@gante (Contributor Author)

this fixes the compilation test for opt and helps with decoder-only whisper

@gante
Copy link
Contributor Author

gante commented Apr 19, 2025

cc @manueldeprada (since you mentioned you are interested in torch.compile)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp (Member) left a comment

Thanks for adding the test! So this means all VLMs are actually compiling without any breaks, right? I am still wondering about the slowdowns we experienced when compiling certain models.

One question about the test: it guards for torch>=2.6. Are the runners installing the latest version? It would be nice to get this test running on each PR, to avoid accidentally breaking gemma again.

Comment on lines +2094 to +2096
# TODO (joao, raushan): do we need a custom `generate` in these models? can we call `super().generate`, as
# opposed to the inner model's `generate`? If yes, we would get a more standardized codebase
if "blip" in model.__class__.__name__.lower():
@zucchini-nlp (Member)

No, blip cannot be fixed anymore; we risk breaking it for existing hub repos 😢 I had a branch somewhere, but it didn't work

And it's not worth fixing; we can even skip compile tests for blip if it's making our life harder

@gante (Contributor Author)

haha okay I'll remove the TODO 😢 (I'm a bit sad on the inside)


gante commented Apr 21, 2025

@zucchini-nlp

> So this means all VLMs are actually compiling without any breaks, right? I am still wondering about the slowdowns we experienced when compiling certain models.

Yes, they are compilable without graph breaks. That by itself is useful: it means they can probably be exported with torch.export 💪 We'll have to profile to understand why we don't observe speedups as large as in LLMs 🔍 It may be due to the number of input tokens, or to suboptimal image processing ops, ...

> One question about the test: it guards for torch>=2.6. Are the runners installing the latest version? It would be nice to get this test running on each PR, to avoid accidentally breaking gemma again.

Yes, they are running torch 2.6 (the runners' images install the latest version e.g. here).

On this PR's ci/circleci: tests_torch:
[screenshot of the passing test run, 2025-04-21]

@zucchini-nlp (Member) left a comment

Great, happy to hear CI actually runs tests!

@gante gante merged commit 85665a4 into huggingface:main Apr 22, 2025
20 checks passed
@gante gante deleted the compilation_tests_no_recompilation branch April 22, 2025 10:12
@gante gante mentioned this pull request Apr 23, 2025
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
…owed (huggingface#37629)

* tmp commit

* stricter compilation test

* trigger tests

* rm todo