CI: avoid human error, automatically infer generative models#33212
CI: avoid human error, automatically infer generative models#33212ydshieh merged 40 commits intohuggingface:mainfrom
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
|
Super cool! If VLMs start failing on generation tests, that's okay. I have a local draft for that, which needs to be reviewed after a few other PRs are merged |
489a2d6 to
1e8e794
Compare
|
Many side-fixes later, ready for review 🙌 |
| class TFCoreModelTesterMixin: | ||
| model_tester = None | ||
| all_model_classes = () | ||
| all_generative_model_classes = () |
There was a problem hiding this comment.
why we don't need def all_generative_model_classes for TF ..?
There was a problem hiding this comment.
incorrectly moved
(I tried to apply the same pattern on TF, but it has too many broken models. Then I reverted the changes. This one was incorrectly reverted, I'm going to chase down the TF diff to 0)
ydshieh
left a comment
There was a problem hiding this comment.
left 3 nit question but yeah thank you!
I see a few files have style changes: I always think they should be fixed in a separate PR, but you can keep them here if you wish.
|
@ydshieh all comments addressed 🤗 |
ydshieh
left a comment
There was a problem hiding this comment.
Thank you again.
Just to make sure the change in tests/test_modeling_tf_common.py is what you expect:
- removing
all_generative_model_classes = ()OK - not adding
def all_generative_model_classes(self):: if this is intended, OK, for you to check
ah , I see you put it back :-), all good then |
zucchini-nlp
left a comment
There was a problem hiding this comment.
Thanks, this hopefully helps us prevent cases when generation tests are not added.
I am not sure I got why audio models generation is turned off, if it was enabled before and CI was green? Same question for Blip2
| # Doesn't run generation tests. TODO: fix generation tests for Blip2ForConditionalGeneration | ||
| all_generative_model_classes = () |
There was a problem hiding this comment.
I think this was working on main for Blip2ForConditionalGeneration, are there many failures?
There was a problem hiding this comment.
If we remove this line (= if we don't skip the tests), py.test tests/models/blip_2/test_modeling_blip_2.py rebased on main results in 21 failures :P
I've also double-checked the other models with skips on Monday. Most of them have unique model properties that do not work well with generate
There was a problem hiding this comment.
BTW, note that there are two testers (Blip2ForConditionalGenerationDecoderOnlyTest and Blip2ModelTest), the skips are only on the latter. I don't know why the latter needs to skip, but that's beyond the scope of this PR :P
They were also being skipped before.
There was a problem hiding this comment.
on main, Blip2ModelTest doesn't have all_generative_model_classes. which means it doesn't run generate tests, and this PR doesn't skip any extra tests for this test class
(I don't know why however)
There was a problem hiding this comment.
I see, thanks! I believe something similar for audio model since for me the skip comments weren't very clear
There was a problem hiding this comment.
yeah, that's why I ask frequently more (detailed) comments in many PRs I reviewed 😆
|
@gante you can ping me to merge if the CI persists to be red but irrelevant to your PR :-) |
|
@ydshieh yes, please merge 🙏 (was trying to be autonomous :D) |
What does this PR do?
This PR:
generatewith an automatic definition -- ifmodel_class.can_generate(), then it runs tests fromGenerationTesterMixin. No more human mistakes, which happened frequently in the past months 🐛 🔫all_generative_model_classesand explain why certain classes are being skipped. (bad apples detected withpy.test tests/models/ -k test_greedy_generate -vv) 💔In a follow-up PR:
We need to manually define the model's main input name (e.g.,Done ✅pixel_values) in the model tester. Make it usemodel.main_input_nameinstead, to avoid human errorgeneratetests will only run ifGenerationTesterMixinis inherited. We can easily forget to add the mixin, resulting in a false positive. Add an automated check: if any of the model classes can generate, thenGenerationTesterMixinmust be inherited in the tester