BLIP: enable generation tests#34174
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
I'd like very much to avoid this change -- extra logic for all tests to handle a niche corner case. Let's brainstorm alternatives! |
|
@gante I tried to force OMG, I found an option while writing this reply, whisper and the other audio model are encoder-decoder so we can make it work by getting main input in decoder-only models. Just before the check happens, in the same indent block :) |
ArthurZucker
left a comment
There was a problem hiding this comment.
AH actually we might need / want to force return_dict to TRUE, to avoid all the if elses
if it works, sounds good! (make sure to leave a comment) |
|
@gante requesting re-review, since the |
ArthurZucker
left a comment
There was a problem hiding this comment.
Thanks for cleaning up
| logits = outputs.logits if return_dict else outputs[1] | ||
| loss = outputs.loss | ||
| logits = outputs.logits | ||
| outputs = outputs.to_tuple() if not return_dict else outputs |
There was a problem hiding this comment.
outputs will have loss and logits twice there
There was a problem hiding this comment.
(this was probably already a bug)
There was a problem hiding this comment.
yep, in general i don't like that we return it as this and would better return unwrapped lm outputs. But we can't prob just delete it for BC reasons
| models_without_standard_cache = ( | ||
| "ctrl", | ||
| "fsmt", | ||
| "gptbigcode", | ||
| "mega", | ||
| "reformer", | ||
| "jamba", | ||
| "mamba", | ||
| "xlnet", | ||
| "zamba", | ||
| ) | ||
| has_standard_cache = not any( | ||
| model_name in config.__class__.__name__.lower() for model_name in models_without_standard_cache | ||
| ) | ||
| if has_standard_cache: |
There was a problem hiding this comment.
when we overwrite we don't need that no?
There was a problem hiding this comment.
yeah, will remove unnecessary part
* blip2 tests * instructblips * copies * fix slow tests * fix * uncomment this * clean up after rebase * should be model main input * fix overwritten tests * oops len should be multiple of frame number * style * fix some tests
What does this PR do?
Enables generation tests for BLIP models, except BLIP-1 (turned out to be a bit harder). I changed the generation tests to use
modelTest.input_nameas BLIP is the only model that uses pixel values as main input and thus checking generated text length's will always fail.I tried to get rid of custom generate for these models, but that opened a Pandora box so I think better not waste time on an old model and maintain it for a while, until the model gets deprecated. But still I did some changes so we don't need to add extra
bosat the beginning and now the decoder-based BLIP models return full text at output. Encoder-decoder based models return only generated text, which is consistent with what an LLM should do