… not sure if RoPE is right.
Co-authored-by: Joshua Lochner <admin@xenova.com>
…arbitrary number of images in prompts
…e starting with BOS
Raushan address PR comments
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the
ArthurZucker left a comment
As reviewed before LGTM!
    ("fnet", "FNetForPreTraining"),
    ("fsmt", "FSMTForConditionalGeneration"),
    ("funnel", "FunnelForPreTraining"),
    ("gemma3", "Gemma3ForConditionalGeneration"),
I think this should be included in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES as well? I can't load the multimodal variant using AutoModelForVision2Seq, which works for most multimodal models.
VLMs should be loaded with AutoModelForImageTextToText, a new mapping we added for multimodal models. The old AutoModelForVision2Seq is meant only for models like BLIP, which take bare images without instructions and caption them.
Since we didn't have a dedicated mapping for VLMs earlier, everything got dumped into Vision2Seq; sorry if that was confusing. All new releases will come under ImageTextToText, and all older models support this mapping too.
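For reference, here is a minimal sketch of loading a VLM through the ImageTextToText mapping. The helper name `load_vlm` is hypothetical, and the checkpoint id in the usage comment is an assumption for illustration only; it also assumes a transformers release recent enough to ship `AutoModelForImageTextToText`.

```python
def load_vlm(checkpoint: str):
    """Load a vision-language model and its processor via the ImageTextToText mapping.

    `load_vlm` is a hypothetical helper, not part of transformers.
    """
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    # The processor bundles the tokenizer and image preprocessing for the checkpoint.
    processor = AutoProcessor.from_pretrained(checkpoint)
    # The auto class resolves the concrete model class from the checkpoint's config.
    model = AutoModelForImageTextToText.from_pretrained(checkpoint)
    return model, processor


# Usage (downloads weights; the checkpoint id is an assumed example):
# model, processor = load_vlm("google/gemma-3-4b-it")
```

Swapping in `AutoModelForVision2Seq` here would only be appropriate for captioning-style models such as BLIP, per the explanation above.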
I see, thanks for the explanation!
What does this PR do?
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.