Saving and loading LlavaOnevisionProcessor results in unexpected behavior #33484

@yonigozlan

System Info

  • transformers 4.45.0.dev0 (up to date with the GitHub repo)

Who can help?

@zucchini-nlp @amyeroberts

Reproduction

If we save an instance of LlavaOnevisionProcessor such as:

import tempfile

from transformers import (
    LlavaOnevisionImageProcessor,
    LlavaOnevisionProcessor,
    LlavaOnevisionVideoProcessor,
    Qwen2TokenizerFast,
)

tmpdirname = tempfile.mkdtemp()

image_processor = LlavaOnevisionImageProcessor()
video_processor = LlavaOnevisionVideoProcessor()
tokenizer = Qwen2TokenizerFast.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

processor = LlavaOnevisionProcessor(
    video_processor=video_processor, image_processor=image_processor, tokenizer=tokenizer
)
processor.save_pretrained(tmpdirname)

and then try to load it with AutoImageProcessor:

>>> from transformers import AutoImageProcessor
>>> image_processor = AutoImageProcessor.from_pretrained(tmpdirname)
>>> print(image_processor.__class__.__name__)
LlavaOnevisionVideoProcessor

Variant of the problem:

tmpdirname = tempfile.mkdtemp()
image_processor = LlavaOnevisionImageProcessor(rescale_factor=10)
video_processor = LlavaOnevisionVideoProcessor(rescale_factor=5)
tokenizer = Qwen2TokenizerFast.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

processor = LlavaOnevisionProcessor(
    video_processor=video_processor, image_processor=image_processor, tokenizer=tokenizer
)
processor.save_pretrained(tmpdirname)

>>> image_processor = LlavaOnevisionImageProcessor.from_pretrained(tmpdirname)
>>> print(image_processor.rescale_factor)
5

Expected behavior

There should be no ambiguity about whether we are loading the image_processor or the video_processor config/class.

It looks like the image_processor and video_processor configs are both saved to the same preprocessor_config.json file, since both classes inherit from BaseImageProcessor, so whichever is saved last overwrites the other.
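
To illustrate the suspected mechanism without depending on transformers, here is a minimal sketch. The classes `ToyProcessor`, `ToyImageProcessor`, and `ToyVideoProcessor` are hypothetical stand-ins, not the library's implementation, and the save order (image first, then video) is an assumption chosen to match the output observed above.

```python
import json
import os
import tempfile

# Shared file name, as described in the issue.
CONFIG_NAME = "preprocessor_config.json"


class ToyProcessor:
    """Hypothetical stand-in for a processor base class (not the transformers code)."""

    def __init__(self, rescale_factor):
        self.rescale_factor = rescale_factor

    def save_pretrained(self, directory):
        # Every subclass writes to the same file name inside the directory.
        payload = {
            "processor_type": type(self).__name__,
            "rescale_factor": self.rescale_factor,
        }
        with open(os.path.join(directory, CONFIG_NAME), "w") as f:
            json.dump(payload, f)

    @classmethod
    def from_pretrained(cls, directory):
        # Loading reads whatever was written last, regardless of which class wrote it.
        with open(os.path.join(directory, CONFIG_NAME)) as f:
            payload = json.load(f)
        return cls(rescale_factor=payload["rescale_factor"])


class ToyImageProcessor(ToyProcessor):
    pass


class ToyVideoProcessor(ToyProcessor):
    pass


tmpdirname = tempfile.mkdtemp()
ToyImageProcessor(rescale_factor=10).save_pretrained(tmpdirname)
# Assumed order: the video config is written second and replaces the image config.
ToyVideoProcessor(rescale_factor=5).save_pretrained(tmpdirname)

loaded = ToyImageProcessor.from_pretrained(tmpdirname)
with open(os.path.join(tmpdirname, CONFIG_NAME)) as f:
    saved_type = json.load(f)["processor_type"]

print(loaded.rescale_factor, saved_type)  # prints: 5 ToyVideoProcessor
```

In this toy version the image processor comes back with the video processor's `rescale_factor`, and the only class name left on disk is the video one, which mirrors both symptoms reported above.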
