TypeError: color must be int or single-element tuple when processing a grayscale image with LLaVA #96

@isaac-vidas

Processing a grayscale image whose width and height differ raises the following error:

Exception in TokenizerManager:
Traceback (most recent call last):
  File "/home/gcpuser/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 61, in get_pixel_values
    image = expand2square(
  File "/home/gcpuser/sglang/python/sglang/srt/mm_utils.py", line 167, in expand2square
    result = Image.new(pil_img.mode, (width, width), background_color)
  File "/opt/conda/envs/sglang_flashinfer/lib/python3.9/site-packages/PIL/Image.py", line 2941, in new
    return im._new(core.fill(mode, size, color))
TypeError: color must be int or single-element tuple

The error originates in expand2square (mm_utils.py, line 167 in the traceback above). It fails because PIL won't accept a background color consisting of 3 values for a new image in L mode.
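The failure can likely be reproduced outside sglang with Pillow alone. The snippet below is a minimal sketch, assuming only that a 3-value background color is passed for an L-mode image, as the traceback suggests; the sizes and color values are made up for illustration.

```python
from PIL import Image

# A 3-value color does not match the single band of an "L" (grayscale) image.
error = None
try:
    Image.new("L", (4, 4), (127, 127, 127))
except TypeError as exc:
    error = exc
    print(f"TypeError: {exc}")

# A plain int is accepted for the same mode.
ok = Image.new("L", (4, 4), 127)
print(ok.mode, ok.size)
```
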

So far I have only encountered this with grayscale images. Perhaps one way to solve it would be to convert these images to RGB before resizing them?
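A rough sketch of that idea follows. The helper name expand2square_rgb is hypothetical and is not the sglang implementation; it assumes the padding step creates a square canvas with Image.new and pastes the original image onto it, as the traceback indicates.

```python
from PIL import Image


def expand2square_rgb(pil_img, background_color):
    """Hypothetical helper: force RGB mode, then pad the image to a square."""
    # Convert grayscale ("L") and other non-RGB modes so that a 3-value
    # background color matches the number of bands.
    if pil_img.mode != "RGB":
        pil_img = pil_img.convert("RGB")
    width, height = pil_img.size
    if width == height:
        return pil_img
    side = max(width, height)
    result = Image.new("RGB", (side, side), background_color)
    # Center the original image on the square canvas.
    result.paste(pil_img, ((side - width) // 2, (side - height) // 2))
    return result


gray = Image.new("L", (4, 2), 200)  # non-square grayscale test image
squared = expand2square_rgb(gray, (127, 127, 127))
print(squared.mode, squared.size)
```

Converting up front keeps the padding logic unchanged for RGB inputs. An alternative would be to keep the image in L mode and pass a single-value background, but that would hand a one-channel image to whatever comes after the padding step.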
