[Data.llm] Fix multimodal image extraction when no system prompt is present#56435
Conversation
…resent PrepareImageStage was failing to extract images when messages had uniform content types (no system prompt), because Ray Data uses PyArrow serialization instead of pickle, and isinstance(pyarrow_obj, list) returns False Added .tolist() conversion like ChatTemplateStage to handle both PyArrow and Python objects consistently. Fixes ray-project#56125 Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
kouroshHakha
left a comment
There was a problem hiding this comment.
Awesome. Can you add a unittest as well that covers these failure cases ? uniform and non uniform types?
There was a problem hiding this comment.
Awesome. Can you add a unittest as well that covers these failure cases ? uniform and non uniform types?
yes will re-add tests to ensure metadata doesn't get lost etc and the request is parsed/passed through in both cases (prompt/no-prompt) - without actually evaluating the model output
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
nrghosh
left a comment
There was a problem hiding this comment.
fixed + added tests
|
Yay amazing, thank you @nrghosh for investigating and fixing this! 🙏 |
nrghosh
left a comment
There was a problem hiding this comment.
fixed + added tests cc @kouroshHakha
| pass | ||
|
|
||
|
|
||
| # Test that image extraction works consistently with both uniform content types |
There was a problem hiding this comment.
can you consolidate these tests into one test unit with parametrization so that maintenance is simpler?
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
…resent (ray-project#56435) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: zac <zac@anyscale.com>
…resent (ray-project#56435) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Marco Stephan <marco@magic.dev>
…resent (ray-project#56435) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
…resent (ray-project#56435) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
…resent (ray-project#56435) Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Added
.tolist()conversion just like inChatTemplateStageto handle both PyArrow and Python objects consistently.Problem:
PrepareImageStage.extract_image_info()method has a hardcoded isinstance(message["content"], list) check that only works with Python lists, not PyArrow objects, causing it to silently skip all image extraction in the uniform caseFix:
.tolist()conversion to handle PyArrow objects the same wayChatTemplateStagedoes -> consistent image extraction and handling regardless of serialization method (prompt vs no prompt).Why are these changes needed?
PrepareImageStage was failing to extract images when messages had uniform content types (no system prompt), because Ray Data uses PyArrow serialization instead of pickle, and
isinstance(pyarrow_obj, list)->FalseRelated issue number
Fixes #56125
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.Reproduction Script (based on user repro)
Differences
VLLM_USE_V1="1"