[fix] Clarify unsupported-modality error messages #3792

Merged
tomaarsen merged 2 commits into huggingface:main from tomaarsen:fix/better-modality-errors on Jun 12, 2026

@tomaarsen

Supersedes #3789

Hello!

Pull Request overview

  • Clarify the errors raised for unsupported, mixed, and combined-modality inputs
  • Share a single raise_unsupported_modality_error helper across BaseModel and Transformer

Details

When inputs don't match what a model supports, the errors were confusing. A batch that mixes modalities collapses to the internal "message" modality (the format we use to combine modalities into one input), so passing e.g. a list of images and texts to CLIP raised `Modality 'message' is not supported`, even though the user never passed a message. That's the confusion reported in #3722. A single combined `{"text": ..., "image": ...}` input was worse: on a model that supports both modalities individually, it raised `Modality 'image+text' is not supported. Supported modalities: text, image`, which contradicts itself.
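To make the two failure modes concrete, here is a simplified, self-contained sketch. It is not the library's code: `infer_batch_modality` and `legacy_error` are hypothetical names, and the sketch only assumes that a batch with more than one distinct per-sample modality is labelled "message" and that a combined input is labelled by joining its part names with "+".

```python
def infer_batch_modality(sample_modalities):
    """Collapse per-sample modality labels into a single batch-level label.

    Hypothetical illustration: a batch with more than one distinct label
    is reported as the internal "message" modality.
    """
    distinct = set(sample_modalities)
    if len(distinct) == 1:
        return next(iter(distinct))
    return "message"


def legacy_error(batch_modality, supported):
    """Format an error from the batch-level label only (the confusing behaviour)."""
    return (
        f"Modality '{batch_modality}' is not supported. "
        f"Supported modalities: {', '.join(supported)}"
    )


# A model that supports text and image individually, but not messages.
supported = ["text", "image"]

# Mixed batch: the user passed images and texts, yet the error mentions 'message'.
mixed = infer_batch_modality(["image", "text"])
print(legacy_error(mixed, supported))

# Single combined input: the error lists both parts as supported and
# still reports the combination as unsupported.
combined = infer_batch_modality(["image+text"])
print(legacy_error(combined, supported))
```

In both cases the message is built from the collapsed label alone, so it cannot say anything about what the user actually passed.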

I've reworked the validation to re-inspect the per-sample modalities on the error path and emit guidance tailored to the actual situation: explicit chat-style message inputs, a mixed batch containing a genuinely unsupported modality (now named explicitly), a mixed batch the model could encode one modality at a time (so the message says to do exactly that), and a single combined input whose parts are each supported but can't be fused without message support. All of this now lives in one raise_unsupported_modality_error helper in modality.py, shared by BaseModel.preprocess and Transformer.preprocess so both raise the same accurate message.
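The four situations described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `raise_unsupported_modality_error` helper: the function name `build_modality_error`, its signature, the tuple-per-sample representation, and all message wording are hypothetical.

```python
def build_modality_error(sample_modalities, supported, explicit_messages=False):
    """Return an error message tailored to what the user actually passed.

    Hypothetical sketch. `sample_modalities` holds one tuple of modality
    names per sample, e.g. ("text",) or ("image", "text") for a combined
    input. `supported` is the set of modalities the model accepts.
    """
    supported = set(supported)
    supported_str = ", ".join(sorted(supported))

    # Case 1: the user explicitly passed chat-style messages.
    if explicit_messages:
        return (
            "Chat-style message inputs are not supported by this model. "
            f"Supported modalities: {supported_str}."
        )

    # Re-inspect the per-sample modalities instead of the collapsed label.
    used = {part for sample in sample_modalities for part in sample}

    # Case 2: at least one modality is genuinely unsupported, so name it.
    unsupported = sorted(used - supported)
    if unsupported:
        return (
            f"Unsupported modality: {', '.join(unsupported)}. "
            f"Supported modalities: {supported_str}."
        )

    # Case 3: a combined input whose parts are each supported on their own.
    if any(len(sample) > 1 for sample in sample_modalities):
        return (
            "Each part of the combined input is supported, but this model "
            "cannot fuse them into a single input without message support."
        )

    # Case 4: a mixed batch the model could encode one modality at a time.
    return (
        "This batch mixes modalities the model supports individually. "
        f"Encode one modality at a time: {', '.join(sorted(used))}."
    )


print(build_modality_error([("text",), ("audio",)], {"text", "image"}))
print(build_modality_error([("text",), ("image",)], {"text", "image"}))
print(build_modality_error([("image", "text")], {"text", "image"}))
```

Checking for a genuinely unsupported modality before the combined and mixed cases means the message never claims a modality is unsupported when every part is in fact supported.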

This supersedes #3789, which clarified only the mixed-modality batch case. This version should cover every unsupported-modality scenario and unifies the two code paths. I also expanded the tests into a parametrized matrix that exercises the real inference path for each case, with guards against misleading wording (e.g. never claiming a modality is unsupported when every part is actually supported).

  • Tom Aarsen

Copilot AI left a comment

Pull request overview

This PR improves modality-validation error reporting by centralizing the unsupported-modality messaging logic in a shared helper and expanding tests to cover mixed, combined, and chat-style message inputs. It aims to ensure users see errors that reflect what they actually passed (rather than internal “message” inference artifacts).

Changes:

  • Added raise_unsupported_modality_error() in sentence_transformers/base/modality.py and wired it into both BaseModel.preprocess and Transformer.preprocess.
  • Updated and expanded tests to validate clearer, scenario-specific error messages for unsupported, mixed, combined, and explicit chat-style inputs.
  • Adjusted static embedding modality tests to expect the new mixed-batch wording.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sentence_transformers/base/modality.py | Introduces shared helper to generate tailored unsupported-modality errors. |
| sentence_transformers/base/model.py | Switches BaseModel.preprocess to use the shared helper for modality errors. |
| sentence_transformers/base/modules/transformer.py | Switches Transformer.preprocess to use the shared helper for modality errors. |
| tests/base/test_model.py | Replaces prior targeted modality tests with a parametrized matrix covering many error scenarios. |
| tests/base/modules/test_transformer.py | Adds coverage ensuring Transformer uses the shared helper and emits expected wording. |
| tests/sentence_transformer/modules/test_static_embedding.py | Updates expected error message for mixed-modality batch handling. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@tomaarsen tomaarsen merged commit 2f64929 into huggingface:main Jun 12, 2026
17 of 18 checks passed