[`fix`] Clarify unsupported-modality error messages by tomaarsen · Pull Request #3792 · huggingface/sentence-transformers

tomaarsen · 2026-06-02T14:03:32Z

Supersedes #3789

Hello!

Pull Request overview

- Clarify the errors raised for unsupported, mixed, and combined-modality inputs
- Share a single `raise_unsupported_modality_error` helper across `BaseModel` and `Transformer`

Details

When inputs don't match what a model supports, the errors were confusing. A batch that mixes modalities collapses to the internal "message" modality (the format we use to combine modalities into one input), so passing e.g. a list of images and texts to CLIP raised `Modality 'message' is not supported`, even though the user never passed a message. That's the confusion reported in #3722. A single combined `{"text": ..., "image": ...}` input was worse: on a model that supports both modalities individually, it raised `Modality 'image+text' is not supported. Supported modalities: text, image`, which contradicts itself.
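To make the failure mode concrete, here is a toy reproduction of the old behavior. It is not the library's code: `infer_modality`, `infer_batch_modality`, and `old_validate` are hypothetical names, and the detection rules are simplified stand-ins (strings count as text, `bytes` stands in for an image object).

```python
def infer_modality(sample):
    """Toy per-sample modality detection (hypothetical, simplified)."""
    if isinstance(sample, str):
        return "text"
    if isinstance(sample, bytes):  # stand-in for a real image object
        return "image"
    if isinstance(sample, dict):
        # A combined input such as {"text": ..., "image": ...}
        return "+".join(sorted(sample))
    raise TypeError(f"Unrecognized input type: {type(sample).__name__}")


def infer_batch_modality(batch):
    """A batch containing more than one modality collapses to 'message'."""
    modalities = {infer_modality(sample) for sample in batch}
    return modalities.pop() if len(modalities) == 1 else "message"


def old_validate(batch, supported):
    """Old-style check: only the collapsed batch modality is reported."""
    modality = infer_batch_modality(batch)
    if modality not in supported:
        raise ValueError(
            f"Modality '{modality}' is not supported. "
            f"Supported modalities: {', '.join(supported)}"
        )
    return modality
```

Under these assumptions, `old_validate(["a photo of a cat", b"fake-image"], ["text", "image"])` reports the internal `message` modality, and a single `{"text": ..., "image": ...}` sample reports `image+text` right next to a supported list that contains both parts.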

I've reworked the validation to re-inspect the per-sample modalities on the error path and emit guidance tailored to the actual situation: explicit chat-style message inputs, a mixed batch containing a genuinely unsupported modality (now named explicitly), a mixed batch the model could encode one modality at a time (so the message says to do exactly that), and a single combined input whose parts are each supported but can't be fused without message support. All of this now lives in one `raise_unsupported_modality_error` helper in `modality.py`, shared by `BaseModel.preprocess` and `Transformer.preprocess` so both raise the same accurate message.
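The four situations can be sketched as a single branching helper. This is only an illustration of the idea, not the helper added in `modality.py`: the function name, its signature, the `explicit_messages` flag, and all message wording below are assumptions made for the example.

```python
def raise_unsupported_modality_error_sketch(
    sample_modalities, supported, explicit_messages=False
):
    """Raise a ValueError tailored to the situation (illustrative sketch).

    sample_modalities holds one tuple of modality names per sample, e.g.
    [("text",), ("image",)] for a mixed batch, or [("image", "text")] for
    a single combined input. Call this only once validation has failed.
    """
    supported_str = ", ".join(supported)
    used = sorted({name for parts in sample_modalities for name in parts})
    unsupported = [name for name in used if name not in supported]
    used_str = ", ".join(used)

    if explicit_messages:
        # The user really did pass chat-style messages.
        raise ValueError(
            "Chat-style message inputs are not supported by this model. "
            f"Supported modalities: {supported_str}"
        )
    if unsupported:
        # Name the modality that is genuinely unsupported.
        raise ValueError(
            f"Unsupported modality in input: {', '.join(unsupported)}. "
            f"Supported modalities: {supported_str}"
        )
    if any(len(parts) > 1 for parts in sample_modalities):
        # Every part is supported, but they cannot be fused into one input.
        raise ValueError(
            f"Each part of the combined input ({used_str}) is supported on "
            "its own, but this model cannot fuse them into one input. "
            "Encode each part separately."
        )
    # Mixed batch in which every modality is supported individually.
    raise ValueError(
        f"The batch mixes modalities ({used_str}). This model can encode "
        "each of them, but only one modality at a time: split the batch "
        "by modality."
    )
```

The branches run from most to least specific, so a batch that contains a genuinely unsupported modality is reported as such before the gentler "split the batch" advice is considered.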

This supersedes #3789, which clarified only the mixed-modality batch case. This version should cover every unsupported-modality scenario and unifies the two code paths. I also expanded the tests into a parametrized matrix that exercises the real inference path for each case, with guards against misleading wording (e.g. never claiming a modality is unsupported when every part is actually supported).
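One way to express such a wording guard is sketched below. It is a standalone illustration rather than the test code from this PR: `misleading_claims` and `CASES` are hypothetical, and a plain loop stands in for `pytest.mark.parametrize` so the snippet has no dependencies. The first two messages are the old errors quoted above; the third is made up for contrast.

```python
import re

# Matches the phrasing of the old errors, e.g. "Modality 'x' is not supported".
CLAIM = re.compile(r"Modality '([^']+)' is not supported")


def misleading_claims(message, supported, passed_messages=False):
    """Return the quoted modality names that the message wrongly flags."""
    wrong = []
    for name in CLAIM.findall(message):
        if name == "message":
            # Only misleading if the user never passed a message.
            if not passed_messages:
                wrong.append(name)
            continue
        # "image+text" is misleading when every part is supported.
        if all(part in supported for part in name.split("+")):
            wrong.append(name)
    return wrong


# (message, supported modalities, user passed messages?, expected wrong claims)
CASES = [
    ("Modality 'message' is not supported", ["text", "image"], False, ["message"]),
    (
        "Modality 'image+text' is not supported. Supported modalities: text, image",
        ["text", "image"],
        False,
        ["image+text"],
    ),
    (
        "Modality 'audio' is not supported. Supported modalities: text, image",
        ["text", "image"],
        False,
        [],
    ),
]

for message, supported, passed_messages, expected in CASES:
    assert misleading_claims(message, supported, passed_messages) == expected
```

A real test would assert that the list is empty for every message the model raises; here the old messages are deliberately included to show that the guard catches them.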

Tom Aarsen

Copilot

Pull request overview

This PR improves modality-validation error reporting by centralizing the unsupported-modality messaging logic in a shared helper and expanding tests to cover mixed, combined, and chat-style message inputs. It aims to ensure users see errors that reflect what they actually passed (rather than internal “message” inference artifacts).

Changes:

- Added `raise_unsupported_modality_error()` in `sentence_transformers/base/modality.py` and wired it into both `BaseModel.preprocess` and `Transformer.preprocess`.
- Updated and expanded tests to validate clearer, scenario-specific error messages for unsupported, mixed, combined, and explicit chat-style inputs.
- Adjusted static embedding modality tests to expect the new mixed-batch wording.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `sentence_transformers/base/modality.py` | Introduces shared helper to generate tailored unsupported-modality errors. |
| `sentence_transformers/base/model.py` | Switches `BaseModel.preprocess` to use the shared helper for modality errors. |
| `sentence_transformers/base/modules/transformer.py` | Switches `Transformer.preprocess` to use the shared helper for modality errors. |
| `tests/base/test_model.py` | Replaces prior targeted modality tests with a parametrized matrix covering many error scenarios. |
| `tests/base/modules/test_transformer.py` | Adds coverage ensuring `Transformer` uses the shared helper and emits expected wording. |
| `tests/sentence_transformer/modules/test_static_embedding.py` | Updates expected error message for mixed-modality batch handling. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Clarify unsupported-modality error messages

d0da490

tomaarsen requested a review from Copilot June 2, 2026 14:03

Copilot started reviewing on behalf of tomaarsen June 2, 2026 14:03

Copilot AI reviewed Jun 2, 2026

Comment thread sentence_transformers/base/modality.py

Comment thread sentence_transformers/base/modality.py

Also mention lists of dicts as message inputs

6441355

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

tomaarsen merged commit 2f64929 into huggingface:main Jun 12, 2026
17 of 18 checks passed

This was referenced Jun 12, 2026

[fix] Clarify mixed-modality batch errors #3789

Closed

Mixing modalities in encode() doesn't work for CLIP #3722

Open

tomaarsen merged 2 commits into huggingface:main from tomaarsen:fix/better-modality-errors

2 participants
