[feat] Per-call processing_kwargs override in Transformer.preprocess#3753
Merged
tomaarsen merged 1 commit intoMay 7, 2026
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a per-call processing_kwargs override path for Transformer.preprocess, and threads the merged kwargs through the processor call chain so chat_template overrides reliably reach apply_chat_template. It also allowlists processing_kwargs at the main inference entry points and adds regression tests to ensure the kwarg isn’t silently dropped.
Changes:
- Add
processing_kwargsparameter toTransformer.preprocesswith shallow per-top-level-key merge semantics overself.processing_kwargs. - Plumb merged
chat_templatekwargs through_call_processor→_process_chat_messagesso per-call overrides work for message inputs. - Allowlist and document
processing_kwargsonSentenceTransformer.encode/SparseEncoder.encode(and document onCrossEncoder.predict), with test coverage across all three model types.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/sparse_encoder/test_model.py | Adds integration test verifying SparseEncoder.encode(..., processing_kwargs=...) affects preprocessing (truncation) and output. |
| tests/sentence_transformer/test_model.py | Adds integration test verifying SentenceTransformer.encode(..., processing_kwargs=...) affects tokenization and embeddings. |
| tests/cross_encoder/test_model.py | Adds integration test verifying CrossEncoder.predict(..., processing_kwargs=...) affects preprocessing and prediction scores. |
| tests/base/modules/test_transformer.py | Adds unit tests for per-call merge behavior, warning on unknown keys, non-mutation, and per-call chat_template override routing. |
| sentence_transformers/sparse_encoder/model.py | Allowlists processing_kwargs in SparseEncoder.encode kwarg validation and documents it. |
| sentence_transformers/sentence_transformer/model.py | Allowlists processing_kwargs in SentenceTransformer.encode kwarg validation and documents it. |
| sentence_transformers/cross_encoder/model.py | Documents processing_kwargs as a forwarded kwarg for predict. |
| sentence_transformers/base/modules/transformer.py | Implements per-call processing_kwargs merge and threads chat_template_kwargs through processor dispatch. |
| sentence_transformers/base/model.py | Documents that processing_kwargs is forwarded to the input module preprocess (notably Transformer). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tomaarsen
commented
May 6, 2026
Comment on lines
+1218
to
+1223
| text = "this sentence is much longer than four tokens for sure" | ||
| truncated = model.encode( | ||
| [text], | ||
| convert_to_tensor=True, | ||
| processing_kwargs={"text": {"max_length": 4, "truncation": True}}, | ||
| ) |
Member
Author
There was a problem hiding this comment.
This is how it can be used for actual models from this PR onwards.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hello!
Pull Request overview
processing_kwargsoverride toTransformer.preprocessencode/predictand document it on the entry pointsDetails
A dependent project wanted to pass different
processing_kwargson each call rather than only at construction time. I've added aprocessing_kwargsparameter toTransformer.preprocessthat merges on top of the instance-levelself.processing_kwargswith shallow per-modality semantics, so individual settings (e.g. onlymax_length) can be overridden without replacing the whole modality dict.The merged dict is also threaded through
_call_processorto_process_chat_messages, since thechat_templateslot was read directly fromself.processing_kwargsrather than via the existingmodality_kwargs/common_kwargsplumbing. Without this, per-call{"chat_template": {...}}overrides would silently no-op.SentenceTransformer.encodeandSparseEncoder.encodenow allowlistprocessing_kwargsalongsidetask;CrossEncoder.predictandBaseModel.preprocessalready forward**kwargsthrough.cc @NohTow