fix TSDAE weight tying on transformers v5#3781
Merged
Merged
Conversation
Contributor
Author
Member
|
Hello! This looks great, thank you for the detailed before and after. I wrote a tiny cleanup and added some simple tests to avoid further regressions (hopefully), and then I think this is ready!
|
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes TSDAE (DenoisingAutoEncoderLoss) weight tying for transformers v5 by replacing the removed private HF helper with an in-repo tying implementation, and adds regression tests to validate weight sharing and gradient flow.
Changes:
- Replace usage of
PreTrainedModel._tie_encoder_decoder_weights(removed in transformers v5) with a local_tie_encoder_decoder_weightsimplementation. - Remove the transformers v5 RuntimeError guard so
tie_encoder_decoder=Trueworks again. - Add a dedicated test suite covering non-raising behavior, true parameter aliasing, and backward/grad flow for tied weights.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
sentence_transformers/sentence_transformer/losses/denoising_auto_encoder.py |
Implements new cross-model encoder/decoder tying logic and removes the v5 incompatibility error. |
tests/sentence_transformer/losses/test_denoising_auto_encoder.py |
Adds regression tests ensuring tied weights actually share storage and training signals propagate. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



DenoisingAutoEncoderLossraised aRuntimeErroron transformers >= 5.0.0 whenevertie_encoder_decoder=True(the default). Since tying is the recommended TSDAE setting, this effectively broke TSDAE on v5 unless you pinned an older transformers.RuntimeErrorThe loss tied the encoder and decoder through the private
PreTrainedModel._tie_encoder_decoder_weights. transformers v5 removed that helper and moved tying to a per model_tied_weights_keysmapping that only applies within a singlePreTrainedModel. TSDAE keeps the encoder and decoder as two separate models, so the native v5 machinery cannot bridge them, and #3619 added theRuntimeErroras a stopgap.Fixes #3737