Skip to content

fix TSDAE weight tying on transformers v5#3781

Merged
tomaarsen merged 3 commits into
huggingface:mainfrom
omkar-334:tsdae-v5
Jun 2, 2026
Merged

fix TSDAE weight tying on transformers v5#3781
tomaarsen merged 3 commits into
huggingface:mainfrom
omkar-334:tsdae-v5

Conversation

@omkar-334

Copy link
Copy Markdown
Contributor

DenoisingAutoEncoderLoss raised a RuntimeError on transformers >= 5.0.0 whenever tie_encoder_decoder=True (the default). Since tying is the recommended TSDAE setting, this effectively broke TSDAE on v5 unless you pinned an older transformers.

loss code transformers 4.57.6 (<5) transformers 5.9.0 (>=5)
before this PR ✅ works RuntimeError
with this PR ✅ works ✅ works

The loss tied the encoder and decoder through the private PreTrainedModel._tie_encoder_decoder_weights. transformers v5 removed that helper and moved tying to a per model _tied_weights_keys mapping that only applies within a single PreTrainedModel. TSDAE keeps the encoder and decoder as two separate models, so the native v5 machinery cannot bridge them, and #3619 added the RuntimeError as a stopgap.

Fixes #3737

@omkar-334

omkar-334 commented May 23, 2026

Copy link
Copy Markdown
Contributor Author

transformers 4.57.6 - Before and after

Screenshot 2026-05-24 at 5 21 46 AM

transformers 5.9.0 - Before

Screenshot 2026-05-24 at 5 22 35 AM

transformers 5.9.0 - After

Screenshot 2026-05-24 at 5 22 58 AM

@tomaarsen

Copy link
Copy Markdown
Member

Hello!

This looks great, thank you for the detailed before and after. I wrote a tiny cleanup and added some simple tests to avoid further regressions (hopefully), and then I think this is ready!

  • Tom Aarsen

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes TSDAE (DenoisingAutoEncoderLoss) weight tying for transformers v5 by replacing the removed private HF helper with an in-repo tying implementation, and adds regression tests to validate weight sharing and gradient flow.

Changes:

  • Replace usage of PreTrainedModel._tie_encoder_decoder_weights (removed in transformers v5) with a local _tie_encoder_decoder_weights implementation.
  • Remove the transformers v5 RuntimeError guard so tie_encoder_decoder=True works again.
  • Add a dedicated test suite covering non-raising behavior, true parameter aliasing, and backward/grad flow for tied weights.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
sentence_transformers/sentence_transformer/losses/denoising_auto_encoder.py Implements new cross-model encoder/decoder tying logic and removes the v5 incompatibility error.
tests/sentence_transformer/losses/test_denoising_auto_encoder.py Adds regression tests ensuring tied weights actually share storage and training signals propagate.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/sentence_transformer/losses/test_denoising_auto_encoder.py

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

@tomaarsen tomaarsen merged commit 3ed5567 into huggingface:main Jun 2, 2026
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DenoisingAutoEncoderLoss and Transformers v5

3 participants