fix: forward use_cache kwarg to attention mixer in nemotron_h#45792

Merged
zucchini-nlp merged 2 commits into huggingface:main from CharlieKerfoot:fix/attention-mixer-invoked
May 5, 2026

Conversation

@CharlieKerfoot

Contributor

In src/transformers/models/nemotron_h/modular_nemotron_h.py:294 the attention mixer is called with user_cache=use_cache. The typo means use_cache is never forwarded and an unexpected user_cache kwarg gets passed through instead.

The fix is simply to rename the keyword argument from user_cache to use_cache so the flag actually reaches the attention mixer.
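
To illustrate why a typo like this can go unnoticed, here is a minimal sketch. It assumes the mixer's forward method accepts arbitrary keyword arguments, which the description above implies but does not show. `ToyAttentionMixer`, `call_mixer_buggy`, and `call_mixer_fixed` are hypothetical names made up for this example; none of this is the actual nemotron_h code.

```python
class ToyAttentionMixer:
    """Hypothetical stand-in for an attention mixer (not the real nemotron_h class)."""

    def forward(self, hidden_states, use_cache=False, **kwargs):
        # A misspelled keyword lands in **kwargs instead of raising a TypeError,
        # so use_cache silently keeps its default value.
        return {"use_cache": use_cache, "extra_kwargs": sorted(kwargs)}


def call_mixer_buggy(mixer, hidden_states, use_cache):
    # Typo: the flag is sent under the wrong name.
    return mixer.forward(hidden_states, user_cache=use_cache)


def call_mixer_fixed(mixer, hidden_states, use_cache):
    # Correct keyword: the flag reaches the mixer.
    return mixer.forward(hidden_states, use_cache=use_cache)


mixer = ToyAttentionMixer()
buggy = call_mixer_buggy(mixer, [0.0], use_cache=True)
fixed = call_mixer_fixed(mixer, [0.0], use_cache=True)

print(buggy)  # {'use_cache': False, 'extra_kwargs': ['user_cache']}
print(fixed)  # {'use_cache': True, 'extra_kwargs': []}
```

In this toy setup the buggy call raises no error, which is likely why such a typo can survive until someone reads the call site closely.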

@github-actions

github-actions Bot commented May 5, 2026

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: nemotron_h

@zucchini-nlp zucchini-nlp left a comment

Member

Nice catch! TBH I am not sure if we are keeping the use_cache arg on purpose, since it is not used anymore in the latest releases. It is mostly just passed as a kwarg.

Could you run make fix-repo to fix CI?

@CharlieKerfoot CharlieKerfoot force-pushed the fix/attention-mixer-invoked branch from 77148a6 to a03e2ed Compare May 5, 2026 17:40
@zucchini-nlp zucchini-nlp enabled auto-merge May 5, 2026 18:17
@zucchini-nlp zucchini-nlp added this pull request to the merge queue May 5, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Merged via the queue into huggingface:main with commit 41c3a5a May 5, 2026
22 checks passed
Exile333 pushed a commit to Exile333/transformers that referenced this pull request May 6, 2026
…gface#45792)

* fix: forward use_cache kwarg to attention mixer in nemotron_h

* Ran make fix-repo

3 participants