Skip to content

Fix top_k default value to 0 for disabling top-k filtering#4695

Merged
albertvillanova merged 8 commits into
huggingface:mainfrom
albertvillanova:fix-top-k-disabling-value
Dec 16, 2025
Merged

Fix top_k default value to 0 for disabling top-k filtering#4695
albertvillanova merged 8 commits into
huggingface:mainfrom
albertvillanova:fix-top-k-disabling-value

Conversation

@albertvillanova

@albertvillanova albertvillanova commented Dec 15, 2025

Copy link
Copy Markdown
Member

Fix top_k default value to 0 for disabling top-k, instead of ambiguous (and non-documented) None.

See related comment:

For context, I was testing this fix in transformers:

and I discovered this edge case for tests/test_grpo_trainer.py::TestGRPOTrainer::test_training_vlm[trl-internal-testing/tiny-Qwen2VLForConditionalGeneration]:

  • model has top_k=1 in its generation_config.json
  • we set top_k=None by default, so k-filtering is disabled
  • as the latter is None, it gets silently overwritten to 1

Therefore, we need this PR so the transformers fix works as expected once merged.

Not sure about the expected default behavior:

  • since now, we were setting default value to None, saying this disables top-k filtering
  • however, transformers default value is 50

Should I set 50 instead of 0 as our default?

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec

Copy link
Copy Markdown
Member

Should I set 50 instead of 0 as our default?

We should use 0, see #3494

--

Can you also update the docstring? like here

https://github.com/albertvillanova/trl/blob/21cf0a9a9680b9c41128828eb04a827e0565b6c0/trl/trainer/grpo_config.py#L84-L86

@albertvillanova

Copy link
Copy Markdown
Member Author

I am fixing vLLM top_kl...

@qgallouedec

Copy link
Copy Markdown
Member

After checking vLLM, 0 is now a valid value for top_k. The most recent version that disallowed 0 was 0.8.5, see

https://github.com/vllm-project/vllm/blob/3015d5634e74d59704e2b39bab0dbe2e6f86a38a/vllm/sampling_params.py#L411-L413

"temperature": self.temperature,
"top_p": self.top_p,
"top_k": -1 if self.top_k is None else self.top_k,
"top_k": -1 if not self.top_k else self.top_k,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the vLLM support to consider all tokens by setting top_k=0 (besides -1) was introduced in vllm-0.9.0

and we require vllm>=0.10.2.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, what if the users still set top_k=None? Maybe we should start a deprecation cycle.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could indeed deprecate None for 1 version before removal. I'd do it only for GRPO and RLOO though

@qgallouedec qgallouedec left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just a few nits

`1.0` to consider all tokens.
top_k (`int`, *optional*):
Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is
top_k (`int`, defaults to `0`):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
top_k (`int`, defaults to `0`):
top_k (`int`, *optional*, defaults to `0`):

Comment thread trl/trainer/grpo_config.py Outdated
`1.0` to consider all tokens.
top_k (`int`, *optional*):
Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is
top_k (`int`, defaults to `0`):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
top_k (`int`, defaults to `0`):
top_k (`int`, *optional*, defaults to `0`):

Comment thread trl/trainer/rloo_config.py Outdated
`1.0` to consider all tokens.
top_k (`int`, *optional*):
Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is
top_k (`int`, defaults to `0`):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
top_k (`int`, defaults to `0`):
top_k (`int`, *optional*, defaults to `0`):

@albertvillanova albertvillanova merged commit c86be21 into huggingface:main Dec 16, 2025
9 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants