Fix top_k default value to 0 for disabling top-k filtering by albertvillanova · Pull Request #4695 · huggingface/trl

albertvillanova · 2025-12-15T12:42:14Z

Fix the top_k default value to 0 for disabling top-k filtering, instead of the ambiguous (and undocumented) None.

See related comment:

🚨 Generation config defaults are now None transformers#42702 (comment)

For context, I was testing this fix in transformers:

🚨 Generation config defaults are now None transformers#42702

and I discovered this edge case for tests/test_grpo_trainer.py::TestGRPOTrainer::test_training_vlm[trl-internal-testing/tiny-Qwen2VLForConditionalGeneration]:

- the model has top_k=1 in its generation_config.json
- we set top_k=None by default, intending to disable top-k filtering
- because our value is None, it is silently overwritten by the model's value of 1
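
The interaction above can be sketched with a simplified merge, assuming (as the edge case suggests) that a value of `None` is treated as "unset" and falls back to the model's `generation_config.json`. The helper `merge_generation_kwargs` and the dict layout are hypothetical and only illustrate the behavior; this is not the transformers implementation.

```python
def merge_generation_kwargs(model_defaults: dict, user_kwargs: dict) -> dict:
    """Hypothetical merge: a user value of None means "unset", so the model default wins."""
    merged = dict(model_defaults)
    for name, value in user_kwargs.items():
        if value is not None:  # None is indistinguishable from "not provided"
            merged[name] = value
    return merged


# The tiny Qwen2-VL test model ships top_k=1 in its generation_config.json.
model_defaults = {"top_k": 1}

# Old default: None was meant to disable top-k, but the model's value wins.
old = merge_generation_kwargs(model_defaults, {"top_k": None})

# New default: 0 is an explicit value, so it overrides the model's top_k=1.
new = merge_generation_kwargs(model_defaults, {"top_k": 0})

print(old["top_k"], new["top_k"])
```

Under this assumption, only an explicit integer such as 0 reliably expresses "disable top-k".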

Therefore, we need this PR so the transformers fix works as expected once merged.

Not sure about the expected default behavior:

- until now, we set the default value to None and documented that this disables top-k filtering
- however, the transformers default value is 50

Should I set 50 instead of 0 as our default?
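
For reference, here is a minimal pure-Python sketch of top-k filtering that uses 0 as the "disabled" sentinel. `top_k_filter` is a hypothetical helper written for illustration; real implementations operate on tensors, and treating negative values as disabled is an assumption here.

```python
import math


def top_k_filter(logits: list[float], top_k: int) -> list[float]:
    """Keep the top_k largest logits and mask the rest with -inf.

    top_k <= 0 (or top_k >= vocabulary size) leaves the logits unchanged.
    """
    if top_k <= 0 or top_k >= len(logits):
        return list(logits)
    # Value of the k-th largest logit; everything below it is masked out.
    threshold = sorted(logits, reverse=True)[top_k - 1]
    return [x if x >= threshold else -math.inf for x in logits]


logits = [2.0, 0.5, 1.0, -1.0]

unfiltered = top_k_filter(logits, 0)  # disabled: all tokens stay candidates
greedy_like = top_k_filter(logits, 1)  # only the single best token survives
```

With `top_k=1` only one candidate remains, which is why a silently applied model value of 1 changes sampling so drastically compared with a disabled filter.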

HuggingFaceDocBuilderDev · 2025-12-15T12:44:47Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2025-12-15T15:51:32Z

> Should I set 50 instead of 0 as our default?

We should use 0, see #3494

--

Can you also update the docstring? Like here:

https://github.com/albertvillanova/trl/blob/21cf0a9a9680b9c41128828eb04a827e0565b6c0/trl/trainer/grpo_config.py#L84-L86

albertvillanova · 2025-12-15T16:05:21Z

I am fixing vLLM top_k...

qgallouedec · 2025-12-15T16:08:29Z

After checking vLLM, 0 is now a valid value for top_k. The most recent version that disallowed 0 was 0.8.5, see

https://github.com/vllm-project/vllm/blob/3015d5634e74d59704e2b39bab0dbe2e6f86a38a/vllm/sampling_params.py#L411-L413

qgallouedec · 2025-12-15T16:12:31Z

                "temperature": self.temperature,
                "top_p": self.top_p,
-                "top_k": -1 if self.top_k is None else self.top_k,
+                "top_k": -1 if not self.top_k else self.top_k,


I think we can (and should) just use self.top_k and not replace it with -1

https://github.com/vllm-project/vllm/blob/855b101d75d2fc1fa02a47a6fcfa4053e8541cf0/vllm/sampling_params.py#L393C5-L393C54

Yes, vLLM support for considering all tokens by setting top_k=0 (in addition to -1) was introduced in vllm-0.9.0

Change top_k to be disabled with 0 (still accept -1 for now) vllm-project/vllm#17773

and we require vllm>=0.10.2.

On the other hand, what if the users still set top_k=None? Maybe we should start a deprecation cycle.

We could indeed deprecate None for one version before removal. I'd do it only for GRPO and RLOO, though.
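
A deprecation cycle along these lines could look like the sketch below. `SketchConfig` is a hypothetical stand-in, not the TRL config class, and both the warning category and the message wording are assumptions. The `sampling_kwargs` method reflects the suggestion above to forward `top_k` unchanged rather than mapping it to -1.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a trainer config; not the TRL class."""

    temperature: float = 1.0
    top_p: float = 1.0
    top_k: Optional[int] = 0  # 0 disables top-k filtering

    def __post_init__(self):
        if self.top_k is None:
            # Backward compatibility: still accept None, but steer users to 0.
            warnings.warn(
                "`top_k=None` is deprecated; use `top_k=0` to disable top-k filtering.",
                FutureWarning,
                stacklevel=2,
            )
            self.top_k = 0

    def sampling_kwargs(self) -> dict:
        # top_k is forwarded as-is: per the thread, vLLM >= 0.9.0 accepts 0
        # as "consider all tokens", so no conversion to -1 is needed.
        return {"temperature": self.temperature, "top_p": self.top_p, "top_k": self.top_k}


# Usage: None still works during the deprecation window but is normalized to 0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = SketchConfig(top_k=None)
```

Normalizing `None` to 0 in one place keeps downstream code free of `None` checks while existing user configs continue to work until removal.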

qgallouedec

Thanks! Just a few nits

qgallouedec · 2025-12-15T19:17:37Z

            `1.0` to consider all tokens.
-        top_k (`int`, *optional*):
-            Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is
+        top_k (`int`, defaults to `0`):


Suggested change

-        top_k (`int`, defaults to `0`):
+        top_k (`int`, *optional*, defaults to `0`):

qgallouedec · 2025-12-15T19:18:04Z

Same suggested change as above.

qgallouedec · 2025-12-15T19:18:43Z

Same suggested change as above.

Fix top_k default value to 0 for disabling top-k

21cf0a9

Update docstrings

8ad0480

Fix vLLM top_k for config.top_k either 0 or None (BC)

6f1192e

qgallouedec reviewed Dec 15, 2025

albertvillanova added 4 commits December 15, 2025 17:54

Pass top_k=0 to vLLM w/o converting to -1

8645da5

Raise deprecation warning

71bf5b2

Remove deprecation warning from experimental OnlineDPOConfig

4376906

Reduce removal version to 0.28

986f7c2

qgallouedec approved these changes Dec 15, 2025

Apply review suggestions

b02c444

albertvillanova merged commit c86be21 into huggingface:main Dec 16, 2025
9 of 11 checks passed

qgallouedec mentioned this pull request Dec 18, 2025

Fix test assertion for top_k parameter in OnlineDPOTrainer #4714

Merged

Datta0 mentioned this pull request Jan 26, 2026

[trl] vllm trl topk fixup unslothai/unsloth#3935

Merged

This was referenced Jan 29, 2026

Update wordle.py example with masking of env tokens #4895

Merged

Set default top_k to 0 in VLLMClient #4927

Merged

songhappy pushed a commit to songhappy/trl that referenced this pull request Apr 20, 2026

Fix top_k default value to 0 for disabling top-k filtering (huggingface#4695)

46a9c4d

albertvillanova merged 8 commits into huggingface:main from albertvillanova:fix-top-k-disabling-value
