Fix top_k default value to 0 for disabling top-k filtering#4695
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
We should use 0, see #3494 -- Can you also update the docstring? like here |
|
I am fixing vLLM |
|
After checking vLLM, |
| "temperature": self.temperature, | ||
| "top_p": self.top_p, | ||
| "top_k": -1 if self.top_k is None else self.top_k, | ||
| "top_k": -1 if not self.top_k else self.top_k, |
There was a problem hiding this comment.
I think can (should) just use self.top_k, and don't replace by -1
There was a problem hiding this comment.
Yes, the vLLM support to consider all tokens by setting top_k=0 (besides -1) was introduced in vllm-0.9.0
and we require vllm>=0.10.2.
There was a problem hiding this comment.
On the other hand, what if the users still set top_k=None? Maybe we should start a deprecation cycle.
There was a problem hiding this comment.
we could indeed deprecate None for 1 version before removal. I'd do it only for GRPO and RLOO though
| `1.0` to consider all tokens. | ||
| top_k (`int`, *optional*): | ||
| Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is | ||
| top_k (`int`, defaults to `0`): |
There was a problem hiding this comment.
| top_k (`int`, defaults to `0`): | |
| top_k (`int`, *optional*, defaults to `0`): |
| `1.0` to consider all tokens. | ||
| top_k (`int`, *optional*): | ||
| Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is | ||
| top_k (`int`, defaults to `0`): |
There was a problem hiding this comment.
| top_k (`int`, defaults to `0`): | |
| top_k (`int`, *optional*, defaults to `0`): |
| `1.0` to consider all tokens. | ||
| top_k (`int`, *optional*): | ||
| Number of highest probability vocabulary tokens to keep for top-k-filtering. If `None`, top-k-filtering is | ||
| top_k (`int`, defaults to `0`): |
There was a problem hiding this comment.
| top_k (`int`, defaults to `0`): | |
| top_k (`int`, *optional*, defaults to `0`): |
Fix top_k default value to 0 for disabling top-k, instead of ambiguous (and non-documented)
None.See related comment:
Nonetransformers#42702 (comment)For context, I was testing this fix in
transformers:Nonetransformers#42702and I discovered this edge case for
tests/test_grpo_trainer.py::TestGRPOTrainer::test_training_vlm[trl-internal-testing/tiny-Qwen2VLForConditionalGeneration]:top_k=1in itsgeneration_config.jsontop_k=Noneby default, so k-filtering is disabledNone, it gets silently overwritten to1Therefore, we need this PR so the transformers fix works as expected once merged.
Not sure about the expected default behavior:
None, saying this disables top-k filtering50Should I set 50 instead of 0 as our default?