
[data][llm] Cannot pass enable_thinking to chat_template #56384

@bobingm

Description

What happened + What you expected to happen

When using Qwen/Qwen3-8B with ray.data.llm, there is no way to turn off enable_thinking.

The ChatTemplateUDF does not pass any keyword arguments down to the tokenizer's apply_chat_template call:

https://github.com/ray-project/ray/blob/master/python/ray/llm/_internal/batch/stages/chat_template_stage.py#L81
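For illustration, the kind of pass-through that would resolve this is sketched below. This is not Ray's actual code; the standalone function and the `chat_template_kwargs` parameter are assumed names, showing how extra kwargs (such as Qwen3's `enable_thinking`) could be forwarded to the tokenizer:

```python
# Hypothetical sketch, not Ray's implementation: forward user-supplied
# chat-template kwargs down to the tokenizer's apply_chat_template call.
def apply_chat_template(tokenizer, messages, chat_template_kwargs=None):
    """Render messages to a prompt string, forwarding extra template kwargs.

    `chat_template_kwargs` is an assumed parameter name; Ray's ChatTemplateUDF
    currently exposes no such hook.
    """
    kwargs = chat_template_kwargs or {}
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        **kwargs,  # e.g. {"enable_thinking": False} for Qwen3 chat templates
    )
```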

Versions / Dependencies

Python

Reproduction script

from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

config = vLLMEngineProcessorConfig(
    model_source="Qwen/Qwen3-8B",
    concurrency=1,
    apply_chat_template=True,
    tokenize=False,
)

processor = build_llm_processor(
    config,
    preprocess=lambda row: dict(
        messages=[
            {"role": "system", "content": "You are a bot that responds with haikus."},
            {"role": "user", "content": row["item"]}
        ],
        sampling_params=dict(
            temperature=0.3,
            max_tokens=250,
        )
    ),
    postprocess=lambda row: dict(
        answer=row["generated_text"],
        **row  # This will return all the original columns in the dataset.
    ),
)
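What would address the issue is a config-level way to pass template kwargs through to the tokenizer, along these lines. Note that `chat_template_kwargs` is a hypothetical field used for illustration, not an existing option:

```python
# Hypothetical config fragment: `chat_template_kwargs` does NOT exist today.
config = vLLMEngineProcessorConfig(
    model_source="Qwen/Qwen3-8B",
    concurrency=1,
    apply_chat_template=True,
    chat_template_kwargs=dict(enable_thinking=False),  # assumed field name
    tokenize=False,
)
```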

Issue Severity

Low: It annoys or frustrates me.


Labels

bug: Something that is supposed to be working; but isn't
community-backlog
data: Ray Data-related issues
good-first-issue: Great starter issue for someone just starting to contribute to Ray
llm
triage: Needs triage (eg: priority, bug/not-bug, and owning component)
usability
