Closed
Labels
bug: Something that is supposed to be working; but isn't
community-backlog
llm
serve: Ray Serve Related Issue
stability
triage: Needs triage (eg: priority, bug/not-bug, and owning component)
Description
What happened + What you expected to happen
chat_template_kwargs was introduced in #56490 and is currently passed to every processor, even though some processors (e.g., ServeDeploymentProcessor and HttpRequestProcessor) do not support this field. As a result, constructing a ServeDeploymentProcessor via build_llm_processor raises an error.
This issue wasn't caught in tests because they call ProcessorBuilder.build directly, which bypasses the layer where chat_template_kwargs is passed.
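To illustrate the failure mode in isolation, here is a minimal sketch that does not use Ray at all. The builder functions and both dispatch helpers are hypothetical stand-ins, not Ray's actual code, and the exception type Ray raises may differ from the plain `TypeError` shown here. It contrasts forwarding the keyword unconditionally with forwarding it only to builders whose signature declares it, which is one possible direction for a fix.

```python
import inspect


# Hypothetical stand-ins for processor builders; NOT Ray's real functions.
def build_chat_processor(config, chat_template_kwargs=None):
    return {
        "kind": "chat",
        "config": config,
        "chat_template_kwargs": chat_template_kwargs,
    }


def build_serve_deployment_processor(config):
    # This builder does not know about chat_template_kwargs.
    return {"kind": "serve_deployment", "config": config}


def build_unconditionally(builder, config, chat_template_kwargs=None):
    # Mirrors the reported behavior: the kwarg is forwarded to every builder.
    return builder(config, chat_template_kwargs=chat_template_kwargs)


def build_filtered(builder, config, **optional_kwargs):
    # Forward only the optional kwargs that the builder's signature declares.
    accepted = inspect.signature(builder).parameters
    supported = {k: v for k, v in optional_kwargs.items() if k in accepted}
    return builder(config, **supported)


# Unconditional forwarding fails for a builder that lacks the parameter.
try:
    build_unconditionally(build_serve_deployment_processor, "cfg")
    failed = False
except TypeError:
    failed = True

# Filtered forwarding works for both kinds of builder.
serve_proc = build_filtered(
    build_serve_deployment_processor, "cfg", chat_template_kwargs={"a": 1}
)
chat_proc = build_filtered(
    build_chat_processor, "cfg", chat_template_kwargs={"a": 1}
)
```

A test that calls only the individual builder (as the existing tests do with ProcessorBuilder.build) would never exercise `build_unconditionally`, which is why this kind of mismatch can go unnoticed until the top-level entry point is used.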
Versions / Dependencies
master
Reproduction script
# Imports assumed from Ray's public API; the original snippet omitted them.
import ray
from ray.data.llm import ServeDeploymentProcessorConfig, build_llm_processor
from ray.serve.llm.openai_api_models import CompletionRequest


def test_completion_model(model_opt_125m, create_model_opt_125m_deployment):
    # Both arguments are pytest fixtures from the existing test suite.
    deployment_name, app_name = create_model_opt_125m_deployment
    config = ServeDeploymentProcessorConfig(
        deployment_name=deployment_name,
        app_name=app_name,
        dtype_mapping={
            "CompletionRequest": CompletionRequest,
        },
        batch_size=16,
        concurrency=1,
    )
    # Going through build_llm_processor (not ProcessorBuilder.build) triggers the error.
    processor = build_llm_processor(
        config,
        preprocess=lambda row: dict(
            method="completions",
            dtype="CompletionRequest",
            request_kwargs=dict(
                model=model_opt_125m,
                prompt=row["prompt"],
                stream=False,
            ),
        ),
        postprocess=lambda row: dict(
            resp=row["choices"][0]["text"],
        ),
    )
    ds = ray.data.range(60)
    ds = ds.map(lambda x: {"prompt": f"Hello {x['id']}"})
    ds = processor(ds)
    ds = ds.materialize()
    outs = ds.take_all()
    assert len(outs) == 60
    assert all("resp" in out for out in outs)
Issue Severity
None