Checklist
Describe the bug
Summary
When `tool_choice` specifies a function, the JSON schema constraint uses `maxItems: 1`. If the prompt implies multiple tool calls, the model gets stuck generating whitespace instead of closing the array.
Reproducer
```shell
python -m sglang.launch_server --model LiquidAI/LFM2.5-1.2B-Instruct --port 30000
```

```python
import openai

client = openai.OpenAI(base_url="http://localhost:30000/v1", api_key="sk-123456")

for i in range(5):
    response = client.chat.completions.create(
        model="LiquidAI/LFM2.5-1.2B-Instruct",
        messages=[{"role": "user", "content": "What is the weather in NYC, LA, and Chicago?"}],
        tools=[{"type": "function", "function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}],
        tool_choice={"type": "function", "function": {"name": "get_weather"}},
        max_tokens=100,
    )
    ok = response.choices[0].message.tool_calls is not None
    print(f"{i+1}: {'OK' if ok else 'FAIL'}")
    if not ok:
        print(f"  Content: {repr(response.choices[0].message.content)}")
```
Output (~50% fail rate):
```text
1: FAIL
  Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}} \n \n \n\n\n\n\n\n\n \n \n\n \n \n\n \n\n \n\n...'
2: OK
3: FAIL
  Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}} \n \n \n\n\n\n\n\n\n \n \n\n \n \n\n \n\n \n\n...'
4: OK
5: FAIL
  Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}} \n \n \n\n\n\n\n\n\n \n \n\n \n \n\n \n\n \n\n...'
```
What happens
After completing the first tool call `{"name": "get_weather", "parameters": {"city": "NYC"}}`, the model wants to generate `,` to add calls for LA and Chicago. But `maxItems: 1` masks the `,` token. The only valid tokens left are whitespace (`\n`, spaces), so the model loops on whitespace until `max_tokens` is exhausted.
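The stall can be illustrated with a toy model of the grammar constraint. This is a hypothetical simplification for intuition only, not sglang's actual constrained-decoding implementation; `allowed_tokens` is an invented helper:

```python
def allowed_tokens(items_emitted: int, max_items: int) -> set[str]:
    """Toy model: tokens a grammar constraint permits right after a
    completed array item, given a maxItems bound on the array."""
    allowed = {"]", "\n", " "}       # closing the array and whitespace are always legal
    if items_emitted < max_items:    # `,` is only legal if another item may follow
        allowed.add(",")
    return allowed

# With maxItems: 1, the comma is masked after the first item:
print(allowed_tokens(items_emitted=1, max_items=1))   # no "," in the set
```

If the model's probability mass sits on `,` (because the prompt implies more calls) and `,` is masked, sampling falls through to whatever remains, which here is whitespace.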
Root Cause
In `sglang/srt/function_call/utils.py`:

```python
if isinstance(tool_choice, ToolChoice):
    return {
        "type": "array",
        "minItems": 1,
        "maxItems": 1,  # <-- masks `,` token, causes whitespace stall
        "items": _get_tool_schema(tool),
    }
```
Suggested Fix
Remove `maxItems: 1` from the generated schema, or add system-prompt guidance telling the model that only one call is expected.
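A sketch of the relaxed schema the first option would produce. `tool_array_schema` is a hypothetical helper name for illustration, not the actual function in `utils.py`:

```python
def tool_array_schema(tool_schema: dict) -> dict:
    """Constrain output to an array of calls to the chosen tool.

    Keeps minItems: 1 so at least one call is still forced, but drops
    maxItems: 1 so `,` stays a legal token after the first item: the
    model can either emit further (parallel) calls or close with `]`.
    """
    return {
        "type": "array",
        "minItems": 1,
        "items": tool_schema,
    }
```

A trade-off to note: without `maxItems`, `tool_choice` no longer strictly guarantees exactly one call, so callers relying on that invariant would need to handle multiple entries in `tool_calls`.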
Reproduction
As above
Environment
sglang==0.5.8