[Bug] tool_choice with specific function causes model to stall on multi-call prompts #17998

@tugot17

Checklist

  • I searched related issues but found no solution.
  • The bug persists in the latest version.
  • Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
  • If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
  • Please use English. Otherwise, it will be closed.

Describe the bug

Summary

When tool_choice specifies a function, the JSON schema constraint uses maxItems: 1. If the prompt implies multiple tool calls, the model gets stuck generating whitespace instead of closing the array.

Reproducer

Start the server:

python -m sglang.launch_server --model LiquidAI/LFM2.5-1.2B-Instruct --port 30000

Then run this client script:

import openai

client = openai.OpenAI(base_url="http://localhost:30000/v1", api_key="sk-123456")

for i in range(5):
    response = client.chat.completions.create(
        model="LiquidAI/LFM2.5-1.2B-Instruct",
        messages=[{"role": "user", "content": "What is the weather in NYC, LA, and Chicago?"}],
        tools=[{"type": "function", "function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}}}],
        tool_choice={"type": "function", "function": {"name": "get_weather"}},
        max_tokens=100,
    )
    ok = response.choices[0].message.tool_calls is not None
    print(f"{i+1}: {'OK' if ok else 'FAIL'}")
    if not ok:
        print(f"   Content: {repr(response.choices[0].message.content)}")

Output (~50% fail rate):

1: FAIL
   Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}}  \n   \n \n\n\n\n\n\n\n  \n \n\n    \n \n\n \n\n \n\n...'
2: OK
3: FAIL
   Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}}  \n   \n \n\n\n\n\n\n\n  \n \n\n    \n \n\n \n\n \n\n...'
4: OK
5: FAIL
   Content: '[\n  {"name": "get_weather", "parameters": {"city": "NYC"}}  \n   \n \n\n\n\n\n\n\n  \n \n\n    \n \n\n \n\n \n\n...'

What happens

After completing the first tool call {"name": "get_weather", "parameters": {"city": "NYC"}}, the model wants to generate `,` to add calls for LA and Chicago, but maxItems: 1 masks the `,` token. The grammar then permits only whitespace (`\n`, ` `) or the closing `]`. In the failing runs the model never picks `]` and loops on whitespace until it hits max_tokens.
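The masking described above can be illustrated with a toy model. This is a sketch under assumptions, not SGLang's grammar backend: `allowed_after_item` is a hypothetical helper that lists which characters a JSON-array grammar would accept right after a completed item.

```python
from typing import Optional, Set


def allowed_after_item(items_so_far: int, max_items: Optional[int]) -> Set[str]:
    """Toy model: characters a JSON-array grammar accepts right after an item.

    Hypothetical helper for illustration only; real grammar backends work on
    tokens, not single characters.
    """
    # Whitespace and the closing bracket are legal once at least one item exists.
    allowed = {" ", "\n", "]"}
    # A comma is legal only while another item may still follow.
    if max_items is None or items_so_far < max_items:
        allowed.add(",")
    return allowed


with_limit = allowed_after_item(items_so_far=1, max_items=1)
without_limit = allowed_after_item(items_so_far=1, max_items=None)

print("',' allowed with maxItems=1:", "," in with_limit)
print("',' allowed without maxItems:", "," in without_limit)
```

In this toy model the comma disappears from the allowed set as soon as the limit is reached, which leaves a model that still "wants" more items with only whitespace or `]` to choose from.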

Root Cause

In sglang/srt/function_call/utils.py:

if isinstance(tool_choice, ToolChoice):
    return {
        "type": "array",
        "minItems": 1,
        "maxItems": 1,  # <-- masks `,` token, causes whitespace stall
        "items": _get_tool_schema(tool),
    }
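To make the conflict concrete, here is a minimal sketch of the constraint next to the answer the prompt implies. The item schema is an assumption inferred from the reproducer output (a `name` plus `parameters` object), not a copy of what `_get_tool_schema` returns, and `build_constraint` and `length_ok` are hypothetical helpers.

```python
def build_constraint(tool_name: str, parameters: dict) -> dict:
    """Simplified stand-in for the array constraint shown above."""
    return {
        "type": "array",
        "minItems": 1,
        "maxItems": 1,
        # Assumed item shape, inferred from the reproducer output.
        "items": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "enum": [tool_name]},
                "parameters": parameters,
            },
            "required": ["name", "parameters"],
        },
    }


def length_ok(schema: dict, calls: list) -> bool:
    """Check only the minItems/maxItems part of the schema."""
    too_few = len(calls) < schema.get("minItems", 0)
    too_many = "maxItems" in schema and len(calls) > schema["maxItems"]
    return not (too_few or too_many)


schema = build_constraint(
    "get_weather",
    {"type": "object", "properties": {"city": {"type": "string"}}},
)
# The answer the prompt implies: one call per city.
wanted = [
    {"name": "get_weather", "parameters": {"city": city}}
    for city in ("NYC", "LA", "Chicago")
]

print("one call fits:", length_ok(schema, wanted[:1]))
print("three calls fit:", length_ok(schema, wanted))
```

The single-call prefix satisfies the constraint while the full three-call answer does not, so constrained decoding can never emit what the prompt asks for.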

Suggested Fix

Remove maxItems: 1 so the model can emit several calls to the forced function, or keep the limit and add system prompt guidance that only one call is expected.
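Until a server-side fix lands, the second option could be approximated from the client. The sketch below is an untested workaround idea, not a confirmed fix: the hint wording and the `with_single_call_hint` helper are hypothetical, and a small model may still ignore the instruction.

```python
# Hypothetical wording; adjust to whatever the model responds to best.
SINGLE_CALL_HINT = (
    "Call the requested function exactly once. "
    "If several calls seem necessary, make only the first one."
)


def with_single_call_hint(messages: list) -> list:
    """Return a copy of `messages` with a system hint prepended."""
    return [{"role": "system", "content": SINGLE_CALL_HINT}, *messages]


messages = with_single_call_hint(
    [{"role": "user", "content": "What is the weather in NYC, LA, and Chicago?"}]
)
print(messages[0]["role"], "->", messages[0]["content"])
```

The resulting `messages` list would replace the `messages=` argument in the reproducer's `client.chat.completions.create(...)` call, leaving `tools` and `tool_choice` unchanged.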

Reproduction

As above

Environment

sglang==0.5.8 and that's it
