TemplateException when combining system message with response_format json_object (Qwen3.5) #10

@autholykos

Description


Combining a system message with response_format: {"type": "json_object"} in the chat completions API causes:

TemplateException(message: Optional('System message must be at the beginning.'))

Without response_format, the same system + user message works fine.

Reproduction

# Works:
curl -s http://localhost:8002/v1/chat/completions -H 'Content-Type: application/json' \
  -d '{"model":"qwen3.5-9b-mlx","messages":[{"role":"system","content":"Extract facts."},{"role":"user","content":"Raoh supervises Juza."}],"max_tokens":50}'

# Fails:
curl -s http://localhost:8002/v1/chat/completions -H 'Content-Type: application/json' \
  -d '{"model":"qwen3.5-9b-mlx","messages":[{"role":"system","content":"Extract facts."},{"role":"user","content":"Raoh supervises Juza."}],"max_tokens":50,"response_format":{"type":"json_object"}}'
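Until this is fixed server-side, one possible client-side workaround is to fold the system prompt into the first user message before sending, so the request carries no system role for the template to reject. This is only a sketch: `merge_system_into_user` is a hypothetical helper, and whether dropping the system role actually avoids the TemplateException has not been confirmed against SwiftLM.

```python
def merge_system_into_user(messages):
    """Fold a leading system message into the first user message.

    Hypothetical workaround: prepend the system prompt to the first
    user turn so the payload contains no "system" role.
    """
    if not messages or messages[0].get("role") != "system":
        return [dict(m) for m in messages]
    system_content = messages[0]["content"]
    rest = [dict(m) for m in messages[1:]]
    for m in rest:
        if m["role"] == "user":
            m["content"] = f"{system_content}\n\n{m['content']}"
            break
    else:
        # No user message to merge into; re-emit the prompt as a user turn.
        rest.append({"role": "user", "content": system_content})
    return rest

messages = [
    {"role": "system", "content": "Extract facts."},
    {"role": "user", "content": "Raoh supervises Juza."},
]
print(merge_system_into_user(messages))
```

The transformed messages list can then be sent with `response_format: {"type": "json_object"}` as in the failing curl above.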

Impact

Blocks Mem0 and Graphiti (and likely other OpenAI-compatible clients) from working with SwiftLM out of the box, since both send a system message together with response_format: json_object.

Environment

  • macOS, M2 Ultra
  • Qwen3.5-9B-MLX-4bit, Qwen3.5-35B-A3B-8bit
  • SwiftLM latest release build
