Misc. bug: structured json fails when using enums with Qwen3.5-35B-A3B #20178

@Galunid

Name and Version

version: 8167 (37964f4)
built with GNU 15.2.1 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

./llama-server --models-preset /home/kris/models.ini --port 9090 --host 0.0.0.0  -v

Problem description & steps to reproduce

I'm working on a small dataset-tagging script in which each problem should get an accompanying list of tags (enforced via response_format). The expected output looks like this:
{"tags": ["math", "physics"]}.
With Qwen3.5-35B-A3B, the script instead produces output like the following:
{tags: [{'tag': 'math', 'reason': 'Problem involves game theory[...]'}]}, or {tags: [{'tag': 'math'}]}.

Interestingly, the same script works as intended with Qwen3, so perhaps it's some sampler conflict.

Json schema
{
  "$defs": {
    "TagsEnum": {
      "enum": [
        "math",
        "physics",
        "literature"
      ],
      "title": "TagsEnum",
      "type": "string"
    }
  },
  "additionalProperties": true,
  "properties": {
    "tags": {
      "items": {
        "$ref": "#/$defs/TagsEnum"
      },
      "title": "Tags",
      "type": "array"
    }
  },
  "required": [
    "tags"
  ],
  "title": "Problem",
  "type": "object"
}
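To make the mismatch concrete, here is a minimal stdlib-only sketch that checks a reply against the schema above, including following the `$ref` to `TagsEnum`. The names `SCHEMA`, `resolve` and `check` are illustrative assumptions, and the checker only understands the handful of keywords this schema uses; it is not a general JSON Schema validator.

```python
import json

# The schema from this report, reduced to the keywords the checker understands.
SCHEMA = {
    "$defs": {
        "TagsEnum": {"enum": ["math", "physics", "literature"], "type": "string"}
    },
    "properties": {
        "tags": {"items": {"$ref": "#/$defs/TagsEnum"}, "type": "array"}
    },
    "required": ["tags"],
    "type": "object",
}

PY_TYPES = {"object": dict, "array": list, "string": str}


def resolve(node: dict, root: dict) -> dict:
    """Follow a local '#/a/b' reference, if the node has one."""
    while "$ref" in node:
        target = root
        for part in node["$ref"].lstrip("#/").split("/"):
            target = target[part]
        node = target
    return node


def check(value, node: dict, root: dict) -> bool:
    """Validate `value` against the small keyword subset used by SCHEMA."""
    node = resolve(node, root)
    expected = PY_TYPES.get(node.get("type"))
    if expected is not None and not isinstance(value, expected):
        return False
    if "enum" in node and value not in node["enum"]:
        return False
    if isinstance(value, dict):
        if any(key not in value for key in node.get("required", [])):
            return False
        for key, sub in node.get("properties", {}).items():
            if key in value and not check(value[key], sub, root):
                return False
    if isinstance(value, list) and "items" in node:
        return all(check(item, node["items"], root) for item in value)
    return True


expected = json.loads('{"tags": ["math", "physics"]}')
observed = json.loads('{"tags": [{"tag": "math"}]}')
print(check(expected, SCHEMA, SCHEMA))  # True
print(check(observed, SCHEMA, SCHEMA))  # False
```

The second call fails at the array items: each item resolves to the string enum, and a dict is not a string.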
Python script for easy reproduction
import time
import openai

from enum import Enum

api_url="http://localhost:9090"

client = openai.OpenAI(
    base_url=f"{api_url}/v1",
    api_key = "unused",
)

from pydantic import BaseModel

class TagsEnum(str, Enum):
    math = "math"
    physics = "physics"
    literature = "literature"

class Problem(BaseModel):
    tags: list[TagsEnum]

system_prompt = """I am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:
   math - for when problems need specific mathematic solution
   physics - when problem is a physics one
   literature - for literature related problems

I should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.
I should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.
I should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions."""

task = """The game of NIM

Determine the best strategy for each player in the following two-player game. There
are three piles, each of which contains some number of coins. Players alternate turns,
each turn consisting of removing any (non-zero) number of coins from a single pile.
The goal is to be the person to remove the last coin(s)."""

print(Problem.model_json_schema())

t = {}
for i in range(10):
    print("Starting")
    completion = client.beta.chat.completions.parse(
        # model="Qwen3-1.7B-Q8_0.gguf",
        model="Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ],
        response_format=Problem,
    )

    print(completion.choices[0].message)
    # event = completion.choices[0].message.parsed
    # tt = sorted(event.tags, key=lambda x: x.value)
    # for ev in tt:
    #     t[ev.value] = t.get(ev.value, 0) + 1
    # print(t)
print(t)

This results in the error below (because the output doesn't conform to the pydantic schema):

pydantic_core._pydantic_core.ValidationError: 1 validation error for Problem
tags.0
  Input should be 'math', 'physics' or 'literature' [type=enum, input_value={'tag': 'math', 'reason':...ne a winning strategy.'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/enum
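Because the validation error is raised on the first non-conforming reply, the loop never gets to tally anything. Below is a minimal sketch, assuming you have the raw `message.content` strings in hand, of counting how often each reply shape occurs across runs. `classify_reply` and the `replies` list are hypothetical stand-ins, not part of the OpenAI or pydantic APIs.

```python
import json
from collections import Counter

# Assumption: same values as TagsEnum in the reproduction script.
ALLOWED_TAGS = {"math", "physics", "literature"}


def classify_reply(content: str) -> str:
    """Label a raw reply as 'ok', 'object_items', or 'other'."""
    try:
        tags = json.loads(content)["tags"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return "other"
    if not isinstance(tags, list):
        return "other"
    if all(isinstance(t, str) and t in ALLOWED_TAGS for t in tags):
        return "ok"
    if any(isinstance(t, dict) for t in tags):
        return "object_items"  # the shape seen in this report
    return "other"


# Stand-ins for the content of several completions.
replies = [
    '{"tags": ["math"]}',
    '{"tags": [{"tag": "math", "reason": "game theory"}]}',
    '{"tags": [{"tag": "math"}]}',
]
print(Counter(classify_reply(r) for r in replies))
```

Running the same tally against both models would show whether the object-shaped items appear on every request or only some of them.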
models.ini config
[Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision]
model = /mnt/disk/llms/Qwen3.5-35B-A3B/ggml-model-Q4_K_M.gguf
mmproj = /mnt/disk/llms/Qwen3.5-35B-A3B/mmproj-Qwen3.5-35b-BF16.gguf
mmproj-offload=false
c = 64000
temp = 0.7
top-p = 0.8
top-k = 20
min-p = 0.0
presence-penalty = 1.5
repeat-penalty = 1.0
n-predict = 32768
chat-template-kwargs = {"enable_thinking": false}

[Qwen3-1.7B-Q8_0.gguf]
model = /mnt/disk/llms/Qwen3-1.7B-Q8_0.gguf
c = 16000
temp = 0.7
top-p = 0.8
top-k = 20
min-p = 0.0
repeat-penalty = 1.0

First Bad Commit

No response

Relevant log output

Logs
srv  log_server_r: request:  {"messages":[{"role":"system","content":"I am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:\n   math - for when problems need specific mathematic solution\n   physics - when problem is a physics one\n   literature - for literature related problems\n\nI should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.\nI should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.\nI should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions."},{"role":"user","content":"The game of NIM\n\nDetermine the best strategy for each player in the following two-player game. There\nare three piles, each of which contains some number of coins. Players alternate turns,\neach turn consisting of removing any (non-zero) number of coins from a single pile.\nThe goal is to be the person to remove the last coin(s)."}],"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","response_format":{"type":"json_schema","json_schema":{"schema":{"$defs":{"TagsEnum":{"enum":["math","physics","literature"],"title":"TagsEnum","type":"string"}},"properties":{"tags":{"items":{"$ref":"#/$defs/TagsEnum"},"title":"Tags","type":"array"}},"required":["tags"],"title":"Problem","type":"object","additionalProperties":false},"name":"Problem","strict":true}},"stream":false}
srv  log_server_r: response: 
srv    operator(): client request thread ended
srv    operator(): http: streamed chunk: {"choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"{\"tags\": [\n  {\n    \"tag\": \"math\",\n    \"reason\": \"The problem involves game theory, combinatorics, and binary arithmetic (Nim-sum) to determine a winning strategy.\"\n  }\n]}"}}],"created":1772837616,"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","system_fingerprint":"b8167-37964f44f","object":"chat.completion","usage":{"completion_tokens":51,"prompt_tokens":221,"total_tokens":272},"id":"chatcmpl-4GC82b4TDsrE7UedNgMIIAkqSpe8NERf","__verbose":{"index":0,"content":"{\"tags\": [\n  {\n    \"tag\": \"math\",\n    \"reason\": \"The problem involves game theory, combinatorics, and binary arithmetic (Nim-sum) to determine a winning strategy.\"\n  }\n]}","tokens":[],"id_slot":2,"stop":true,"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","tokens_predicted":51,"tokens_evaluated":221,"generation_settings":{"seed":4294967295,"temperature":0.699999988079071,"dynatemp_range":0.0,"dynatemp_exponent":1.0,"top_k":20,"top_p":0.800000011920929,"min_p":0.0,"top_n_sigma":-1.0,"xtc_probability":0.0,"xtc_threshold":0.10000000149011612,"typical_p":1.0,"repeat_last_n":64,"repeat_penalty":1.0,"presence_penalty":1.5,"frequency_penalty":0.0,"dry_multiplier":0.0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":64000,"dry_sequence_breakers":["\n",":","\"","*"],"mirostat":0,"mirostat_tau":5.0,"mirostat_eta":0.10000000149011612,"stop":[],"max_tokens":32768,"n_predict":32768,"n_keep":0,"n_discard":0,"ignore_eos":false,"stream":false,"logit_bias":[],"n_probs":0,"min_keep":0,"grammar":"array ::= \"[\" space ( value (\",\" space value)* )? 
\"]\" space\nboolean ::= (\"true\" | \"false\") space\nchar ::= [^\"\\\\\\x7F\\x00-\\x1F] | [\\\\] ([\"\\\\bfnrt] | \"u\" [0-9a-fA-F]{4})\ndecimal-part ::= [0-9]{1,16}\nintegral-part ::= [0] | [1-9] [0-9]{0,15}\njson-array ::= \"[\" space (\"]\" | json-value (\",\" space json-value)* space \"]\") space\njson-bool ::= (\"true\" | \"false\") space\njson-null ::= \"null\" space\njson-number ::= \"-\"? (\"0\" | [1-9] [0-9]*) (\".\" [0-9]+)? ((\"e\" | \"E\") [+-]? [0-9]+)? space\njson-object ::= \"{\" space (\"}\" | json-string space \":\" space json-value (space \",\" space json-string space \":\" space json-value)* space \"}\") space\njson-string ::= \"\\\"\" ( [^\"\\\\] | \"\\\\\" ( [\"\\\\/ bfnrt] | \"u\" [0-9a-fA-F]{4} ) )* \"\\\"\" space\njson-value ::= json-object | json-array | json-string | json-number | json-bool | json-null\nnull ::= \"null\" space\nnumber ::= (\"-\"? integral-part) (\".\" decimal-part)? ([eE] [-+]? integral-part)? space\nobject ::= \"{\" space ( string \":\" space value (\",\" space string \":\" space value)* )? \"}\" space\nref-defs-TagsEnum ::= object\nresponse-format ::= \"{\" space response-format-tags-kv \"}\" space\nresponse-format-tags ::= \"[\" space (response-format-tags-item (\",\" space response-format-tags-item)*)? 
\"]\" space\nresponse-format-tags-item ::= ref-defs-TagsEnum\nresponse-format-tags-kv ::= \"\\\"tags\\\"\" space \":\" space response-format-tags\nroot ::= space response-format\nspace ::= | \" \" | \"\\n\"{1,2} [ \\t]{0,20}\nstring ::= \"\\\"\" char* \"\\\"\" space\nvalue ::= object | array | string | number | boolean | null\n","grammar_lazy":false,"grammar_triggers":[{"type":0,"value":"<tool_call>","token":248058}],"preserved_tokens":[248058,248059,248068,248069],"chat_format":"peg-constructed","reasoning_format":"deepseek","reasoning_in_content":false,"thinking_forced_open":false,"samplers":["penalties","dry","top_n_sigma","top_k","typ_p","top_p","min_p","xtc","temperature"],"speculative.n_max":16,"speculative.n_min":0,"speculative.p_min":0.75,"speculative.type":"none","speculative.ngram_size_n":1024,"speculative.ngram_size_m":1024,"speculative.ngram_m_hits":1024,"timings_per_token":false,"post_sampling_probs":false,"backend_sampling":false,"lora":[]},"prompt":"<|im_start|>system
srv    operator(): http: streamed chunk: \nI am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:\n   math - for when problems need specific mathematic solution\n   physics - when problem is a physics one\n   literature - for literature related problems\n\nI should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.\nI should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.\nI should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions.<|im_end|>\n<|im_start|>user\nThe game of NIM\n\nDetermine the best strategy for each player in the following two-player game. There\nare three piles, each of which contains some number of coins. Players alternate turns,\neach turn consisting of removing any (non-zero) number of coins from a single pile.\nThe goal is to be the person to remove the last coin(s).<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n","has_new_line":true,"truncated":false,"stop_type":"eos","stopping_word":"","tokens_cached":271,"timings":{"cache_n":0,"prompt_n":221,"prompt_ms":2328.223,"prompt_per_token_ms":10.534945701357465,"prompt_per_second":94.92217884627031,"predicted_n":51,"predicted_ms":3018.341,"predicted_per_token_ms":59.18315686274509,"predicted_per_second":16.896699213243302}},"timings":{"cache_n":0,"prompt_n":221,"prompt_ms":2328.223,"prompt_per_token_ms":10.534945701357465,"prompt_per_second":94.92217884627031,"predicted_n":51,"predicted_ms":3018.341,"predicted_per_token_ms":59.18315686274509,"predicted_per_second":16.896699213243302}}
srv    operator(): http: stream ended
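
In the `grammar` field logged above, `response-format-tags-item` is defined as `ref-defs-TagsEnum`, which in turn is defined as `object`. Array items are therefore constrained to generic JSON objects, not to the three enum strings. That is consistent with the observed output, though whether it is the root cause is not established here. Below is a minimal sketch for tracing such alias chains in a logged grammar; `parse_rules` and `alias_chain` are hypothetical helpers, and the parser assumes one rule per line.

```python
# Excerpt of the "grammar" field from the verbose log (JSON escapes removed).
GRAMMAR = r'''
object ::= "{" space ( string ":" space value ("," space string ":" space value)* )? "}" space
ref-defs-TagsEnum ::= object
response-format ::= "{" space response-format-tags-kv "}" space
response-format-tags ::= "[" space (response-format-tags-item ("," space response-format-tags-item)*)? "]" space
response-format-tags-item ::= ref-defs-TagsEnum
response-format-tags-kv ::= "\"tags\"" space ":" space response-format-tags
root ::= space response-format
'''


def parse_rules(grammar: str) -> dict[str, str]:
    """Split a GBNF-style grammar into {rule_name: body}, one rule per line."""
    rules = {}
    for line in grammar.strip().splitlines():
        name, sep, body = line.partition("::=")
        if sep:
            rules[name.strip()] = body.strip()
    return rules


def alias_chain(name: str, rules: dict[str, str]) -> list[str]:
    """Follow rules whose whole body is just another rule name."""
    chain = [name]
    while rules.get(chain[-1]) in rules:
        nxt = rules[chain[-1]]
        if nxt in chain:  # guard against cyclic aliases
            break
        chain.append(nxt)
    return chain


rules = parse_rules(GRAMMAR)
print(" -> ".join(alias_chain("response-format-tags-item", rules)))
print(rules["object"])
```

The first line printed is the alias chain ending at `object`; the second is the body of that rule, which accepts arbitrary string keys and values.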
