Name and Version
version: 8167 (37964f4)
built with GNU 15.2.1 for Linux x86_64
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
./llama-server --models-preset /home/kris/models.ini --port 9090 --host 0.0.0.0 -v
Problem description & steps to reproduce
I'm working on a small dataset-tagging script in which each problem should get an accompanying list of tags (enforced via response_format). The expected output looks like this:
{"tags": ["math", "physics"]}
With Qwen3.5-35B-A3B, the script instead produces output like:
{tags: [{'tag': 'math', 'reason': 'Problem involves game theory[...]'}]} or {tags: [{'tag': 'math'}]}
Interestingly, with Qwen3 it works as expected, so perhaps it's some sampler conflict.
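To make the mismatch concrete, here is a minimal standard-library sketch of the check the schema is meant to enforce. The helper name `conforms` and the `ALLOWED_TAGS` set are illustrative and not part of the original script; the two sample payloads mirror the expected and observed shapes described above.

```python
import json

# Allowed values, taken from the TagsEnum definition in this report.
ALLOWED_TAGS = {"math", "physics", "literature"}


def conforms(raw: str) -> bool:
    """Return True if raw is a JSON object whose "tags" is a list of allowed strings."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    tags = data.get("tags")
    if not isinstance(tags, list):
        return False
    # The isinstance check runs first, so dict items are rejected without hashing them.
    return all(isinstance(tag, str) and tag in ALLOWED_TAGS for tag in tags)


expected = '{"tags": ["math", "physics"]}'
observed = '{"tags": [{"tag": "math"}]}'

print(conforms(expected))  # True
print(conforms(observed))  # False
```

Each array item in the observed output is an object rather than one of the enum strings, which is the same condition pydantic rejects later in this report.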
Json schema
{
"$defs": {
"TagsEnum": {
"enum": [
"math",
"physics",
"literature"
],
"title": "TagsEnum",
"type": "string"
}
},
"additionalProperties": true,
"properties": {
"tags": {
"items": {
"$ref": "#/$defs/TagsEnum"
},
"title": "Tags",
"type": "array"
}
},
"required": [
"tags"
],
"title": "Problem",
"type": "object"
}
Python script for easy reproduction
import time
from enum import Enum

import openai
from pydantic import BaseModel

api_url = "http://localhost:9090"
client = openai.OpenAI(
    base_url=f"{api_url}/v1",
    api_key="unused",
)

class TagsEnum(str, Enum):
    math = "math"
    physics = "physics"
    literature = "literature"

class Problem(BaseModel):
    tags: list[TagsEnum]

system_prompt = """I am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:
math - for when problems need specific mathematic solution
physics - when problem is a physics one
literature - for literature related problems
I should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.
I should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.
I should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions."""
task = """The game of NIM
Determine the best strategy for each player in the following two-player game. There
are three piles, each of which contains some number of coins. Players alternate turns,
each turn consisting of removing any (non-zero) number of coins from a single pile.
The goal is to be the person to remove the last coin(s)."""

print(Problem.model_json_schema())
t = {}
for i in range(10):
    print("Starting")
    completion = client.beta.chat.completions.parse(
        # model="Qwen3-1.7B-Q8_0.gguf",
        model="Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ],
        response_format=Problem,
    )
    print(completion.choices[0].message)
    # event = completion.choices[0].message.parsed
    # tt = sorted(event.tags, key=lambda x: x.value)
    # for ev in tt:
    #     t[ev.value] = t.get(ev.value, 0) + 1
    # print(t)
print(t)
This results in the error below (because the output doesn't conform to pydantic's schema):
pydantic_core._pydantic_core.ValidationError: 1 validation error for Problem
tags.0
Input should be 'math', 'physics' or 'literature' [type=enum, input_value={'tag': 'math', 'reason':...ne a winning strategy.'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.11/v/enum
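Because the schema reaches the enum through a `$ref` into `$defs`, one experiment that may help narrow this down is to send an equivalent schema with the reference inlined. The sketch below is an untested idea, not a confirmed workaround. The helper `inline_local_refs` is hypothetical; it assumes only local, non-recursive `#/$defs/...` references and treats every `$defs` key as a definitions block.

```python
import copy


def inline_local_refs(schema: dict) -> dict:
    """Return a copy of schema with local "#/$defs/..." references replaced by their targets.

    Assumes references are local and non-recursive; a recursive schema would loop forever.
    """
    defs = schema.get("$defs", {})
    prefix = "#/$defs/"

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                target = copy.deepcopy(defs[ref[len(prefix):]])
                siblings = {k: v for k, v in node.items() if k != "$ref"}
                # Keys written next to "$ref" override the referenced definition.
                return {**resolve(target), **resolve(siblings)}
            # Drop "$defs" blocks, since nothing refers to them after inlining.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Schema as printed by Problem.model_json_schema() in this report (abridged to the relevant keys).
schema = {
    "$defs": {
        "TagsEnum": {
            "enum": ["math", "physics", "literature"],
            "title": "TagsEnum",
            "type": "string",
        }
    },
    "properties": {
        "tags": {"items": {"$ref": "#/$defs/TagsEnum"}, "title": "Tags", "type": "array"}
    },
    "required": ["tags"],
    "title": "Problem",
    "type": "object",
}

flat = inline_local_refs(schema)
print(flat["properties"]["tags"]["items"])
```

The flattened schema could then be sent as a plain `response_format` dictionary of the same `{"type": "json_schema", "json_schema": {...}}` shape that appears in the request log, to see whether the output changes.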
models.ini config
[Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision]
model = /mnt/disk/llms/Qwen3.5-35B-A3B/ggml-model-Q4_K_M.gguf
mmproj = /mnt/disk/llms/Qwen3.5-35B-A3B/mmproj-Qwen3.5-35b-BF16.gguf
mmproj-offload=false
c = 64000
temp = 0.7
top-p = 0.8
top-k = 20
min-p = 0.0
presence-penalty = 1.5
repeat-penalty = 1.0
n-predict = 32768
chat-template-kwargs = {"enable_thinking": false}
[Qwen3-1.7B-Q8_0.gguf]
model = /mnt/disk/llms/Qwen3-1.7B-Q8_0.gguf
c = 16000
temp = 0.7
top-p = 0.8
top-k = 20
min-p = 0.0
repeat-penalty = 1.0
First Bad Commit
No response
Relevant log output
Logs
srv log_server_r: request: {"messages":[{"role":"system","content":"I am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:\n math - for when problems need specific mathematic solution\n physics - when problem is a physics one\n literature - for literature related problems\n\nI should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.\nI should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.\nI should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions."},{"role":"user","content":"The game of NIM\n\nDetermine the best strategy for each player in the following two-player game. There\nare three piles, each of which contains some number of coins. Players alternate turns,\neach turn consisting of removing any (non-zero) number of coins from a single pile.\nThe goal is to be the person to remove the last coin(s)."}],"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","response_format":{"type":"json_schema","json_schema":{"schema":{"$defs":{"TagsEnum":{"enum":["math","physics","literature"],"title":"TagsEnum","type":"string"}},"properties":{"tags":{"items":{"$ref":"#/$defs/TagsEnum"},"title":"Tags","type":"array"}},"required":["tags"],"title":"Problem","type":"object","additionalProperties":false},"name":"Problem","strict":true}},"stream":false}
srv log_server_r: response:
srv operator(): client request thread ended
srv operator(): http: streamed chunk: {"choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"{\"tags\": [\n {\n \"tag\": \"math\",\n \"reason\": \"The problem involves game theory, combinatorics, and binary arithmetic (Nim-sum) to determine a winning strategy.\"\n }\n]}"}}],"created":1772837616,"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","system_fingerprint":"b8167-37964f44f","object":"chat.completion","usage":{"completion_tokens":51,"prompt_tokens":221,"total_tokens":272},"id":"chatcmpl-4GC82b4TDsrE7UedNgMIIAkqSpe8NERf","__verbose":{"index":0,"content":"{\"tags\": [\n {\n \"tag\": \"math\",\n \"reason\": \"The problem involves game theory, combinatorics, and binary arithmetic (Nim-sum) to determine a winning strategy.\"\n }\n]}","tokens":[],"id_slot":2,"stop":true,"model":"Qwen3.5-35B-A3B-Q4_K_M:Instruct-General-Vision","tokens_predicted":51,"tokens_evaluated":221,"generation_settings":{"seed":4294967295,"temperature":0.699999988079071,"dynatemp_range":0.0,"dynatemp_exponent":1.0,"top_k":20,"top_p":0.800000011920929,"min_p":0.0,"top_n_sigma":-1.0,"xtc_probability":0.0,"xtc_threshold":0.10000000149011612,"typical_p":1.0,"repeat_last_n":64,"repeat_penalty":1.0,"presence_penalty":1.5,"frequency_penalty":0.0,"dry_multiplier":0.0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":64000,"dry_sequence_breakers":["\n",":","\"","*"],"mirostat":0,"mirostat_tau":5.0,"mirostat_eta":0.10000000149011612,"stop":[],"max_tokens":32768,"n_predict":32768,"n_keep":0,"n_discard":0,"ignore_eos":false,"stream":false,"logit_bias":[],"n_probs":0,"min_keep":0,"grammar":"array ::= \"[\" space ( value (\",\" space value)* )? 
\"]\" space\nboolean ::= (\"true\" | \"false\") space\nchar ::= [^\"\\\\\\x7F\\x00-\\x1F] | [\\\\] ([\"\\\\bfnrt] | \"u\" [0-9a-fA-F]{4})\ndecimal-part ::= [0-9]{1,16}\nintegral-part ::= [0] | [1-9] [0-9]{0,15}\njson-array ::= \"[\" space (\"]\" | json-value (\",\" space json-value)* space \"]\") space\njson-bool ::= (\"true\" | \"false\") space\njson-null ::= \"null\" space\njson-number ::= \"-\"? (\"0\" | [1-9] [0-9]*) (\".\" [0-9]+)? ((\"e\" | \"E\") [+-]? [0-9]+)? space\njson-object ::= \"{\" space (\"}\" | json-string space \":\" space json-value (space \",\" space json-string space \":\" space json-value)* space \"}\") space\njson-string ::= \"\\\"\" ( [^\"\\\\] | \"\\\\\" ( [\"\\\\/ bfnrt] | \"u\" [0-9a-fA-F]{4} ) )* \"\\\"\" space\njson-value ::= json-object | json-array | json-string | json-number | json-bool | json-null\nnull ::= \"null\" space\nnumber ::= (\"-\"? integral-part) (\".\" decimal-part)? ([eE] [-+]? integral-part)? space\nobject ::= \"{\" space ( string \":\" space value (\",\" space string \":\" space value)* )? \"}\" space\nref-defs-TagsEnum ::= object\nresponse-format ::= \"{\" space response-format-tags-kv \"}\" space\nresponse-format-tags ::= \"[\" space (response-format-tags-item (\",\" space response-format-tags-item)*)? 
\"]\" space\nresponse-format-tags-item ::= ref-defs-TagsEnum\nresponse-format-tags-kv ::= \"\\\"tags\\\"\" space \":\" space response-format-tags\nroot ::= space response-format\nspace ::= | \" \" | \"\\n\"{1,2} [ \\t]{0,20}\nstring ::= \"\\\"\" char* \"\\\"\" space\nvalue ::= object | array | string | number | boolean | null\n","grammar_lazy":false,"grammar_triggers":[{"type":0,"value":"<tool_call>","token":248058}],"preserved_tokens":[248058,248059,248068,248069],"chat_format":"peg-constructed","reasoning_format":"deepseek","reasoning_in_content":false,"thinking_forced_open":false,"samplers":["penalties","dry","top_n_sigma","top_k","typ_p","top_p","min_p","xtc","temperature"],"speculative.n_max":16,"speculative.n_min":0,"speculative.p_min":0.75,"speculative.type":"none","speculative.ngram_size_n":1024,"speculative.ngram_size_m":1024,"speculative.ngram_m_hits":1024,"timings_per_token":false,"post_sampling_probs":false,"backend_sampling":false,"lora":[]},"prompt":"<|im_start|>system
srv operator(): http: streamed chunk: \nI am currently working on tagging dataset. I should tag all the problems accordingly using only tags in this list:\n math - for when problems need specific mathematic solution\n physics - when problem is a physics one\n literature - for literature related problems\n\nI should also pay attention to the way I format the tags. My response should be only tags from the first category separated using comma.\nI should also pay attention to the way tags are desribed and apply them only when they match the correct criteria.\nI should also pay attention to the way tags are described and make sure all the relevant tags are applied based on their descriptions.<|im_end|>\n<|im_start|>user\nThe game of NIM\n\nDetermine the best strategy for each player in the following two-player game. There\nare three piles, each of which contains some number of coins. Players alternate turns,\neach turn consisting of removing any (non-zero) number of coins from a single pile.\nThe goal is to be the person to remove the last coin(s).<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n","has_new_line":true,"truncated":false,"stop_type":"eos","stopping_word":"","tokens_cached":271,"timings":{"cache_n":0,"prompt_n":221,"prompt_ms":2328.223,"prompt_per_token_ms":10.534945701357465,"prompt_per_second":94.92217884627031,"predicted_n":51,"predicted_ms":3018.341,"predicted_per_token_ms":59.18315686274509,"predicted_per_second":16.896699213243302}},"timings":{"cache_n":0,"prompt_n":221,"prompt_ms":2328.223,"prompt_per_token_ms":10.534945701357465,"prompt_per_second":94.92217884627031,"predicted_n":51,"predicted_ms":3018.341,"predicted_per_token_ms":59.18315686274509,"predicted_per_second":16.896699213243302}}
srv operator(): http: stream ended
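One detail in the log above may be relevant: in the generated grammar, the rule for the enum reference reads `ref-defs-TagsEnum ::= object`, and `response-format-tags-item` points at it. That would allow any JSON object as an array item instead of one of the three strings, which matches the observed output, though it is not confirmed as the cause. Below is a small sketch for pulling individual rules out of a grammar string for inspection. The helper `grammar_rules` is illustrative and assumes one rule per line, as in this log; the excerpt is copied from the `grammar` field with JSON escaping removed.

```python
def grammar_rules(grammar: str) -> dict[str, str]:
    """Map each rule name to its body, assuming one 'name ::= body' rule per line."""
    rules = {}
    for line in grammar.splitlines():
        name, sep, body = line.partition("::=")
        if sep:  # skip lines that do not define a rule
            rules[name.strip()] = body.strip()
    return rules


# Two rules copied from the "grammar" field of the verbose response above.
excerpt = "\n".join([
    "ref-defs-TagsEnum ::= object",
    "response-format-tags-item ::= ref-defs-TagsEnum",
])

rules = grammar_rules(excerpt)
print(rules["response-format-tags-item"])  # ref-defs-TagsEnum
print(rules["ref-defs-TagsEnum"])          # object
```

Running the same helper on the grammar produced for Qwen3 would show whether the two models end up with different rules for the same schema.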