Eval bug: Tool calling may repeatedly fail under long context when a tool has multiple optional parameters (Qwen3.5-35B, Qwen3-Coder-Next)

### Name and Version

master and the AutoParser branch (#18675)

### Operating systems

Windows

### GGML backends

CUDA, Vulkan, CPU

### Hardware

Ryzen 5900x + NVIDIA 4060Ti 16GB

### Models

- unsloth/Qwen3.5-35B-A3B-GGUF (UD-Q4_K_XL)
- AesSedai/Qwen3.5-35B-A3B-GGUF
- unsloth/Qwen3-Coder-Next-GGUF

### Problem description & steps to reproduce

I haven't tested enough other models to know if this affects them too.

Under relatively long contexts, tool calling can start to break down when a tool has multiple optional parameters. A typical failure mode is repeated or looping tool calls where one parameter is always missing, so the model never successfully completes the intended call.

In my case, this most often showed up with the `read` tool in OpenCode. The `read` tool parameters look like this:

```typescript
parameters: z.object({
  filePath: z.string().describe("The absolute path to the file or directory to read"),
  offset: z.coerce.number().describe("The line number to start reading from (1-indexed)").optional(),
  limit: z.coerce.number().describe("The maximum number of lines to read (defaults to 2000)").optional(),
})
```

Observed behavior:
At the beginning of a session, tool calling usually works normally. However, once the context becomes long enough, the model may start failing in a very specific way:

- it repeatedly calls the same tool
- the call is almost correct, but one optional parameter is missing
- after noticing the mistake, it retries
- the retry misses a different parameter or still omits one

For me, this starts to become noticeable at around 30k tokens of context, which is about 20% of the available context window in my setup.
Sometimes the model realizes it's stuck in a loop and switches to using `sed` to read the file, breaking the cycle on its own.
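
For anyone trying to spot this loop automatically in a session log, the failure mode above can be sketched as a small check. This is only an illustration: the `ToolCall` shape and the `missingParams` / `looksStuck` helpers are hypothetical names, not part of OpenCode or llama.cpp, and the sample calls mirror the failing transcript in the log output section.

```typescript
// Hypothetical shape of one logged tool call; not an OpenCode or llama.cpp type.
type ToolCall = { name: string; args: Record<string, unknown> };

// Return the intended parameters that a call left out.
function missingParams(call: ToolCall, intended: string[]): string[] {
  return intended.filter((p) => !(p in call.args));
}

// Flag a loop when the last `window` calls all target `toolName` and
// every one of them omits at least one intended parameter.
function looksStuck(
  calls: ToolCall[],
  toolName: string,
  intended: string[],
  window = 3,
): boolean {
  if (calls.length < window) return false;
  const recent = calls.slice(-window);
  return recent.every(
    (c) => c.name === toolName && missingParams(c, intended).length > 0,
  );
}

// Calls mirroring the failing transcript, from the third request onward.
const failing: ToolCall[] = [
  { name: "read", args: { filePath: "index.html", limit: 100 } },
  { name: "read", args: { filePath: "game.ts", limit: 100 } },
  { name: "read", args: { filePath: "game.ts", offset: 500 } },
  { name: "read", args: { filePath: "game.ts", limit: 100 } },
];

console.log(looksStuck(failing, "read", ["offset", "limit"])); // true
```

The window of three is an arbitrary choice; a single call with a missing optional parameter is normal and should not be flagged.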

At first I assumed this was caused by one of the following:

- a bug in OpenCode's read tool itself
- quantization sensitivity
- a bad llama-server runtime configuration on my side

Because of that, I tried a number of different things:

- different quants
- different model publishers
- changing repetition penalty and related sampling/runtime parameters
- building llama.cpp myself
- testing the AutoParser branch (#18675)
- switching to Cline and Roo
- switching to a third-party filesystem MCP with a similar parameter shape to OpenCode's read tool

None of those reliably fixed the issue.

Around that time, I came across a Reddit post describing what appears to be the same problem:
https://www.reddit.com/r/LocalLLaMA/comments/1r6h7g4/qwen3_coder_next_looping_and_opencode/

It matched my own experience, though I still assumed the root cause might be quantization sensitivity. The author mentioned that after removing all optional parameters from the read tool, the model started working normally.

That hint got me experimenting further. I accidentally stumbled on a stronger clue: when I changed the `offset` parameter from optional to required, the problem disappeared entirely and the model stopped getting stuck in broken tool-calling loops.
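
To make the optional-to-required change concrete, here is a rough sketch of it at the JSON Schema level. It assumes the zod schema is converted to JSON Schema in the usual way, so that `.optional()` fields are simply left out of `required`. The `ParamSchema` type and `makeRequired` helper are hypothetical, and this is a client-side workaround, not the local llama.cpp change mentioned in this report.

```typescript
// Minimal JSON Schema shape for a tool's parameters (only the fields used here).
type ParamSchema = {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required: string[];
};

// Assumed JSON Schema equivalent of the `read` tool parameters:
// the two optional fields are absent from `required`.
const readParams: ParamSchema = {
  type: "object",
  properties: {
    filePath: { type: "string" },
    offset: { type: "number" },
    limit: { type: "number" },
  },
  required: ["filePath"],
};

// Return a copy of the schema with `name` added to `required`.
function makeRequired(schema: ParamSchema, name: string): ParamSchema {
  if (!(name in schema.properties)) {
    throw new Error(`unknown parameter: ${name}`);
  }
  const alreadyRequired = schema.required.some((r) => r === name);
  const required = alreadyRequired
    ? [...schema.required]
    : [...schema.required, name];
  return { ...schema, required };
}

const patched = makeRequired(readParams, "offset");
console.log(patched.required); // now lists filePath and offset
```

With `offset` required, the model has to supply a value on every call, which in my testing was enough to stop the loop.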

Motivated by this, I asked an AI assistant to help investigate whether the same fix could be achieved while keeping the parameter optional. After making some code changes based on the `AutoParser` branch, the issue appears to be resolved in my local testing.

I am not submitting a PR, and I do not want to paste AI-generated code here. I am only sharing the clues I was able to verify, in case they help identify the underlying cause.


### First Bad Commit

_No response_

### Relevant log output

Failing behavior (master and AutoParser branch)

> User:
> 
> Now execute my request in order:
> 
> 1. Use the read tool, but do not pass the offset parameter
> 2. Pass the offset parameter, but do not pass the limit parameter
> 3. Pass both parameters, with offset set to 500 and limit set to 100
> 4. Pass both parameters, but pass limit first and then offset
> 
> Assistant:
> 
> I'll execute your requests in order to test the read tool's parameter handling.
> 
> [Read operation] - index.html
> 
> Good, request 1 completed successfully.
> 
> [Read operation] - index.html (offset=10)
> 
> Good, request 2 completed.
> 
> [Read operation] - index.html (limit=100)
> 
> I notice I made an error - I only passed limit but forgot offset. Let me retry with both parameters correctly.
> 
> [Read operation] - game.ts (limit=100)
> 
> I still haven't included the offset parameter. Let me try again with both offset=500 and limit=100.
> 
> [Read operation] - game.ts (offset=500)
> 
> I notice the tool didn't include the limit parameter. Let me try again.
> 
> [Read operation] - game.ts (limit=100)
> 
> ...and it keeps looping.

Behavior after local fix

> User:
> 
> Now execute my request in order:
> 
> 1. Use the read tool, but do not pass the offset parameter
> 2. Pass the offset parameter, but do not pass the limit parameter
> 3. Pass both parameters, with offset set to 500 and limit set to 100
> 4. Pass both parameters, but pass limit first and then offset
> 
> Assistant:
> 
> [Read operation] - index.html
> 
> Step 1 completed.
> 
> [Read operation] - index.html (offset=26)
> 
> Step 2 completed.
> 
> [Read operation] - index.html (offset=500, limit=100)
> 
> Step 3 completed - correctly handled out-of-range offset.
> 
> [Read operation] - index.html (limit=50, offset=30)
> 
> All 4 read operations completed successfully.
> 
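
The four-step prompt used in both transcripts can also be checked mechanically. The sketch below compares the optional arguments of each call against what the prompt asked for, including the order in step 4. The helper names are hypothetical, and it assumes the client keeps argument keys in the order the model emitted them.

```typescript
type Args = Record<string, unknown>;

// Optional argument names, in the order they appear in the call.
function optionalKeys(args: Args): string[] {
  return Object.keys(args).filter((k) => k !== "filePath");
}

// Optional arguments expected for each of the four requests in the prompt.
const expected: string[][] = [
  [],
  ["offset"],
  ["offset", "limit"],
  ["limit", "offset"],
];

// Zero-based index of the first request whose call does not match, or -1.
function firstFailingStep(calls: Args[]): number {
  return expected.findIndex((want, i) => {
    const call = calls[i];
    if (call === undefined) return true; // request was never attempted
    const got = optionalKeys(call);
    return got.length !== want.length || got.some((k, j) => k !== want[j]);
  });
}

// First three calls from the failing transcript.
const failingRun: Args[] = [
  { filePath: "index.html" },
  { filePath: "index.html", offset: 10 },
  { filePath: "index.html", limit: 100 },
];

// The four calls from the transcript after the local fix.
const fixedRun: Args[] = [
  { filePath: "index.html" },
  { filePath: "index.html", offset: 26 },
  { filePath: "index.html", offset: 500, limit: 100 },
  { filePath: "index.html", limit: 50, offset: 30 },
];

console.log(firstFailingStep(failingRun)); // 2 (the third request)
console.log(firstFailingStep(fixedRun)); // -1 (all four match)
```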
