bug(llm): ContextManagement serialization format wrong for Claude server compaction API #1705

@bug-ops

Description

The --server-compaction flag causes a 400 Bad Request from the Claude API:

{"type":"error","error":{"type":"invalid_request_error","message":"context_management.type: Extra inputs are not permitted"},"request_id":"..."}

Root Cause

The ContextManagement struct in crates/zeph-llm/src/claude.rs (PR #1696) serializes as:

{
  "type": "enabled",
  "trigger_tokens": 50000
}

But the Claude API expects:

{
  "type": "auto_truncate",
  "trigger": {
    "type": "input_tokens",
    "value": 50000
  },
  "pause_after_compaction": false
}

Two problems:

  1. ContextManagementType::Enabled serializes as "enabled" (should be "auto_truncate")
  2. trigger_tokens: u32 is a flat field (should be nested trigger: { type: "input_tokens", value: N })

Reproduction

cargo run --features full -- --config /tmp/zeph-claude-sc-test.toml --server-compaction <<< "hi"

This happens with any Claude provider config. The request returns 400 immediately.

Fix

Restructure ContextManagement:

#[derive(Serialize)]
struct ContextManagement {
    #[serde(rename = "type")]
    kind: &'static str,  // "auto_truncate"
    trigger: ContextManagementTrigger,
    pause_after_compaction: bool,
}

#[derive(Serialize)]
struct ContextManagementTrigger {
    #[serde(rename = "type")]
    kind: &'static str,  // "input_tokens"
    value: u32,
}
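
As a rough check of the target shape, the sketch below builds the nested payload from the old flat token count and renders it as compact JSON. It is deliberately dependency-free so it can be run standalone: the hand-written `to_json` method stands in for the `#[derive(Serialize)]` shown above, and both `auto_truncate` and `to_json` are hypothetical helpers, not part of the crate or of PR #1696.

```rust
// Standalone sketch of the corrected wire shape. Hand-rolled JSON stands in
// for the serde derive used by the structs proposed in this issue.

/// Nested trigger object: {"type":"input_tokens","value":N}.
struct ContextManagementTrigger {
    kind: &'static str,
    value: u32,
}

/// Top-level `context_management` object.
struct ContextManagement {
    kind: &'static str,
    trigger: ContextManagementTrigger,
    pause_after_compaction: bool,
}

impl ContextManagement {
    /// Hypothetical constructor: maps the old flat `trigger_tokens`
    /// value onto the nested layout.
    fn auto_truncate(trigger_tokens: u32) -> Self {
        Self {
            kind: "auto_truncate",
            trigger: ContextManagementTrigger {
                kind: "input_tokens",
                value: trigger_tokens,
            },
            pause_after_compaction: false,
        }
    }

    /// Hypothetical helper: renders the payload as compact JSON by hand.
    fn to_json(&self) -> String {
        format!(
            r#"{{"type":"{}","trigger":{{"type":"{}","value":{}}},"pause_after_compaction":{}}}"#,
            self.kind, self.trigger.kind, self.trigger.value, self.pause_after_compaction
        )
    }
}

fn main() {
    let cm = ContextManagement::auto_truncate(50_000);
    // Prints the nested object with no flat `trigger_tokens` field.
    println!("{}", cm.to_json());
}
```

A regression test in the crate could make the same comparison against the output of the real serde serializer, so that a future change to the field layout fails in CI rather than at the API.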

Severity

High: --server-compaction is completely non-functional because of this format mismatch. Every request sent with the flag enabled fails with 400.

Metadata

Labels: bug (Something isn't working), llm (zeph-llm crate (Ollama, Claude))