fix(llm): graceful degradation when compact-2026-01-12 beta header is rejected#1701

Merged
bug-ops merged 5 commits into main from fix-llm-graceful-degradation-w
Mar 13, 2026

@bug-ops bug-ops commented Mar 13, 2026

Summary

  • Adds LlmError::BetaHeaderRejected variant returned when the Claude API rejects the compact-2026-01-12 beta header with a 400 (unknown/invalid beta)
  • ClaudeProvider sets a shared Arc<AtomicBool> (server_compaction_rejected) flag on detection; subsequent requests automatically omit the header and context_management field
  • Native tool-use retry loop (call_chat_with_tools_retry) catches BetaHeaderRejected, disables server_compaction_active on the agent, and retries the turn with client-side compaction, so the user loses at most one turn
  • Detection covers all request paths: send_request, send_stream_request, chat_with_tools, chat_typed
  • The Arc ensures all ClaudeProvider clones (e.g. router replicas) observe the rejection immediately
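
A minimal sketch of the shared-flag behaviour described in the bullets above. `ProviderSketch`, `mark_rejected`, and `beta_header` are hypothetical names chosen for illustration, and the `Relaxed` ordering is an assumption; the real `ClaudeProvider` may differ. Only the `server_compaction_rejected` field name and the header value come from this PR.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Beta header value named in this PR.
const COMPACT_BETA: &str = "compact-2026-01-12";

/// Hypothetical stand-in for `ClaudeProvider`; only the flag handling is modeled.
#[derive(Clone)]
struct ProviderSketch {
    server_compaction: bool,
    // Cloning the provider clones the Arc pointer, so every clone sees the same flag.
    server_compaction_rejected: Arc<AtomicBool>,
}

impl ProviderSketch {
    fn new(server_compaction: bool) -> Self {
        Self {
            server_compaction,
            server_compaction_rejected: Arc::new(AtomicBool::new(false)),
        }
    }

    /// Record that the API rejected the beta header (assumed helper name).
    fn mark_rejected(&self) {
        self.server_compaction_rejected.store(true, Ordering::Relaxed);
    }

    /// Returns the beta header to send, or `None` once a rejection was recorded.
    fn beta_header(&self) -> Option<&'static str> {
        let rejected = self.server_compaction_rejected.load(Ordering::Relaxed);
        (self.server_compaction && !rejected).then_some(COMPACT_BETA)
    }
}
```

With this shape, calling `mark_rejected` on one clone makes `beta_header` return `None` on every other clone, which is the property the `clone_shares_rejection_flag` test name suggests.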

Fixes #1698. Follow-up to #1696 (SEC-COMPACT-03).

Test plan

  • New unit tests in zeph-llm: is_compact_beta_rejection_*, beta_header_excluded_after_rejection, context_management_absent_after_rejection, clone_shares_rejection_flag
  • New unit tests in zeph-core::agent::error: agent_error_detects_beta_header_rejected
  • New unit tests in zeph-llm::error: beta_header_rejected_is_detected, beta_header_rejected_display
  • All 5310 existing tests pass

bug-ops added 3 commits March 13, 2026 21:58
- Add `ContextManagement` request field and `compact-2026-01-12` beta header
  to all Claude request bodies when `server_compaction = true`
- Make SSE parser stateful with `ClaudeSseState` (via async-stream) to
  accumulate multi-event `compaction` content blocks
- Add `StreamChunk::Compaction` and `MessagePart::Compaction` variants for
  round-trip fidelity; `take_compaction_summary()` added to `LlmProvider` trait
- C2: agent loop (native + legacy streaming) prunes old messages and inserts
  synthetic `MessagePart::Compaction` assistant turn on compaction event
- S1: `maybe_compact` and `maybe_proactive_compress` early-return when
  `server_compaction_active` to avoid duplicate compaction
- Add `server_compaction_events` metric, `--server-compaction` CLI flag,
  `--init` wizard prompt, and `# server_compaction = false` in default.toml
- Extend `debug_dump`, `token_counter`, and `MetricsSnapshot` for new variant

Closes #1626
…eview

- CRIT-1: add 95% safety fallback in maybe_compact/maybe_proactive_compress;
  client-side compaction no longer unconditionally skipped when server
  compaction is active — fires only when tokens exceed 95% of budget
- SEC-COMPACT-01: sanitize compaction summary via ContentSanitizer
  (McpResponse/ExternalUntrusted) before inserting into context in
  both native and legacy tool execution paths
- SEC-COMPACT-02: cap SSE compaction accumulation buffer at 32 KiB;
  excess bytes discarded with a warning log
- IMP-3: wire server_compaction through ACP path — add server_compaction
  field to SharedAgentDeps, initialize from config, pass to agent builder
- IMP-4: TUI integration — [SC: N] status bar indicator when
  server_compaction_events > 0; Compacting context (server-side)...
  spinner in native and legacy paths; /compaction:status command palette
  entry with ServerCompactionStatus TuiCommand
- IMP-5 (already fixed in prior commit): context_window * 80 / 100
  integer arithmetic preserved
- Add ~25 unit tests across sse.rs, claude.rs, provider.rs, config/tests.rs
  covering compaction sequence, 32 KiB cap, serialization, beta header,
  take_compaction_summary, and TOML config parsing
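
Two of the review fixes above (CRIT-1 and SEC-COMPACT-02) can be sketched as follows. This is an illustration under assumptions, not the code in the commit: `exceeds_safety_threshold` and `CompactionBuffer` are hypothetical names, the integer form `budget * 95 / 100` mirrors the `context_window * 80 / 100` style mentioned in IMP-5, and `eprintln!` stands in for whatever warning log the crate uses.

```rust
/// Cap on accumulated compaction text, per SEC-COMPACT-02.
const MAX_COMPACTION_BYTES: usize = 32 * 1024;

/// CRIT-1 sketch: client-side compaction fires only above 95% of the budget.
fn exceeds_safety_threshold(tokens: usize, budget: usize) -> bool {
    tokens > budget * 95 / 100
}

/// SEC-COMPACT-02 sketch: accumulates SSE compaction deltas up to the cap.
#[derive(Default)]
struct CompactionBuffer {
    text: String,
    truncated: bool,
}

impl CompactionBuffer {
    fn push(&mut self, delta: &str) {
        let remaining = MAX_COMPACTION_BYTES.saturating_sub(self.text.len());
        if delta.len() <= remaining {
            self.text.push_str(delta);
            return;
        }
        // Back up to a char boundary so the slice below cannot panic.
        let mut cut = remaining;
        while !delta.is_char_boundary(cut) {
            cut -= 1;
        }
        self.text.push_str(&delta[..cut]);
        if !self.truncated {
            self.truncated = true;
            // Stand-in for a warning log; excess bytes are discarded.
            eprintln!("compaction summary exceeded {} bytes", MAX_COMPACTION_BYTES);
        }
    }
}
```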
… rejected

When the Claude API rejects the compact-2026-01-12 beta header (deprecated
or removed), ClaudeProvider now detects the 400 response, sets a shared
Arc<AtomicBool> flag, and returns LlmError::BetaHeaderRejected instead of
entering an unrecoverable error loop.
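
The detection step might look roughly like the sketch below. The function name `is_compact_beta_rejection` appears in this PR, but its signature and the matching rule here (HTTP 400 plus the beta name in the response body) are assumptions; `LlmErrorSketch` and `classify` are hypothetical and only mirror the `BetaHeaderRejected` variant.

```rust
/// Beta header value named in this PR.
const COMPACT_BETA: &str = "compact-2026-01-12";

/// Hypothetical, reduced version of `LlmError`.
#[derive(Debug, PartialEq)]
enum LlmErrorSketch {
    BetaHeaderRejected { header: String },
    Api { status: u16, body: String },
}

/// Assumed rule: a 400 whose body mentions the beta name is a rejection.
fn is_compact_beta_rejection(status: u16, body: &str) -> bool {
    status == 400 && body.contains(COMPACT_BETA)
}

/// Map an error response to a typed error so callers can branch on it.
fn classify(status: u16, body: &str) -> LlmErrorSketch {
    if is_compact_beta_rejection(status, body) {
        LlmErrorSketch::BetaHeaderRejected { header: COMPACT_BETA.to_owned() }
    } else {
        LlmErrorSketch::Api { status, body: body.to_owned() }
    }
}
```

Returning a dedicated variant instead of a generic API error is what lets the caller react once and move on rather than resending the same rejected header.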

Subsequent requests automatically omit the beta header and context_management
field. The native tool-use retry loop (call_chat_with_tools_retry) catches
BetaHeaderRejected, disables server_compaction_active on the agent, and
retries the turn with client-side compaction — losing at most one turn.
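
The retry behaviour in the paragraph above can be sketched like this. `AgentSketch`, `TurnError`, and `run_turn` are hypothetical names; the real `call_chat_with_tools_retry` is async and handles more cases. The closure stands in for the provider call and receives whether server-side compaction is requested.

```rust
/// Hypothetical, reduced error type for one agent turn.
#[derive(Debug, PartialEq)]
enum TurnError {
    BetaHeaderRejected,
    Other(String),
}

/// Hypothetical stand-in for the agent; only the compaction flag is modeled.
struct AgentSketch {
    server_compaction_active: bool,
}

impl AgentSketch {
    /// Runs one turn; on a beta-header rejection, switches to client-side
    /// compaction and retries exactly once.
    fn run_turn<F>(&mut self, mut call: F) -> Result<String, TurnError>
    where
        F: FnMut(bool) -> Result<String, TurnError>,
    {
        match call(self.server_compaction_active) {
            Err(TurnError::BetaHeaderRejected) if self.server_compaction_active => {
                self.server_compaction_active = false;
                call(false)
            }
            other => other,
        }
    }
}
```

Because the flag stays off after the first rejection, later turns skip the failed attempt entirely, which matches the "at most one turn" claim.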

Detection covers all request paths: send_request, send_stream_request,
chat_with_tools, and chat_typed (schema feature). The Arc ensures all
ClaudeProvider clones (e.g. router replicas) observe the rejection
immediately.

Closes #1698 (SEC-COMPACT-03)
@github-actions bot added the documentation, llm, rust, core, bug, and size/L labels Mar 13, 2026
Base automatically changed from research-llm-claude-server-side to main March 13, 2026 22:13
@github-actions bot added the memory, dependencies, config, and size/XL labels and removed the size/L label Mar 13, 2026
bug-ops added 2 commits March 13, 2026 23:14
Resolve conflicts in CHANGELOG.md, config/default.toml,
crates/zeph-core/src/bootstrap/tests.rs, crates/zeph-core/src/config/types.rs,
crates/zeph-llm/src/claude.rs, and src/init.rs.

Both sets of changes are preserved:
- server_compaction + graceful degradation (this branch)
- enable_extended_context / COV-03 test (main)
…on-w

Resolve conflict in crates/zeph-llm/src/claude.rs after PR #1696
(Claude server-side compaction) was merged into main.

Our graceful-degradation additions (server_compaction_rejected Arc<AtomicBool>,
is_compact_beta_rejection, is_server_compaction_rejected, detection in all
request paths, SEC-COMPACT-03 tests) are preserved on top of main's state.
@github-actions bot added the size/L label and removed the memory, dependencies, config, and size/XL labels Mar 13, 2026
@bug-ops bug-ops enabled auto-merge (squash) March 13, 2026 22:22
@bug-ops bug-ops merged commit df48b32 into main Mar 13, 2026
15 checks passed
@bug-ops bug-ops deleted the fix-llm-graceful-degradation-w branch March 13, 2026 22:29