Add optional Anthropic context editing support for Claude models by aydnOktay · Pull Request #528 · NousResearch/hermes-agent

aydnOktay · 2026-03-06T11:42:39Z

This PR implements a first-phase integration of Anthropic’s server-side Context Editing API for Claude models. It conditionally attaches a context_management.edits block to the Messages request body and, for direct Anthropic endpoints, automatically opts into the beta via the anthropic-beta: context-management-2025-06-27 header. The feature is fully opt-in and is controlled by a new context_editing section in the CLI config plus CONTEXT_EDITING_* environment variables. These configure two edits (clear_thinking_20251015 and clear_tool_uses_20250919) with conservative defaults derived from the model’s context window, so that old thinking turns and tool use/result pairs are cleared while prompt cache prefixes are preserved. Default behavior stays unchanged for non-Anthropic models, Claude users get a simple switch to enable cache-friendly automatic context cleanup, and the change addresses the design and requirements described in Issue #526.
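As a rough illustration of the conditional attachment described above, the following is a minimal sketch and not the PR's actual code. The function name `attach_context_editing` and its boolean parameters are hypothetical; only the `context_management.edits` key, the two edit type names, and the beta flag come from the description.

```python
from typing import Any

# Beta flag quoted in the PR description.
BETA_FLAG = "context-management-2025-06-27"


def attach_context_editing(
    body: dict[str, Any],
    headers: dict[str, str],
    *,
    enabled: bool,
    is_claude_model: bool,
    is_direct_anthropic: bool,
) -> None:
    """Hypothetical helper: mutate body/headers in place when the feature applies."""
    if not (enabled and is_claude_model):
        return  # opt-in only; non-Anthropic models are left untouched

    # Only the edit type names are taken from the PR text; any further
    # per-edit fields would have to follow Anthropic's documentation.
    body["context_management"] = {
        "edits": [
            {"type": "clear_thinking_20251015"},
            {"type": "clear_tool_uses_20250919"},
        ]
    }

    if is_direct_anthropic:
        # Merge with any beta flags already present instead of overwriting them.
        existing = headers.get("anthropic-beta", "")
        flags = [f.strip() for f in existing.split(",") if f.strip()]
        if BETA_FLAG not in flags:
            flags.append(BETA_FLAG)
        headers["anthropic-beta"] = ",".join(flags)
```

Merging into an existing `anthropic-beta` value rather than replacing it is a design choice of this sketch, so that other beta opt-ins survive.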

aydnOktay · 2026-03-06T21:41:00Z

This addresses issue #526.

Please take a look, @teknium1.

teknium1 · 2026-03-10T06:16:15Z

Thanks for putting this together @aydnOktay — the implementation is solid and well-structured, and this is definitely a feature we want to support (per #526).

However, after reviewing the integration path, we've identified a fundamental issue: Hermes currently uses the OpenAI SDK (openai.OpenAI) for all providers, sending requests to /v1/chat/completions. The context_management parameter is specific to Anthropic's native Messages API (/v1/messages).

This means there's no reliable working path today:

- OpenRouter (our primary path): Their docs list supported Anthropic beta headers, but context-management-2025-06-27 isn't among them. It's unclear whether context_management in extra_body would be forwarded to Anthropic's backend.
- Direct Anthropic: Anthropic does have an OpenAI-compatible endpoint, but it's pretty niche and unlikely to support Anthropic-specific parameters like context_management.
- LiteLLM: Does support context_management passthrough, but that's a niche deployment.

What we'd like to do: We're planning to add native Anthropic API support (using the anthropic SDK directly) to Hermes. Once that's in place, this feature would have a clean, reliable path — sending context_management directly via the Anthropic Messages API where it's natively supported.
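One way to read the assessment above is as a gate on the transport in use. The sketch below is purely illustrative: the `Transport` enum and `should_send_context_management` are hypothetical names, and the set of supported transports simply mirrors the review comment (LiteLLM passthrough and a future native Anthropic client) rather than anything verified independently.

```python
from enum import Enum


class Transport(Enum):
    """Hypothetical labels for the request paths discussed in the review."""
    OPENROUTER = "openrouter"
    ANTHROPIC_OPENAI_COMPAT = "anthropic_openai_compat"
    LITELLM = "litellm"
    ANTHROPIC_NATIVE = "anthropic_native"


# Paths where, per the review, context_management is expected to reach Anthropic.
_SUPPORTED = frozenset({Transport.LITELLM, Transport.ANTHROPIC_NATIVE})


def should_send_context_management(transport: Transport, enabled: bool) -> bool:
    """Send the parameter only when the user opted in and the path can carry it."""
    return enabled and transport in _SUPPORTED
```

Gating this way would let the config switch stay on without sending an unsupported parameter through OpenRouter or the OpenAI-compatible endpoint.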

We're going to leave this PR open. If you're interested in updating it with the assumption that it will target the native Anthropic client (once available), that would be great. Otherwise we'll circle back to it once that foundation is in place.

A few code-level notes for whenever this gets revisited:

- Bug: exclude_tools list serialization. str(["memory", "skill_manage", "todo"]) produces the Python repr ("['memory', 'skill_manage', 'todo']"), but run_agent.py parses it with .split(","). It needs ",".join() instead.
- Missing config key: clear_at_least_tokens is read from an env var but has no corresponding key in the cli.py defaults.
- The unused instance attributes _is_openrouter / _is_nous_portal can be dropped (only _is_anthropic_model is used).
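The serialization bug in the first note can be reproduced in isolation. This is a standalone demonstration of the str() versus ",".join() round trip, not code taken from the PR; the variable names are illustrative.

```python
exclude_tools = ["memory", "skill_manage", "todo"]

# Buggy round trip: str() emits the list's Python repr, so splitting on ","
# leaves brackets, quotes, and leading spaces inside every item.
broken_value = str(exclude_tools)
broken_parsed = broken_value.split(",")

# Fixed round trip: join with the same delimiter the parser splits on.
fixed_value = ",".join(exclude_tools)
fixed_parsed = fixed_value.split(",")

print(broken_parsed)  # ["['memory'", " 'skill_manage'", " 'todo']"]
print(fixed_parsed)   # ['memory', 'skill_manage', 'todo']
```

With the buggy value, none of the parsed items would match a real tool name, so the exclusion list would silently have no effect.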

Thanks again for the contribution — the architecture and config design are good, it just needs the right transport layer underneath.

Integrate Anthropic's server-side context management (beta) for Claude models. When enabled, the API automatically clears old tool use/result pairs and thinking blocks AFTER prompt cache lookup but BEFORE token counting. This preserves prompt cache prefixes while freeing context space, something impossible with client-side stripping.

Implementation:
- anthropic_adapter: add context-management-2025-06-27 to beta headers; build context_management edits in build_anthropic_kwargs() via extra_body; only include clear_thinking edit when reasoning is enabled (API requires it)
- run_agent: pipe context_editing config through AIAgent to the adapter
- cli/gateway: load context_editing config from config.yaml and pass to agent
- config: add context_editing section to DEFAULT_CONFIG with conservative defaults (disabled, auto-scale triggers to 60%/10% of context window, keep 5 tool uses and 2 thinking turns, exclude memory/skill_manage/todo)

Config (opt-in, add to config.yaml):

    context_editing:
      enabled: true
      trigger_tokens: null          # auto: 60% of context window
      keep_tool_uses: 5
      keep_thinking_turns: 2
      exclude_tools: [memory, skill_manage, todo]
      clear_tool_inputs: false
      clear_at_least_tokens: null   # auto: 10% of context window

Live tested with Anthropic API:
- Single turn with context_management: accepted, response normal
- Multi-turn with tool calls + thinking + context_management: works
- clear_thinking correctly omitted when thinking is disabled
- Config plumbing verified through AIAgent._build_api_kwargs()

Refs: #526, supersedes #528
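The defaults in this commit message (60% trigger, 10% minimum clear, clear_thinking only with reasoning enabled) could be derived along the following lines. This is a minimal sketch under stated assumptions: `build_context_edits` is a hypothetical stand-in, not the actual build_anthropic_kwargs(), and the nested field shapes (`trigger`, `keep`, `clear_at_least`) follow my reading of Anthropic's context editing documentation and should be checked against it.

```python
from typing import Any, Optional


def build_context_edits(
    cfg: dict[str, Any],
    context_window: int,
    reasoning_enabled: bool,
) -> Optional[dict[str, Any]]:
    """Hypothetical builder: return a context_management payload, or None if disabled."""
    if not cfg.get("enabled", False):
        return None

    # null/None in the config means "auto": scale from the context window.
    trigger = cfg.get("trigger_tokens")
    if trigger is None:
        trigger = context_window * 60 // 100
    clear_at_least = cfg.get("clear_at_least_tokens")
    if clear_at_least is None:
        clear_at_least = context_window * 10 // 100

    edits: list[dict[str, Any]] = []
    if reasoning_enabled:
        # Omitted when thinking is off, as the commit message notes.
        edits.append({
            "type": "clear_thinking_20251015",
            "keep": {"type": "thinking_turns", "value": cfg.get("keep_thinking_turns", 2)},
        })
    edits.append({
        "type": "clear_tool_uses_20250919",
        "trigger": {"type": "input_tokens", "value": trigger},
        "keep": {"type": "tool_uses", "value": cfg.get("keep_tool_uses", 5)},
        "clear_at_least": {"type": "input_tokens", "value": clear_at_least},
        "exclude_tools": list(cfg.get("exclude_tools", ["memory", "skill_manage", "todo"])),
        "clear_tool_inputs": cfg.get("clear_tool_inputs", False),
    })
    return {"edits": edits}
```

For a 200,000-token context window with the auto settings, this would trigger at 120,000 input tokens and clear at least 20,000. Integer arithmetic is used for the percentages to avoid floating-point rounding.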


Add optional Anthropic context editing support

eb19ed8

aydnOktay mentioned this pull request Mar 6, 2026

Feature: Anthropic Context Editing API Integration — Server-Side, Cache-Friendly Tool/Thinking Cleanup for Claude Models #526

Open

aydnOktay added 2 commits March 6, 2026 14:46

Document context editing and provider routing in CLI config

825fe29

Merge upstream main into feature/anthropic-context-editing

49c4b91

teknium1 mentioned this pull request Mar 13, 2026

feat: add Anthropic Context Editing API support #1147

Merged

teknium1 closed this Mar 13, 2026

EffortlessSteven mentioned this pull request Apr 24, 2026

feat: phase 3 self-correction + handoff docs #1042

Closed

aydnOktay wants to merge 3 commits into NousResearch:main from aydnOktay:feature/anthropic-context-editing
