
compaction + caches: add Anthropic native compaction and mixed cache TTLs #65288

Closed
100yenadmin wants to merge 9 commits into openclaw:main from electricsheephq:codex/anthropic-native-compaction

Conversation

@100yenadmin
Contributor

@100yenadmin 100yenadmin commented Apr 12, 2026

Summary

Closes #65287.

This adds Anthropic-native compaction support and a mixed Anthropic cache-retention policy that uses 1h retention for the stable system/tool/workspace prefix and 5m retention for high-churn conversation content. It also round-trips Anthropic compaction blocks through request history and streaming so provider-native active-context compaction can manage the live Claude prompt without giving up durable searchable memory in lossless-claw.

What Changed

  • Added Anthropic run/model params:
    • anthropicServerCompaction
    • anthropicCompactThreshold
    • anthropicCompactPauseAfter
    • anthropicCompactInstructions
  • Injected context_management.edits with compact_20260112 when Anthropic native compaction is enabled.
  • Added the required Anthropic beta features only when native compaction is active or compaction blocks must be round-tripped.
  • Extended Anthropic history conversion and streaming to preserve compaction blocks.
  • Updated Anthropic cache policy so long retention keeps the stable prefix at 1h while trailing conversation content remains 5m/ephemeral.
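
As a rough illustration of the injection step listed above, the sketch below merges a `compact_20260112` edit into `context_management.edits` only when compaction is enabled. The option names come from the list above; the helper name `injectCompactionEdit` and the edit fields `trigger`, `pause_after_compaction`, and `instructions` are assumptions made for illustration and are not confirmed by this PR.

```typescript
// Option names mirror the run/model params listed in this PR description.
interface CompactionOptions {
  anthropicServerCompaction?: boolean;
  anthropicCompactThreshold?: number;
  anthropicCompactPauseAfter?: boolean;
  anthropicCompactInstructions?: string;
}

// Assumed wire shape of the edit; only the `type` value is taken from the PR text.
interface CompactEdit {
  type: "compact_20260112";
  trigger?: { type: "input_tokens"; value: number };
  pause_after_compaction?: boolean;
  instructions?: string;
}

interface RequestPayload {
  model: string;
  context_management?: { edits: CompactEdit[] };
  [key: string]: unknown;
}

// Hypothetical helper: returns a new payload and leaves the input untouched.
function injectCompactionEdit(
  payload: RequestPayload,
  opts: CompactionOptions,
): RequestPayload {
  if (!opts.anthropicServerCompaction) {
    return payload; // opt-in only: nothing is injected by default
  }
  const edit: CompactEdit = { type: "compact_20260112" };
  if (opts.anthropicCompactThreshold !== undefined) {
    edit.trigger = { type: "input_tokens", value: opts.anthropicCompactThreshold };
  }
  if (opts.anthropicCompactPauseAfter !== undefined) {
    edit.pause_after_compaction = opts.anthropicCompactPauseAfter;
  }
  if (opts.anthropicCompactInstructions) {
    edit.instructions = opts.anthropicCompactInstructions;
  }
  // Replace any previous compaction edit so the request never carries two.
  const existing = payload.context_management?.edits ?? [];
  const edits = existing.filter((e) => e.type !== edit.type).concat(edit);
  return { ...payload, context_management: { edits } };
}
```

Returning a fresh payload rather than mutating keeps the helper safe to call from retry paths, where the same base payload may be reused.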

Why

  • Anthropic prompt caching is exact-prefix, so local prompt rewrites can burn a hot conversation cache.
  • Long-running Claude coding sessions benefit from a stable 1h cached prefix, but paying 1h write cost for fast-changing conversation blocks is wasteful.
  • This pairs well with the lossless-claw deferred/background compaction work: provider-native compaction manages the active prompt, while lossless-claw stays the searchable lossless sidecar.
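
The mixed-TTL idea can be sketched as follows: the last block of the stable system prefix gets a `cache_control` marker with a 1h TTL, while the trailing conversation block gets a plain ephemeral marker, which uses the 5m default. The function name `applyMixedCacheTtl` and the simplified block types are assumptions for illustration, not the PR's actual code.

```typescript
type CacheControl = { type: "ephemeral"; ttl?: "5m" | "1h" };

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: TextBlock[];
}

// Hypothetical helper: tags cache breakpoints without mutating its inputs.
function applyMixedCacheTtl(
  system: TextBlock[],
  messages: ChatMessage[],
  longRetention: boolean,
): { system: TextBlock[]; messages: ChatMessage[] } {
  // Stable prefix: 1h when long retention is requested, plain ephemeral otherwise.
  const systemControl: CacheControl = longRetention
    ? { type: "ephemeral", ttl: "1h" }
    : { type: "ephemeral" };
  const taggedSystem = system.map((block, i): TextBlock =>
    i === system.length - 1 ? { ...block, cache_control: systemControl } : block,
  );

  // High-churn tail: always plain ephemeral, so no 1h write cost is paid here.
  const taggedMessages = messages.map((message, mi): ChatMessage => {
    if (mi !== messages.length - 1 || message.content.length === 0) {
      return message;
    }
    const lastIndex = message.content.length - 1;
    const content = message.content.map((block, bi): TextBlock =>
      bi === lastIndex ? { ...block, cache_control: { type: "ephemeral" } } : block,
    );
    return { ...message, content };
  });

  return { system: taggedSystem, messages: taggedMessages };
}
```

Because caching is exact-prefix, the longer-lived breakpoint has to sit before the shorter-lived one in the request, which is why the 1h marker goes on the system prefix and never on the tail.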

Validation

  • pnpm test -- --run src/agents/anthropic-payload-policy.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts passed
  • NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit reports only the pre-existing unrelated Telegram AbortSignal baseline in extensions/telegram/src/bot.ts

Related Work

Copilot AI review requested due to automatic review settings April 12, 2026 09:57
@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: L labels Apr 12, 2026
@100yenadmin
Contributor Author

This PR is intentionally the provider-native complement to the no-stall/background-maintenance work in #65233 and the lossless-claw sidecar changes in Martian-Engineering/lossless-claw#408.

The goal is a hybrid model:

  • provider-native Anthropic compaction manages the live Claude prompt
  • mixed cache TTLs keep the stable prefix hot without paying 1-hour write cost for volatile conversation content
  • lossless-claw remains the searchable lossless sidecar instead of rewriting the active prompt on every turn

Validation on this branch:

  • pnpm test -- --run src/agents/anthropic-payload-policy.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts passed
  • NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit reports only the pre-existing unrelated Telegram AbortSignal baseline in extensions/telegram/src/bot.ts

Contributor

Copilot AI left a comment

Pull request overview

Adds Anthropic provider-native “context management” compaction support and updates Anthropic prompt-cache tagging to use mixed TTL behavior (stable system prefix kept warm longer while high-churn message content stays short-lived), while preserving compaction blocks through history + streaming.

Changes:

  • Exposes new Anthropic compaction knobs via embedded runner extra params and threads them into transport options.
  • Extends Anthropic transport request/response handling to preserve compaction content blocks (history + streaming) and to inject context_management.edits when enabled.
  • Splits Anthropic cache-control behavior into system vs trailing-message tagging to support mixed cache retention.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/agents/pi-embedded-runner/extra-params.ts | Adds extra params for Anthropic native compaction and forwards them into stream options. |
| src/agents/pi-embedded-runner-extraparams.test.ts | Adds test coverage ensuring new Anthropic compaction options pass through for direct Anthropic models. |
| src/agents/anthropic-transport-stream.ts | Implements compaction config resolution, required beta headers, compaction block round-tripping, and mixed cache TTL behavior integration. |
| src/agents/anthropic-transport-stream.test.ts | Adds tests for mixed TTL tagging + compaction edit injection, and for compaction block round-tripping. |
| src/agents/anthropic-payload-policy.ts | Introduces system vs message cache control, compaction edit injection helpers, and beta feature resolution. |
| src/agents/anthropic-payload-policy.test.ts | Updates cache-control expectations and adds tests for compaction edit injection + beta feature selection. |

Comment thread src/agents/anthropic-payload-policy.ts
Comment thread src/agents/anthropic-transport-stream.ts Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89c0b06fba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/anthropic-transport-stream.ts
@greptile-apps
Contributor

greptile-apps Bot commented Apr 12, 2026

Greptile Summary

This PR adds Anthropic-native context compaction support and a mixed cache-TTL policy: the stable system/tool prefix gets a 1h TTL on eligible endpoints (api.anthropic.com, Vertex AI) while trailing conversation messages keep plain ephemeral. Compaction blocks are round-tripped through history conversion and streaming so Anthropic's active-context compaction can run without discarding prior context.

  • The GitHub Copilot transport path computes betaFeatures (including compaction betas when history contains compaction blocks), but then builds a separate copilotBetas array from scratch and sends that instead. Any session where a prior Anthropic turn produced a compaction block would cause the Copilot API to receive type: "compaction" messages without the required compact-2026-01-12 beta header, resulting in an API error.

Confidence Score: 4/5

Safe to merge for pure Anthropic sessions; one edge case needs attention before mixed-provider sessions with compaction history are possible.

The core implementation — mixed cache TTLs, compaction config injection, beta feature gating, and streaming round-trip — is correct and well-tested. One P2 finding (Copilot path ignoring compaction betas when history has compaction blocks) would produce an API error in a cross-provider scenario that this PR makes reachable for the first time, warranting a score of 4 rather than 5.

src/agents/anthropic-transport-stream.ts — specifically the createAnthropicTransportClient GitHub Copilot branch and the resolveAnthropicCompactionThreshold return type.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/anthropic-transport-stream.ts
Line: 514-538

Comment:
**Compaction beta headers silently dropped for GitHub Copilot when history contains compaction blocks**

`betaFeatures` is computed at lines 503–509 using `hasCompactionBlocks(context.messages)`, which can return `true` if the conversation history carries compaction blocks from a previous Anthropic session. However, the Copilot branch ignores `betaFeatures` entirely and builds `copilotBetas` from scratch (only including `interleaved-thinking-2025-05-14`). If compaction blocks are present, the Copilot API receives messages with `type: "compaction"` content blocks but without `compact-2026-01-12` in the `anthropic-beta` header, which will cause an API error.

The auto-enable guard in `shouldEnableAnthropicServerCompaction` already blocks compaction from being injected for non-Anthropic providers, but `hasCompactionBlocks` independently triggers the beta requirement regardless of provider. A simple fix is to skip the compaction betas for the Copilot path, or filter compaction blocks from messages before conversion when the provider doesn't support them.

How can I resolve this? If you propose a fix, please make it concise.
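
One way to picture the second option mentioned in the comment above is a capability-gated preparation step: transports that support compaction keep the blocks and get the beta, while all others have the blocks stripped and the beta suppressed. `hasCompactionBlocks` and the `compact-2026-01-12` beta name appear in this thread; `prepareHistory`, the `supportsCompaction` flag, and the simplified message types are hypothetical. Dropping messages that become empty is a simplification made here, not something the thread specifies.

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "compaction"; content: string | null };

interface HistoryMessage {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}

const COMPACTION_BETA = "compact-2026-01-12";

function hasCompactionBlocks(messages: HistoryMessage[]): boolean {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((b) => b.type === "compaction"),
  );
}

// Hypothetical helper: decides betas and history together so they cannot drift apart.
function prepareHistory(
  messages: HistoryMessage[],
  supportsCompaction: boolean,
  baseBetas: string[],
): { messages: HistoryMessage[]; betas: string[] } {
  if (supportsCompaction) {
    // Replay compaction blocks and make sure the matching beta is present once.
    const betas = hasCompactionBlocks(messages)
      ? Array.from(new Set(baseBetas.concat(COMPACTION_BETA)))
      : baseBetas.slice();
    return { messages, betas };
  }
  // Unsupported transport: strip compaction blocks; string content passes through as-is.
  const stripped = messages
    .map((m): HistoryMessage =>
      typeof m.content === "string"
        ? m
        : { ...m, content: m.content.filter((b) => b.type !== "compaction") },
    )
    .filter((m) => typeof m.content === "string" || m.content.length > 0);
  return { messages: stripped, betas: baseBetas.filter((b) => b !== COMPACTION_BETA) };
}
```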

---

This is a comment left during a code review.
Path: src/agents/anthropic-transport-stream.ts
Line: 427-442

Comment:
**Return type is wider than actual behaviour**

`resolveAnthropicCompactionThreshold` is declared as returning `number | undefined`, but every code path returns a `number` (either the user-supplied value, `150_000`, or the computed `contextWindow * 0.75` result). The `| undefined` in the signature doesn't match the implementation and may mislead callers into adding unnecessary nullish checks.

```suggestion
function resolveAnthropicCompactionThreshold(
  model: Pick<AnthropicTransportModel, "contextWindow">,
  requested: number | undefined,
): number {
```

How can I resolve this? If you propose a fix, please make it concise.
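
For context, a function that always returns a `number` from those three sources might look like the sketch below. The precedence shown (explicit request first, then 75% of the context window, then the `150_000` fallback) is an assumption made for illustration; the comment above names the three values but not the order in which they are chosen. The name `resolveCompactionThreshold` is likewise hypothetical.

```typescript
const DEFAULT_COMPACTION_THRESHOLD = 150_000;

// Hypothetical stand-in for the function discussed above; precedence is assumed.
function resolveCompactionThreshold(
  model: { contextWindow?: number },
  requested: number | undefined,
): number {
  // 1. An explicit, sane user-supplied value wins.
  if (typeof requested === "number" && Number.isFinite(requested) && requested > 0) {
    return Math.floor(requested);
  }
  // 2. Otherwise derive the trigger from the model's context window.
  if (typeof model.contextWindow === "number" && model.contextWindow > 0) {
    return Math.floor(model.contextWindow * 0.75);
  }
  // 3. Fall back to a fixed default, so the return type never includes undefined.
  return DEFAULT_COMPACTION_THRESHOLD;
}
```

With every branch returning a number, callers can use the result directly without nullish checks, which is the point of narrowing the declared return type.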

Reviews (1): Last reviewed commit: "Add Anthropic native compaction support" | Re-trigger Greptile

Comment thread src/agents/anthropic-transport-stream.ts
Comment thread src/agents/anthropic-transport-stream.ts
@100yenadmin
Contributor Author

@codex review

Latest head makes Anthropic server compaction opt-in, stops streaming compaction blocks as visible text, and strips compaction blocks from the GitHub Copilot Anthropic path.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/agents/anthropic-transport-stream.ts
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9e6f3584c0

Comment thread src/agents/anthropic-payload-policy.ts
Comment thread src/agents/anthropic-transport-stream.ts
Contributor Author

Pass 2 fixes are up in fd20e73521.

What changed:

  • clamp Anthropic native-compaction triggers to the documented 50_000 token minimum
  • preserve stop_reason: "compaction"
  • make Copilot compaction stripping safe for string assistant content
  • normalize plain-string assistant messages into text blocks in the shared transport transform
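
Two of the fixes above are small enough to sketch directly: clamping the trigger to the 50_000-token minimum, and turning plain-string assistant content into text blocks. The helper names and the simplified block type are hypothetical, and returning an empty array for an empty string is an assumption made here rather than documented behavior.

```typescript
const MIN_COMPACTION_TRIGGER = 50_000;

type AssistantBlock =
  | { type: "text"; text: string }
  | { type: "compaction"; content: string | null };

// Raise any trigger below the documented minimum up to that minimum.
function clampCompactionTrigger(requested: number): number {
  return Math.max(MIN_COMPACTION_TRIGGER, Math.floor(requested));
}

// Normalize assistant content so later transforms only deal with block arrays.
function normalizeAssistantContent(content: string | AssistantBlock[]): AssistantBlock[] {
  if (typeof content !== "string") {
    return content;
  }
  // Assumption: an empty string carries no content, so it yields no blocks.
  return content.length > 0 ? [{ type: "text", text: content }] : [];
}
```

Normalizing first means block-level operations such as compaction stripping never have to special-case string content.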

Validation:

  • pnpm test -- --run src/agents/anthropic-payload-policy.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts
  • passed (116 focused tests)

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fd20e73521

Comment thread src/agents/anthropic-transport-stream.ts
Contributor Author

@codex review

Follow-up fixes are in:

  • Anthropic compaction replay is now gated by actual transport capability instead of replaying on any anthropic-messages provider
  • non-Anthropic transports strip compaction blocks before conversion and suppress compaction/context-management betas
  • added a MiniMax regression covering the non-Anthropic replay path

Focused tests:

  • pnpm test -- --run src/agents/anthropic-transport-stream.test.ts src/agents/anthropic-payload-policy.test.ts

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Can't wait for the next one!

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74a2450fa5

Comment thread src/agents/anthropic-transport-stream.ts
Contributor Author

@codex review

Final follow-up is in:

  • compaction-only Anthropic stops are now treated as paused incomplete turns instead of successful empty replies
  • runner retries once automatically after a server-compaction-only stop
  • focused coverage added for paused-state classification and one-shot continuation retry
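
The paused-turn handling described above can be sketched as a classifier plus a one-shot retry wrapper. The `stop_reason: "compaction"` value comes from this thread; the names `isCompactionOnlyStop` and `runWithCompactionRetry` and the simplified result shape are hypothetical and do not reflect the runner's actual API.

```typescript
interface TurnBlock {
  type: string;
}

interface TurnResult {
  stopReason: string;
  content: TurnBlock[];
}

interface TurnOutcome {
  result: TurnResult;
  retried: boolean;
  paused: boolean; // true when the turn is still incomplete after the retry
}

// A turn that stopped for compaction and produced nothing but compaction blocks
// is a paused, incomplete turn, not a successful empty reply.
function isCompactionOnlyStop(result: TurnResult): boolean {
  return (
    result.stopReason === "compaction" &&
    result.content.every((block) => block.type === "compaction")
  );
}

// Retry exactly once after a compaction-only stop, then report what happened.
async function runWithCompactionRetry(
  runTurn: () => Promise<TurnResult>,
): Promise<TurnOutcome> {
  const first = await runTurn();
  if (!isCompactionOnlyStop(first)) {
    return { result: first, retried: false, paused: false };
  }
  const second = await runTurn();
  return { result: second, retried: true, paused: isCompactionOnlyStop(second) };
}
```

Capping the retry at one attempt avoids a loop in the case where the server keeps pausing after every compaction.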

Focused tests:

  • pnpm test -- --run src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/anthropic-payload-policy.test.ts

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉

@100yenadmin
Contributor Author

100yenadmin commented Apr 12, 2026

@jalehman this is what I use locally as best practice for a much better experience with Anthropic models. It's technically two features, the most valuable of which is the mixed cache system that Anthropic is working on to cut costs for users (1h for durable system prompting and 5m for content that moves in and out quickly, task-wise).

This does NOT add this feature to native compaction or touch anything unless you're using an Anthropic model. It also works hand in hand with LCM to still save durable memory and act as the context engine, but it does NOT run compaction or damage the cache.

In my initial testing so far, it saves over 50% of cache write cost. It can make Anthropic more "usable" and affordable for OpenClaw users who are now off Max.

@jalehman jalehman self-assigned this Apr 12, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b566c58ab7

Comment thread src/agents/pi-embedded-runner/run.ts
@100yenadmin 100yenadmin changed the title from "Add Anthropic native compaction and mixed cache TTLs" to "compaction + caches: Add Anthropic native compaction and mixed cache TTLs" Apr 13, 2026
@100yenadmin 100yenadmin changed the title from "compaction + caches: Add Anthropic native compaction and mixed cache TTLs" to "compaction + caches: add Anthropic native compaction and mixed cache TTLs" Apr 13, 2026
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

if (contentBlock?.type === "compaction") {
const block: TransportContentBlock = {
type: "compaction",
content: typeof contentBlock.content === "string" ? contentBlock.content : "",
Copilot AI Apr 16, 2026

In content_block_start for compaction, non-string content_block.content is currently coerced to an empty string. If Anthropic sends content: null (or omits content entirely) and no subsequent compaction_delta arrives, this silently changes the semantics from null to "" and makes null round-tripping inconsistent with the declared string | null type. Consider initializing content as null when the source value isn’t a string (and preserving explicit null).

Suggested change
- content: typeof contentBlock.content === "string" ? contentBlock.content : "",
+ content: typeof contentBlock.content === "string" ? contentBlock.content : null,
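
To show why preserving `null` matters, here is a minimal sketch of a stream reducer that starts a compaction block with `null` content and only turns it into a string when a `compaction_delta` actually arrives. The event names `content_block_start` and `compaction_delta` appear in the comment above; the simplified event shapes, the `delta.content` field, and the name `reduceCompaction` are assumptions for illustration.

```typescript
interface CompactionBlock {
  type: "compaction";
  content: string | null;
}

// Simplified event shapes; the delta payload field name is an assumption.
type StreamEvent =
  | { type: "content_block_start"; content_block: { type: string; content?: unknown } }
  | { type: "content_block_delta"; delta: { type: string; content?: unknown } };

// Hypothetical reducer: folds stream events into a single compaction block.
function reduceCompaction(events: StreamEvent[]): CompactionBlock | undefined {
  let block: CompactionBlock | undefined;
  for (const event of events) {
    if (event.type === "content_block_start") {
      if (event.content_block.type === "compaction") {
        const initial = event.content_block.content;
        // Preserve null (or a missing value) instead of coercing to "".
        block = { type: "compaction", content: typeof initial === "string" ? initial : null };
      }
    } else if (event.delta.type === "compaction_delta" && block) {
      const piece = event.delta.content;
      if (typeof piece === "string") {
        block.content = (block.content ?? "") + piece;
      }
    }
  }
  return block;
}
```

With this shape, a block that never receives a delta round-trips as `null`, matching the declared `string | null` type.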

@100yenadmin
Contributor Author

Due to the limit on my open PRs and my focus on GPT 5.4 improvements, I'm closing this. @jalehman, since you're handling it, just copy it yourself and/or reopen it when you get to it. I need the extra slot for now.

Labels

agents (Agent runtime and tooling), size: XL

Development

Successfully merging this pull request may close these issues.

Add Anthropic native compaction and mixed prompt-cache TTL support

3 participants