
fix(openrouter): gate prompt cache markers by endpoint #60761

Merged
vincentkoc merged 2 commits into main from fix/openrouter-cache-endpoint-gating on Apr 4, 2026
@vincentkoc

Member

AI-assisted: Codex
Testing: focused test + build
Session log: available in Codex session history

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: OpenRouter prompt-cache cache_control markers were keyed to provider/plugin identity instead of endpoint classification.
  • Why it matters: provider: openrouter on a custom OpenAI-compatible proxy could receive OpenRouter/Anthropic-only payload mutations, while custom provider ids pointed at native OpenRouter missed the cache optimization.
  • What changed: cache marker wrapping now runs in the generic post-plugin path and gates on shared request policy so only native/default OpenRouter routes receive the mutation.
  • What did NOT change (scope boundary): reasoning/routing behavior, non-Anthropic models, and non-OpenRouter transports.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: the cache wrapper was only installed by the OpenRouter plugin and only checked provider === "openrouter", so it diverged from the transport stack's endpoint-based OpenRouter detection.
  • Missing detection / guardrail: there was no direct regression test covering default OpenRouter, custom proxy URL, and custom provider id on native OpenRouter as separate cases.
  • Contributing context (if known): other OpenRouter-specific behavior already used endpoint classification, so the cache path drifted from the shared policy.
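
To make the root cause concrete, here is a minimal sketch contrasting the old provider-id gate with an endpoint-class gate. It is not the repository's code: `RouteConfig`, `hostnameOf`, `classifyEndpoint`, `gateByProviderId`, and `gateByEndpointClass` are hypothetical names, and recognizing native OpenRouter by hostname is an assumption standing in for the shared request policy.

```typescript
// Hypothetical types and helpers for illustration only; not the repository's API.
type EndpointClass = "openrouter" | "default" | "custom";

interface RouteConfig {
  provider: string;
  baseUrl?: string; // undefined means the provider's default route
}

// Extract a lowercase hostname from an http(s) URL without relying on platform globals.
function hostnameOf(baseUrl: string): string | undefined {
  const match = /^https?:\/\/([^/:?#]+)/i.exec(baseUrl.trim());
  return match?.[1]?.toLowerCase();
}

// Assumption: native OpenRouter is identified by its hostname.
function classifyEndpoint(baseUrl?: string): EndpointClass {
  if (!baseUrl) return "default";
  const host = hostnameOf(baseUrl);
  if (host === "openrouter.ai" || (host !== undefined && host.endsWith(".openrouter.ai"))) {
    return "openrouter";
  }
  return "custom"; // anything else, including unparseable URLs
}

// Old gate, as described in the root cause: provider id only.
function gateByProviderId(route: RouteConfig): boolean {
  return route.provider === "openrouter";
}

// New gate: the endpoint class decides; the provider id matters only on the default route.
function gateByEndpointClass(route: RouteConfig): boolean {
  const endpointClass = classifyEndpoint(route.baseUrl);
  return (
    endpointClass === "openrouter" ||
    (endpointClass === "default" && route.provider === "openrouter")
  );
}
```

Under these assumptions the two gates agree on the builtin default route and disagree on the other two cases: `{ provider: "openrouter", baseUrl: "https://proxy.example.com/v1" }` passes only the old gate, while a custom provider id with `baseUrl: "https://openrouter.ai/api/v1"` passes only the new one.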

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts
  • Scenario the test should lock in: default-route OpenRouter keeps Anthropic cache markers, custom proxy URLs do not get the mutation, and custom provider ids on native OpenRouter still get the mutation.
  • Why this is the smallest reliable guardrail: the behavior is determined in the stream-wrapper/policy boundary before any live provider call.
  • Existing test that already covers this (if any): the same file already covered header behavior on the OpenRouter wrapper.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • Native/default OpenRouter routes keep Anthropic prompt-cache cache_control markers.
  • Custom OpenAI-compatible proxy deployments declared as OpenRouter stop receiving OpenRouter-only cache payload mutations.
  • Custom provider ids pointed at native OpenRouter now get the same cache marker behavior as the builtin OpenRouter plugin.
  • Changelog entry added under the active release block: Thanks @vincentkoc.

Diagram (if applicable)

Before:
[provider=openrouter] -> [cache wrapper checks provider id only] -> [custom proxy may get OpenRouter-only cache mutation]
[custom provider id -> openrouter.ai] -> [no openrouter plugin wrapper] -> [cache mutation missing]

After:
[request policy resolves endpoint class] -> [native/default OpenRouter only] -> [cache mutation applied when Anthropic-compatible]
[custom proxy endpoint] -> [not classified as OpenRouter] -> [payload left untouched]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS 15.6.1
  • Runtime/container: local shared Codex worktree, Node v25.8.2, pnpm 10.32.1
  • Model/provider: OpenRouter + Anthropic-model request path
  • Integration/channel (if any): N/A
  • Relevant config (redacted): builtin openrouter provider on default route vs custom baseUrl, plus custom provider id targeting https://openrouter.ai/api/v1

Steps

  1. Send an Anthropic-model request through builtin OpenRouter on the default route.
  2. Send the same shape through provider: openrouter with a custom OpenAI-compatible baseUrl.
  3. Send the same shape through a custom provider id whose baseUrl is native OpenRouter.

Expected

  • Step 1 keeps prompt-cache markers.
  • Step 2 does not inject OpenRouter-only cache markers into proxy traffic.
  • Step 3 keeps prompt-cache markers despite the custom provider id.

Actual

  • Matches expected with the new wrapper gating and regression tests.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Verification snippets:

  • pnpm test src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts
  • pnpm build
  • pnpm check currently fails in untouched files on this branch (src/agents/pi-embedded-runner/compact.hooks.harness.ts, src/agents/pi-embedded-runner/run/setup.ts) with existing TS2883 portability errors; those files are not part of this diff.

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: focused regression coverage for default OpenRouter, custom proxy URL, and custom provider id on native OpenRouter; post-rebase pnpm build passes.
  • Edge cases checked: non-Anthropic requests remain unmodified by the cache wrapper gate; OpenRouter plugin wrapper composition still works with the generic post-plugin wrapper path.
  • What you did not verify: live network calls to OpenRouter or a third-party proxy; full pnpm check is blocked by untouched TS2883 errors outside this change.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: unusual provider configs could still be misclassified if request-policy detection changes independently later.
    • Mitigation: the cache path now reuses the shared policy helper and has direct regression coverage for the three relevant endpoint classes.

@vincentkoc self-assigned this on Apr 4, 2026
@openclaw-barnacle (bot) added the agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) labels on Apr 4, 2026
@vincentkoc marked this pull request as ready for review on April 4, 2026 08:29
@greptile-apps

greptile-apps Bot commented Apr 4, 2026

Contributor

Greptile Summary

This PR fixes a bug where OpenRouter Anthropic prompt-cache cache_control markers were keyed to the provider plugin identity rather than the actual endpoint classification. It moves the cache marker wrapper from the OpenRouter plugin's wrapStreamFn into the generic applyPostPluginStreamWrappers path in extra-params.ts, and gates the mutation on resolveProviderRequestPolicy's endpointClass so only genuine OpenRouter routes (endpoint class "openrouter" or "default" with provider "openrouter") receive the cache payload mutation.

Key changes:

  • src/agents/pi-embedded-runner/proxy-stream-wrappers.ts: New createOpenRouterSystemCacheWrapper function that resolves endpoint class via resolveProviderRequestPolicy and gates on both isAnthropicModelRef(modelId) and the resolved endpoint class before applying applyAnthropicEphemeralCacheControlMarkers.
  • src/agents/pi-embedded-runner/extra-params.ts: applyPostPluginStreamWrappers now unconditionally installs createOpenRouterSystemCacheWrapper as the outermost wrapper; it's a no-op for non-OpenRouter, non-Anthropic model combinations.
  • extensions/openrouter/index.ts: The wrapStreamFn no longer includes the cache wrapper — only attribution headers + reasoning normalization remain in the plugin path.
  • src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts: Adds three focused regression tests covering the default OpenRouter route, a custom proxy URL, and a custom provider id pointing at the native OpenRouter host.
  • CHANGELOG.md: Entry appended to the end of the Unreleased ### Fixes block.
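
The wrapper behavior listed above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Payload`, `StreamFn`, `markSystemPromptCacheable`, and `createSystemCacheWrapper` are hypothetical names, the stream function is made synchronous for brevity, the Anthropic check is reduced to an assumed `anthropic/` model-id prefix, and placing the marker on the last text part of the system message is an assumption about marker placement. The `cache_control: { type: "ephemeral" }` shape is the Anthropic prompt-caching marker.

```typescript
// Hypothetical, simplified stand-ins for the runner's payload and stream function.
interface TextPart {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string | TextPart[];
}
interface Payload {
  model: string;
  messages: ChatMessage[];
}
type StreamFn = (payload: Payload) => Payload; // real stream functions are async

// Return a copy of the payload whose system messages carry an ephemeral cache marker
// on their last text part (assumed placement). The input payload is not mutated.
function markSystemPromptCacheable(payload: Payload): Payload {
  return {
    ...payload,
    messages: payload.messages.map((message) => {
      if (message.role !== "system") return message;
      const parts: TextPart[] =
        typeof message.content === "string"
          ? [{ type: "text", text: message.content }]
          : message.content;
      const marked = parts.map((part, index) =>
        index === parts.length - 1
          ? { ...part, cache_control: { type: "ephemeral" as const } }
          : part,
      );
      return { ...message, content: marked };
    }),
  };
}

// Outermost wrapper: passes the payload through unchanged unless both gates hold.
function createSystemCacheWrapper(inner: StreamFn, isNativeOpenRouter: boolean): StreamFn {
  return (payload) => {
    const isAnthropic = payload.model.toLowerCase().startsWith("anthropic/"); // assumed check
    return inner(isNativeOpenRouter && isAnthropic ? markSystemPromptCacheable(payload) : payload);
  };
}
```

Because the wrapper hands the original payload object to the inner function whenever either gate fails, installing it unconditionally costs only the two checks on non-OpenRouter or non-Anthropic requests.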

Confidence Score: 4/5

Safe to merge; the fix is logically correct, well-scoped, and backed by targeted regression tests.

The core change is sound: endpoint class from resolveProviderRequestPolicy is the right signal to gate cache marker injection, and the three new tests lock in all the relevant scenarios. The wrapper is now unconditionally installed for all providers in applyPostPluginStreamWrappers but its fast-path is cheap. The only deductions are a minor style issue with test model constants and the fact that isOpenRouterAnthropicModelRef may now be unused in this path.

src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts — update test model constants per repo guideline. Optionally audit isOpenRouterAnthropicModelRef in anthropic-family-cache-semantics.ts for any now-dead callers.

Path: src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts
Line: 57

Comment:
**Test model constant out of sync with repo guideline**

`CLAUDE.md` says: _"When tests need example Anthropic/OpenAI model constants, prefer `sonnet-4.6` and `gpt-5.4`; update older Anthropic/GPT examples when you touch those tests."_ The same model string appears in all three new test cases (`anthropic/claude-sonnet-4`). Since these are OpenRouter-prefixed model refs the standalone `sonnet-4.6` constant doesn't apply directly, but the spirit of the rule is to keep test constants current. Consider using `anthropic/claude-sonnet-4-5` (or whatever the current canonical OpenRouter Anthropic path is) so the tests don't require updating again soon.

This applies to the same `id` field at lines 57, 82, and 106.


Reviews (1): Last reviewed commit: "Merge branch 'main' into fix/openrouter-..."

Comment thread src/agents/pi-embedded-runner/proxy-stream-wrappers.test.ts
@vincentkoc force-pushed the fix/openrouter-cache-endpoint-gating branch from b0625de to 99c0c30 on April 4, 2026 10:30
@vincentkoc merged commit 0a3211d into main on Apr 4, 2026
25 of 32 checks passed
@vincentkoc deleted the fix/openrouter-cache-endpoint-gating branch on April 4, 2026 10:32
KimGLee pushed a commit to KimGLee/openclaw that referenced this pull request Apr 4, 2026
* fix(openrouter): gate prompt cache markers by endpoint

* test(openrouter): use claude sonnet 4.6 cache model
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
* fix(openrouter): gate prompt cache markers by endpoint

* test(openrouter): use claude sonnet 4.6 cache model
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
* fix(openrouter): gate prompt cache markers by endpoint

* test(openrouter): use claude sonnet 4.6 cache model
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
* fix(openrouter): gate prompt cache markers by endpoint

* test(openrouter): use claude sonnet 4.6 cache model
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
* fix(openrouter): gate prompt cache markers by endpoint

* test(openrouter): use claude sonnet 4.6 cache model
