Description
Summary
contextPruning with mode: "cache-ttl" only works with Anthropic models due to a hardcoded check in cache-ttl.ts. Both GLM-5 (ZAI API) and Kimi K2 (Moonshot API) support automatic prompt/context caching natively at the API level and should be supported as well.
Problem to solve
Users running GLM-5 via ZAI or Kimi K2 via Moonshot cannot benefit from contextPruning even though both providers support native context caching. GLM-5 automatically caches repeated context prefixes at $0.20/M (1/5 of the standard input price), and Kimi K2 supports prompt caching on Groq. The current hardcoded check in isCacheTtlEligibleProvider() only allows anthropic and openrouter+anthropic/* combinations, silently ignoring the setting for all other providers, with no warning to the user.
Proposed solution
Extend isCacheTtlEligibleProvider() in cache-ttl.ts to include:
- provider: "zai" (GLM-5)
- provider: "moonshot" or "moonshotai" (Kimi K2, Kimi K2.5)
Optionally: add a warning log when contextPruning mode is set to "cache-ttl" but the provider is not supported, so users are aware the setting has no effect.
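A minimal sketch of the proposed change, combining the provider allowlist extension with the optional warning. The provider IDs ("zai", "moonshot", "moonshotai") and the existing anthropic/openrouter logic are assumptions about cache-ttl.ts internals, not the actual source:

```typescript
// Sketch of the proposed isCacheTtlEligibleProvider(); names and the
// existing anthropic/openrouter branch are assumed, not taken from source.
const CACHE_TTL_PROVIDERS = new Set(["anthropic", "zai", "moonshot", "moonshotai"]);

function isCacheTtlEligibleProvider(provider: string, model: string): boolean {
  if (CACHE_TTL_PROVIDERS.has(provider)) return true;
  // OpenRouter remains eligible only when routing to an Anthropic model.
  if (provider === "openrouter" && model.startsWith("anthropic/")) return true;
  // Optional warning so the setting no longer fails silently.
  console.warn(
    `contextPruning mode "cache-ttl" has no effect for provider "${provider}" (model "${model}")`
  );
  return false;
}
```

Keeping the allowlist in a single Set makes future provider additions a one-line change and keeps the OpenRouter special case visible.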
Alternatives considered
Keeping the Anthropic-only restriction and asking users to manually structure prompts for better caching. This is impractical: it requires deep knowledge of each provider's caching internals and cannot be enforced consistently across agents.
Impact
Affected: All users running OpenClaw with non-Anthropic providers (ZAI, Moonshot)
Severity: Medium — contextPruning setting silently has no effect, users believe it is working
Frequency: Always — affects every request when using GLM-5 or Kimi models
Consequence: No context trimming occurs, token usage grows unbounded in long sessions, higher API costs
Evidence/examples
GLM-5 native caching docs: https://docs.z.ai — cached input priced at $0.20/M (1/5 of standard)
Kimi K2 prompt caching on Groq: https://groq.com/blog/introducing-prompt-caching-on-groqcloud
Current config that silently does nothing with GLM-5:
```json
"contextPruning": {
  "mode": "cache-ttl",
  "ttl": "1h",
  "keepLastAssistants": 2,
  "hardClearRatio": 0.4
}
```
Hardcoded check in src/agents/pi-embedded-runner/cache-ttl.ts:
isCacheTtlEligibleProvider("zai", "glm-5") → always returns false
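For reference, a hypothetical reconstruction of the current check; the real cache-ttl.ts may differ in naming and structure:

```typescript
// Hypothetical reconstruction of the current hardcoded check in
// cache-ttl.ts (assumed from the behavior described above, not verified).
function isCacheTtlEligibleProvider(provider: string, model: string): boolean {
  return (
    provider === "anthropic" ||
    (provider === "openrouter" && model.startsWith("anthropic/"))
  );
}
```

With this check, any non-Anthropic provider is rejected and cache-ttl pruning is silently skipped for the request.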
Additional information
No response