[Feature]: Add ZAI (GLM-5) and Moonshot (Kimi) providers to cache-ttl context pruning #24497

@alexorbuno

Description

Summary

contextPruning with mode: "cache-ttl" only works with Anthropic models because of a hardcoded provider check in cache-ttl.ts. Both GLM-5 (ZAI API) and Kimi K2 (Moonshot API) support automatic prompt/context caching natively at the API level, so they should be supported as well.

Problem to solve

Users running GLM-5 via ZAI or Kimi K2 via Moonshot cannot benefit from contextPruning even though both providers support native context caching. GLM-5 automatically caches repeated context prefixes at $0.20/M (1/5 of the standard price), and Kimi K2 supports prompt caching on Groq. The current hardcoded check in isCacheTtlEligibleProvider() only allows anthropic and openrouter + anthropic/* combinations, silently ignoring the setting for every other provider, with no warning to the user.

Proposed solution

Extend isCacheTtlEligibleProvider() in cache-ttl.ts to include:

  • provider: "zai" (GLM-5)
  • provider: "moonshot" or "moonshotai" (Kimi K2, Kimi K2.5)

Optionally: add a warning log when contextPruning mode is set to "cache-ttl" but the provider is not supported, so users are aware the setting has no effect.
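A minimal sketch of what the extended check (plus the optional warning) could look like. This is an assumption-based illustration: the actual signature in src/agents/pi-embedded-runner/cache-ttl.ts may differ, and the CACHE_TTL_PROVIDERS and warnIfCacheTtlIneligible names are hypothetical.

```typescript
// Sketch only: provider IDs and signature are inferred from the behavior
// described in this issue, not taken from the actual cache-ttl.ts source.
const CACHE_TTL_PROVIDERS = new Set(["anthropic", "zai", "moonshot", "moonshotai"]);

function isCacheTtlEligibleProvider(provider: string, model: string): boolean {
  if (CACHE_TTL_PROVIDERS.has(provider)) return true;
  // Anthropic models routed through OpenRouter stay eligible, as today.
  return provider === "openrouter" && model.startsWith("anthropic/");
}

// Optional improvement: surface a warning instead of silently ignoring
// the cache-ttl setting for unsupported providers.
function warnIfCacheTtlIneligible(provider: string, model: string): void {
  if (!isCacheTtlEligibleProvider(provider, model)) {
    console.warn(
      `contextPruning mode "cache-ttl" has no effect for provider "${provider}" (model "${model}")`
    );
  }
}
```

A Set keeps the eligibility list in one place, so adding future caching-capable providers is a one-line change.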

Alternatives considered

Keeping the Anthropic-only restriction and asking users to manually structure prompts for better caching. This is impractical: it requires deep knowledge of each provider's caching internals and cannot be enforced consistently across agents.

Impact

Affected: All users running OpenClaw with non-Anthropic providers (ZAI, Moonshot)
Severity: Medium — contextPruning setting silently has no effect, users believe it is working
Frequency: Always — affects every request when using GLM-5 or Kimi models
Consequence: No context trimming occurs, token usage grows unbounded in long sessions, higher API costs

Evidence/examples

GLM-5 native caching docs: https://docs.z.ai — cached input priced at $0.20/M (1/5 of standard)
Kimi K2 prompt caching on Groq: https://groq.com/blog/introducing-prompt-caching-on-groqcloud

Current config that silently does nothing with GLM-5:

"contextPruning": {
  "mode": "cache-ttl",
  "ttl": "1h",
  "keepLastAssistants": 2,
  "hardClearRatio": 0.4
}

Hardcoded check in src/agents/pi-embedded-runner/cache-ttl.ts:
isCacheTtlEligibleProvider("zai", "glm-5") → always returns false
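For reference, the current behavior can be approximated as follows. This is a hypothetical reconstruction from the behavior described above, not the actual source of cache-ttl.ts:

```typescript
// Hypothetical reconstruction of the current hardcoded check; the real
// implementation may differ in detail.
function isCacheTtlEligibleProvider(provider: string, model: string): boolean {
  if (provider === "anthropic") return true;
  if (provider === "openrouter" && model.startsWith("anthropic/")) return true;
  return false; // "zai", "moonshot", etc. fall through silently, with no warning
}
```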

