feat(cache): per-agent params for cacheRetention control (#17112) #17470
rrenamed wants to merge 1 commit into openclaw:main
Add optional `params` field to AgentConfig so individual agents can override stream params like cacheRetention. Agent params merge on top of global model defaults, allowing low-traffic agents (e.g., cron jobs) to disable caching and avoid wasted cache write costs. Fixes openclaw#17112
|
This pull request has been automatically marked as stale due to inactivity. |
|
Closing as covered by landed equivalent changes. Implemented and landed:
This delivers per-agent cache tuning (e.g. |
* upstream/main: (1467 commits)
  fix(doctor): use gateway health status for memory search key check (openclaw#22327)
  refactor: harden reset notice + cron delivery target flow
  refactor(exec): simplify env-prefixed wrapper modifier check
  fix(skills): support multiline frontmatter fallback without PyYAML
  fix(skills): make quick_validate work without PyYAML
  fix(exec): bind env-prefixed shell wrappers to full approval text
  fix(browser): derive relay auth token from gateway token in Chrome extension
  Browser relay: accept raw gateway token in extension auth
  fix(gateway): include platform and reason in node command rejection error
  CLI: fix gateway restart health ownership for child listener pids (openclaw#24696)
  docs: detail per-agent prompt caching configuration
  fix(config): tighten bedrock cache-retention type narrowing
  feat(agents): add per-agent stream params overrides for cache tuning (openclaw#17470) (thanks @rrenamed)
  fix(providers): support Bedrock Anthropic cacheRetention defaults/pass-through (openclaw#22303) (thanks @snese)
  fix(providers): disable Bedrock prompt caching for non-Anthropic models (openclaw#20866) (thanks @pierreeurope)
  docs(changelog): note /new and /reset auth-label removal (openclaw#24409)
  fix(reply): omit auth labels in /new and /reset
  docs(changelog): correct kimi issue references
  test(tools): fix kimi web_search mock typing
  feat(media): add moonshot video provider and wiring
  ...

# Conflicts:
#   ui/src/ui/app-render.ts
#   ui/src/ui/controllers/agents.ts
|
Are the available params documented anywhere? And how would you set it for cron jobs specifically? |
|
@BillChirico Docs are here: https://docs.openclaw.ai/reference/prompt-caching and the config reference covers the schema too. For cron jobs, they inherit params from whichever agent they run on. So you'd set it on the agent in If your crons don't target a specific agent you can also set it per-model under |
|
Ah, so I'd make a specific cron agent @rrenamed, thank you! Do you have any other recommendations on how to use this to save tokens or anything else? Different types of agents or something. |
|
Depends on your setup but the main win is disabling cache on anything low-traffic. We have a couple of cron agents that fire every 30min to 1h with small prompts, setting cacheRetention: "none" on those saved us roughly 25% on daily API spend since the cache writes were just expiring unused. For high-traffic agents (main orchestrator, anything getting frequent messages) keep the default caching, that's where it actually pays off. |
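To make the "cache writes expiring unused" point concrete, here is a rough back-of-the-envelope sketch. It assumes a cache write is billed at a premium over plain input tokens, and that a low-traffic agent never reads its cache back before it expires. The function name, the prompt size, and the base input price are all illustrative assumptions; only the $1.25/MTok write price and the 30-minute interval come from this PR.

```typescript
// All inputs are illustrative assumptions, not measured values.
interface CacheCostInput {
  promptTokens: number; // tokens written to the cache on each run (assumed)
  runsPerDay: number; // e.g. 48 for a job that fires every 30 minutes
  cacheWritePerMTok: number; // USD per million tokens written to cache
  baseInputPerMTok: number; // USD per million uncached input tokens (assumed)
}

// Extra daily spend caused by cache writes that are never read back.
function wastedCacheWriteUsdPerDay(c: CacheCostInput): number {
  const mTokPerDay = (c.promptTokens * c.runsPerDay) / 1e6;
  return mTokPerDay * (c.cacheWritePerMTok - c.baseInputPerMTok);
}

const cronAgent: CacheCostInput = {
  promptTokens: 10_000,
  runsPerDay: 48,
  cacheWritePerMTok: 1.25, // figure quoted in the PR description
  baseInputPerMTok: 1.0, // hypothetical base price for illustration
};

console.log(wastedCacheWriteUsdPerDay(cronAgent).toFixed(2)); // "0.12"
```

The absolute number depends entirely on prompt size and pricing, so treat this as a way to estimate your own overhead rather than a prediction of savings.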
|
Other than that, model routing helps more than cache tuning honestly. We run Haiku for simple cron jobs (daily reports, health checks) and only use Opus/Sonnet for the agents that need it. That alone cut our costs way more than the cache stuff did. We went from like $8/day to around $2-3/day just from routing the right models to the right jobs. @BillChirico Hope this helps:) |
That helps a ton, thank you! You're able to route models at that low a level?
Yeah, each agent and cron can have its own model. In your config:
{
  "agents": {
    "list": [
      {
        "id": "main",
        "model": "openrouter/openai/gpt-5.2"
      },
      {
        "id": "daily-report",
        "model": "ollama/llama3:8b"
      }
    ]
  }
}
For crons you set the model directly when you create it:
So you can mix and match however you want. Heavy reasoning on Opus or GPT-5.2, coding on Sonnet or Codex, lightweight crons on Haiku or a local Ollama model. Each one gets its own model + params independently.
One thing that's worked well for us is having a second agent validate the first one's output. Like our main agent runs on one model and a separate tester agent on a cheaper one double-checks its work. Catches a lot of stuff the primary misses. |
|
@rrenamed Can I contact you on Discord? This is a huge help and just have a couple more questions. My Discord is Bapes |
|
@rrenamed I accidentally ignored your Discord invitation. Just sent you another one! |
PR openclaw#17470 added per-agent `params` overrides to the TypeScript type (`AgentConfig`) and runtime logic (`extra-params.ts`) but missed adding the field to the Zod validation schema. Because `AgentEntrySchema` uses `.strict()`, any `params` key in the agent config is silently rejected at parse time, making the feature unusable.

Add `params: z.record(z.string(), z.unknown()).optional()` to `AgentEntrySchema` to match the existing type definition.

Closes openclaw#25903

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
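The failure mode in that commit message can be illustrated with a small, dependency-free sketch. This is a hand-rolled stand-in for strict object validation, not Zod itself and not the project's `AgentEntrySchema`; the `strictParse` helper and the `["id", "model"]` key list are hypothetical and exist only to show why a key missing from a strict schema makes an otherwise valid config fail.

```typescript
// Hand-rolled stand-in for strict object validation; this is NOT Zod.
type ParseResult =
  | { success: true }
  | { success: false; unknownKeys: string[] };

// Rejects any input key that is not explicitly allowed.
function strictParse(
  allowedKeys: readonly string[],
  input: Record<string, unknown>,
): ParseResult {
  const unknownKeys = Object.keys(input).filter((k) => !allowedKeys.includes(k));
  return unknownKeys.length === 0
    ? { success: true }
    : { success: false, unknownKeys };
}

const agentEntry = { id: "risk-reviewer", params: { cacheRetention: "none" } };

// Before the fix: the (hypothetical) key list lacks `params`, so the entry fails.
const before = strictParse(["id", "model"], agentEntry);
// After the fix: `params` is an accepted, optional key.
const after = strictParse(["id", "model", "params"], agentEntry);

console.log(before.success, after.success); // false true
```

In Zod terms, the corresponding change is the single line quoted in the commit message: `params: z.record(z.string(), z.unknown()).optional()`.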
Summary
Fixes #17112. There is no way to set `cacheRetention` per agent. Low-traffic agents (e.g., cron jobs running every 30 min) waste money on cache writes ($1.25/MTok) that expire before the next request.

Root Cause
`resolveExtraParams()` only reads from global model defaults (`agents.defaults.models["provider/model"].params`). There is no per-agent override, so all agents sharing the same model get identical cache settings.

Fix
Add an optional `params` field to `AgentConfig` and resolve it in `resolveExtraParams()`:
Agent params merge on top of global model defaults. Three files changed, no new dependencies.
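The merge order described above can be sketched roughly as follows. This is a minimal illustration under assumed, simplified config shapes, not the project's actual `resolveExtraParams()` implementation: the `resolveExtraParamsSketch` name, its signature, and the `"long"` placeholder value are hypothetical. Only the `params` fields, the `agents.defaults.models[...]` path, and `cacheRetention: "none"` come from this PR.

```typescript
// Hypothetical, simplified config shapes for illustration only.
type StreamParams = Record<string, unknown>;

interface AgentConfig {
  id: string;
  model?: string;
  params?: StreamParams; // per-agent override described in this PR
}

interface Config {
  agents: {
    defaults?: { models?: Record<string, { params?: StreamParams }> };
    list?: AgentConfig[];
  };
}

// Global model defaults first, then agent params on top.
function resolveExtraParamsSketch(
  cfg: Config,
  modelKey: string, // e.g. "provider/model"
  agentId?: string,
): StreamParams | undefined {
  const globalParams = cfg.agents.defaults?.models?.[modelKey]?.params;
  const agentParams = agentId
    ? cfg.agents.list?.find((a) => a.id === agentId)?.params
    : undefined;
  if (!globalParams && !agentParams) return undefined;
  // The later spread wins, so agent values override global ones on conflict.
  return { ...globalParams, ...agentParams };
}

const exampleCfg: Config = {
  agents: {
    defaults: {
      models: {
        // "long" is a placeholder; this thread only confirms the value "none".
        "provider/model": { params: { temperature: 0.2, cacheRetention: "long" } },
      },
    },
    list: [{ id: "risk-reviewer", params: { cacheRetention: "none" } }],
  },
};

console.log(resolveExtraParamsSketch(exampleCfg, "provider/model", "risk-reviewer"));
console.log(resolveExtraParamsSketch(exampleCfg, "provider/model"));
```

With an `agentId`, the agent's `cacheRetention` replaces the global one while `temperature` is inherited; without one, the global defaults are returned unchanged, which matches the behavior when no agent is specified.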
Config example:
{
  "agents": {
    "list": [
      { "id": "risk-reviewer", "params": { "cacheRetention": "none" } }
    ]
  }
}

Test plan

Local Validation
`pnpm build` ✅
`pnpm check` (format + tsgo + lint) ✅

Contribution Checklist
pnpm build && pnpm check)
AI-assisted (Claude). Reviewed and tested by human.
Greptile Summary
Adds per-agent `params` override support to `resolveExtraParams`, allowing individual agents to customize stream parameters (e.g., `cacheRetention`, `temperature`) that previously could only be set globally per model. Agent-level params merge on top of global model defaults, with agent values winning on conflict.
- `params?: Record<string, unknown>` field to `AgentConfig` type
- `resolveExtraParams` and `applyExtraParamsToAgent` with optional `agentId` parameter
- `sessionAgentId` through at the call site in `attempt.ts`
- `agentId` is not provided

Confidence Score: 5/5
Last reviewed commit: 8c13172