Skip to content

Commit 235e3c2

Browse files
committed
fix(models-config): close cache-vs-config drift, hash fingerprint, +tests
Three review-driven fixes per @zeroaltitude direction (b)+(c) + secret hygiene + tests: (b) Validate disk-vs-config before short-circuiting [Aisle High #2 / Codex P1 / Greptile P1 on PR #72869] The previous targetProvider short-circuit fired whenever the on-disk provider entry contained ANY non-empty credential. That silently bypassed: - rotated apiKey: cold start with new key, old key on disk, short-circuit fires, all calls fail until something else invalidates - attacker-tampered baseUrl: redirect to exfil endpoint kept - attacker-injected headers: arbitrary auth material kept New readExistingProviderMatchesConfig() does a strict structural comparison: apiKey - resolved through resolveSecretInputRef (env-ref expansion via createConfigRuntimeEnv) before string equality vs. disk baseUrl - exact string equality headers - stable structural equality (key-order independent) auth - stable structural equality Any mismatch (or any state we cannot conclusively verify, like a non-env secret ref) returns false and falls through to full planning. The short-circuit is now safe to use on cold start and after gateway restart. (c) Hash models.json content into the cache key [Codex P1 on PR #72869] Previous fingerprint had no models.json input \u2014 once the cache was populated, unchanged config/auth returned cached success even after the file was edited externally / partially corrupted / manually tampered. Now readyCache stores both the input fingerprint AND the post-write models.json SHA-256. Cache hit requires both to match; any external edit invalidates. Captures the hash at three points (skip path, noop path, write path) so the second factor is always recorded. Aisle medium #5: hash fingerprint before storage Raw stable-stringified config (including apiKey strings) used to sit verbatim in MODELS_JSON_STATE.readyCache. SHA-256 over the canonical payload is now the cache key \u2014 deterministic but not reversible, so heap snapshots / debug telemetry / core dumps can't leak secrets via the readyCache state. Greptile P2: targetProvider short-circuit tests New file models-config.target-provider-short-circuit.test.ts with 6 cases: - hit-on-match (full structural match short-circuits) - miss-on-rotated-key (config apiKey change forces plan) - miss-on-baseUrl-change (tampered disk baseUrl rejects) - miss-on-tampered-headers (any header drift rejects) - miss-on-cold-cache (no disk file forces plan) - hit-after-warm-fingerprint + invalidation on external models.json edit (modelsJsonHash second-factor verified) Existing fingerprint-cache test updated: The 'volatile fields rotate' test mixed type:oauth (correctly volatile) with type:token (now correctly NOT volatile after d505fa0). Split into two tests: - OAuth session-field rotation does NOT invalidate (existing intent, narrowed to oauth-only profiles) - Static type:token credential rotation DOES invalidate (Codex/Greptile P2 - new correct behavior) State shape change: MODELS_JSON_STATE.readyCache value extended with modelsJsonHash: { fingerprint, modelsJsonHash, result } All three return paths in the plan closure capture this. Tests: 13/13 (6 new + 7 existing fingerprint-cache + file-mode). Lint: 0 errors. TS: clean.
1 parent d505fa0 commit 235e3c2

4 files changed

Lines changed: 489 additions & 45 deletions

File tree

src/agents/models-config-state.ts

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,28 @@
11
const MODELS_JSON_STATE_KEY = Symbol.for("openclaw.modelsJsonState");
22

3+
/**
4+
* Cache entry shape captured at write/plan completion. The `fingerprint` is
5+
* a SHA-256 hex digest of the canonical input shape (config + auth-profiles
6+
* stable hash + plugin metadata) — NOT the raw stable-stringified payload.
7+
* Hashing it before storage keeps raw secrets out of process memory
8+
* (Aisle medium #5 on PR #72869) so heap snapshots / debug telemetry / core
9+
* dumps cannot leak `apiKey` material via the readyCache.
10+
*
11+
* `modelsJsonHash` is captured immediately after the plan-and-write
12+
* completes successfully. The cache check verifies that the current
13+
* on-disk models.json still hashes to this value before treating the
14+
* entry as a hit (Codex P1 on PR #72869). Any external edit / partial
15+
* corruption / manual tamper changes the hash and invalidates the cache.
16+
*/
317
type ModelsJsonState = {
418
writeLocks: Map<string, Promise<void>>;
519
readyCache: Map<
620
string,
7-
Promise<{ fingerprint: string; result: { agentDir: string; wrote: boolean } }>
21+
Promise<{
22+
fingerprint: string;
23+
modelsJsonHash: string | null;
24+
result: { agentDir: string; wrote: boolean };
25+
}>
826
>;
927
};
1028

@@ -17,7 +35,11 @@ export const MODELS_JSON_STATE = (() => {
1735
writeLocks: new Map<string, Promise<void>>(),
1836
readyCache: new Map<
1937
string,
20-
Promise<{ fingerprint: string; result: { agentDir: string; wrote: boolean } }>
38+
Promise<{
39+
fingerprint: string;
40+
modelsJsonHash: string | null;
41+
result: { agentDir: string; wrote: boolean };
42+
}>
2143
>(),
2244
};
2345
}

src/agents/models-config.fingerprint-cache.test.ts

Lines changed: 45 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -110,18 +110,13 @@ describe("ensureOpenClawModelsJson fingerprint cache", () => {
110110
expect(resolveImplicitProvidersCallCount).toBe(firstCount);
111111
});
112112

113-
it("does not invalidate the cache when auth-profiles volatile fields rotate", async () => {
113+
it("does not invalidate the cache when OAuth session fields rotate", async () => {
114114
const agentDir = await fixtureSuite.createCaseDir("agent");
115115
const cfg = createOpenAiConfig();
116116

117117
await writeAuthProfiles(agentDir, {
118118
version: 1,
119119
profiles: {
120-
"anthropic:default": {
121-
type: "token",
122-
provider: "anthropic",
123-
token: "sk-ant-first-token-value", // pragma: allowlist secret
124-
},
125120
"openai-codex:default": {
126121
type: "oauth",
127122
provider: "openai-codex",
@@ -137,17 +132,13 @@ describe("ensureOpenClawModelsJson fingerprint cache", () => {
137132
const firstCount = resolveImplicitProvidersCallCount;
138133
expect(firstCount).toBe(1);
139134

140-
// Simulate an OAuth token refresh: volatile fields (access/refresh/expires/token)
135+
// Simulate an OAuth token refresh: access/refresh/expires fields
141136
// rotate, but the set of providers the user can use does not change.
137+
// These fields stay in AUTH_PROFILE_VOLATILE_FIELDS.
142138
await new Promise((resolve) => setTimeout(resolve, 10));
143139
await writeAuthProfiles(agentDir, {
144140
version: 1,
145141
profiles: {
146-
"anthropic:default": {
147-
type: "token",
148-
provider: "anthropic",
149-
token: "sk-ant-rotated-token-value", // pragma: allowlist secret
150-
},
151142
"openai-codex:default": {
152143
type: "oauth",
153144
provider: "openai-codex",
@@ -163,6 +154,48 @@ describe("ensureOpenClawModelsJson fingerprint cache", () => {
163154
expect(resolveImplicitProvidersCallCount).toBe(firstCount);
164155
});
165156

157+
it("DOES invalidate the cache when a static type:token credential rotates (Codex/Greptile P2)", async () => {
158+
// Counterpart to the OAuth-rotation test above. Profiles with
159+
// `type: "token"` use the literal `token` key as a long-lived static
160+
// credential. The user rotating this credential must invalidate the
161+
// cache so the implicit-provider-discovery pipeline re-runs against
162+
// the new value (Codex/Greptile P2 on PR #72869: "token" used to be
163+
// in the volatile fields set, masking real auth-state changes).
164+
const agentDir = await fixtureSuite.createCaseDir("agent");
165+
const cfg = createOpenAiConfig();
166+
167+
await writeAuthProfiles(agentDir, {
168+
version: 1,
169+
profiles: {
170+
"anthropic:default": {
171+
type: "token",
172+
provider: "anthropic",
173+
token: "sk-ant-first-token-value", // pragma: allowlist secret
174+
},
175+
},
176+
});
177+
178+
await ensureOpenClawModelsJson(cfg, agentDir);
179+
const firstCount = resolveImplicitProvidersCallCount;
180+
expect(firstCount).toBe(1);
181+
182+
await new Promise((resolve) => setTimeout(resolve, 10));
183+
await writeAuthProfiles(agentDir, {
184+
version: 1,
185+
profiles: {
186+
"anthropic:default": {
187+
type: "token",
188+
provider: "anthropic",
189+
token: "sk-ant-rotated-token-value", // pragma: allowlist secret
190+
},
191+
},
192+
});
193+
194+
await ensureOpenClawModelsJson(cfg, agentDir);
195+
// Static-credential rotation must trigger a re-plan.
196+
expect(resolveImplicitProvidersCallCount).toBe(firstCount + 1);
197+
});
198+
166199
it("invalidates the cache when an auth profile is added or removed", async () => {
167200
const agentDir = await fixtureSuite.createCaseDir("agent");
168201
const cfg = createOpenAiConfig();
Lines changed: 226 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,226 @@
1+
import fs from "node:fs/promises";
2+
import path from "node:path";
3+
import { afterAll, afterEach, beforeAll, describe, expect, it, vi } from "vitest";
4+
import type { OpenClawConfig } from "../config/types.openclaw.js";
5+
import { createFixtureSuite } from "../test-utils/fixture-suite.js";
6+
import {
7+
installModelsConfigTestHooks,
8+
MODELS_CONFIG_IMPLICIT_ENV_VARS,
9+
unsetEnv,
10+
} from "./models-config.e2e-harness.js";
11+
12+
vi.mock("../plugins/manifest-registry.js", () => ({
13+
clearPluginManifestRegistryCache: () => undefined,
14+
loadPluginManifestRegistry: () => ({ plugins: [] }),
15+
}));
16+
17+
vi.mock("./model-auth-env-vars.js", () => ({
18+
listKnownProviderEnvApiKeyNames: () => ["OPENAI_API_KEY"],
19+
PROVIDER_ENV_API_KEY_CANDIDATES: { openai: ["OPENAI_API_KEY"] },
20+
resolveProviderEnvApiKeyCandidates: () => ({ openai: ["OPENAI_API_KEY"] }),
21+
}));
22+
23+
vi.mock("../plugins/provider-runtime.js", () => ({
24+
applyProviderConfigDefaultsWithPlugin: (config: OpenClawConfig) => config,
25+
applyProviderNativeStreamingUsageCompatWithPlugin: () => undefined,
26+
normalizeProviderConfigWithPlugin: () => undefined,
27+
resetProviderRuntimeHookCacheForTest: () => undefined,
28+
resolveProviderConfigApiKeyWithPlugin: () => undefined,
29+
resolveProviderSyntheticAuthWithPlugin: () => undefined,
30+
}));
31+
32+
/**
33+
* Track implicit-provider-discovery invocations so we can verify whether
34+
* the targetProvider short-circuit fired (no call) or fell through to
35+
* full planning (one call per ensureOpenClawModelsJson invocation).
36+
*/
37+
let resolveImplicitProvidersCallCount = 0;
38+
vi.mock("./models-config.providers.js", async () => {
39+
const actual = await vi.importActual<typeof import("./models-config.providers.js")>(
40+
"./models-config.providers.js",
41+
);
42+
return {
43+
...actual,
44+
resolveImplicitProviders: async () => {
45+
resolveImplicitProvidersCallCount += 1;
46+
return {};
47+
},
48+
};
49+
});
50+
51+
let clearConfigCache: typeof import("../config/config.js").clearConfigCache;
52+
let clearRuntimeConfigSnapshot: typeof import("../config/config.js").clearRuntimeConfigSnapshot;
53+
let ensureOpenClawModelsJson: typeof import("./models-config.js").ensureOpenClawModelsJson;
54+
let resetModelsJsonReadyCacheForTest: typeof import("./models-config.js").resetModelsJsonReadyCacheForTest;
55+
56+
const fixtureSuite = createFixtureSuite("openclaw-models-target-provider-");
57+
58+
function createOpenAiConfig(apiKey = "sk-test-static-value"): OpenClawConfig {
59+
return {
60+
models: {
61+
providers: {
62+
openai: {
63+
baseUrl: "https://api.openai.com/v1",
64+
// pragma: allowlist secret
65+
apiKey,
66+
api: "openai-completions" as const,
67+
models: [],
68+
},
69+
},
70+
},
71+
};
72+
}
73+
74+
beforeAll(async () => {
75+
await fixtureSuite.setup();
76+
({ ensureOpenClawModelsJson, resetModelsJsonReadyCacheForTest } =
77+
await import("./models-config.js"));
78+
({ clearConfigCache, clearRuntimeConfigSnapshot } = await import("../config/config.js"));
79+
installModelsConfigTestHooks();
80+
});
81+
82+
afterEach(() => {
83+
clearRuntimeConfigSnapshot();
84+
clearConfigCache();
85+
resetModelsJsonReadyCacheForTest();
86+
resolveImplicitProvidersCallCount = 0;
87+
unsetEnv([...MODELS_CONFIG_IMPLICIT_ENV_VARS]);
88+
});
89+
90+
afterAll(async () => {
91+
await fixtureSuite.cleanup();
92+
});
93+
94+
/**
95+
* Six tests for the targetProvider short-circuit semantics on PR #72869
96+
* (Greptile P2 + Aisle High #2 + Codex P1).
97+
*
98+
* The short-circuit was previously a "presence-only" check that fired when
99+
* any non-empty credential was on disk for the requested provider. That
100+
* silently bypassed configuration drift (rotated keys, attacker-tampered
101+
* baseUrl/headers/auth). The fix structurally compares disk vs. config
102+
* before short-circuiting and falls through to full planning on any
103+
* mismatch.
104+
*/
105+
describe("ensureOpenClawModelsJson targetProvider short-circuit", () => {
106+
it("hit-on-match: full disk-vs-config match short-circuits planning", async () => {
107+
const agentDir = await fixtureSuite.createCaseDir("agent");
108+
const cfg = createOpenAiConfig();
109+
110+
// First call: cold start, must run plan and write models.json.
111+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
112+
expect(resolveImplicitProvidersCallCount).toBe(1);
113+
114+
// Second call with identical config + intact disk state: short-circuit
115+
// path now sees a structural match and returns without re-planning.
116+
resetModelsJsonReadyCacheForTest();
117+
resolveImplicitProvidersCallCount = 0;
118+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
119+
expect(resolveImplicitProvidersCallCount).toBe(0);
120+
});
121+
122+
it("miss-on-rotated-key: config apiKey change forces a full plan", async () => {
123+
const agentDir = await fixtureSuite.createCaseDir("agent");
124+
// pragma: allowlist secret
125+
const cfgOriginal = createOpenAiConfig("sk-test-original-key");
126+
127+
await ensureOpenClawModelsJson(cfgOriginal, agentDir, { targetProvider: "openai" });
128+
expect(resolveImplicitProvidersCallCount).toBe(1);
129+
130+
// Rotate the key in config, simulate a gateway restart (clear in-memory
131+
// cache), and verify the next call falls through to planning instead of
132+
// returning stale on-disk state with the OLD key.
133+
resetModelsJsonReadyCacheForTest();
134+
resolveImplicitProvidersCallCount = 0;
135+
// pragma: allowlist secret
136+
const cfgRotated = createOpenAiConfig("sk-test-rotated-key");
137+
await ensureOpenClawModelsJson(cfgRotated, agentDir, { targetProvider: "openai" });
138+
expect(resolveImplicitProvidersCallCount).toBe(1);
139+
});
140+
141+
it("miss-on-baseUrl-change: tampered disk baseUrl rejects the short-circuit", async () => {
142+
const agentDir = await fixtureSuite.createCaseDir("agent");
143+
const cfg = createOpenAiConfig();
144+
145+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
146+
expect(resolveImplicitProvidersCallCount).toBe(1);
147+
148+
// Simulate an attacker editing models.json to redirect baseUrl to an
149+
// exfiltration endpoint. Clear the in-memory cache (e.g. gateway
150+
// restart) so the short-circuit path is the only thing that could
151+
// trust this disk state.
152+
const targetPath = path.join(agentDir, "models.json");
153+
const raw = await fs.readFile(targetPath, "utf8");
154+
const parsed = JSON.parse(raw);
155+
parsed.providers.openai.baseUrl = "https://attacker.example/v1";
156+
await fs.writeFile(targetPath, JSON.stringify(parsed));
157+
158+
resetModelsJsonReadyCacheForTest();
159+
resolveImplicitProvidersCallCount = 0;
160+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
161+
// Falls through to plan, which will rewrite the file with the correct
162+
// baseUrl from config.
163+
expect(resolveImplicitProvidersCallCount).toBe(1);
164+
});
165+
166+
it("miss-on-tampered-headers: any disk header drift rejects the short-circuit", async () => {
167+
const agentDir = await fixtureSuite.createCaseDir("agent");
168+
const cfg = createOpenAiConfig();
169+
170+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
171+
expect(resolveImplicitProvidersCallCount).toBe(1);
172+
173+
// Inject attacker-supplied headers (e.g. Authorization override) onto
174+
// the disk row. Config has none, so the structural comparison must
175+
// reject this and force a full plan that overwrites with config shape.
176+
const targetPath = path.join(agentDir, "models.json");
177+
const raw = await fs.readFile(targetPath, "utf8");
178+
const parsed = JSON.parse(raw);
179+
parsed.providers.openai.headers = { "X-Injected-Auth": "attacker-token" };
180+
await fs.writeFile(targetPath, JSON.stringify(parsed));
181+
182+
resetModelsJsonReadyCacheForTest();
183+
resolveImplicitProvidersCallCount = 0;
184+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
185+
expect(resolveImplicitProvidersCallCount).toBe(1);
186+
});
187+
188+
it("miss-on-cold-cache: empty in-memory cache + missing disk file forces a plan", async () => {
189+
const agentDir = await fixtureSuite.createCaseDir("agent");
190+
const cfg = createOpenAiConfig();
191+
192+
// No prior writes — disk has no models.json. Even with targetProvider
193+
// set, the short-circuit cannot match against a non-existent file
194+
// and must fall through to the full plan.
195+
resetModelsJsonReadyCacheForTest();
196+
resolveImplicitProvidersCallCount = 0;
197+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
198+
expect(resolveImplicitProvidersCallCount).toBe(1);
199+
});
200+
201+
it("hit-after-warm-fingerprint: identical inputs reuse the in-memory cache without re-running plan", async () => {
202+
const agentDir = await fixtureSuite.createCaseDir("agent");
203+
const cfg = createOpenAiConfig();
204+
205+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
206+
expect(resolveImplicitProvidersCallCount).toBe(1);
207+
208+
// Same config + same disk + warm in-memory cache: both the
209+
// targetProvider short-circuit AND the fingerprint cache hit are
210+
// available. Either way, no re-plan should fire.
211+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
212+
expect(resolveImplicitProvidersCallCount).toBe(1);
213+
214+
// External edit to models.json after the warm cache populated must
215+
// invalidate the in-memory cache via the modelsJsonHash second factor
216+
// (Codex P1: previously the fingerprint alone was the key, so external
217+
// edits silently returned stale cached success).
218+
const targetPath = path.join(agentDir, "models.json");
219+
await fs.writeFile(
220+
targetPath,
221+
JSON.stringify({ providers: { openai: { baseUrl: "https://attacker.example/v1" } } }),
222+
);
223+
await ensureOpenClawModelsJson(cfg, agentDir, { targetProvider: "openai" });
224+
expect(resolveImplicitProvidersCallCount).toBe(2);
225+
});
226+
});

0 commit comments

Comments
 (0)