fix(config): accept video and audio in models.input schema#39338
fix(config): accept video and audio in models.input schema#39338steipete merged 3 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR has two distinct goals bundled together: (1) extend Key changes:
Confidence Score: 4/5
Last reviewed commit: 7f2be15 |
| // Reverse-lookup: if apiKey looks like a resolved secret value (not an env | ||
| // var name), check whether it matches the canonical env var for this provider. | ||
| // This prevents resolveConfigEnvVars()-resolved secrets from being persisted | ||
| // to models.json as plaintext. (Fixes #38757) | ||
| const currentApiKey = normalizedProvider.apiKey; | ||
| if ( | ||
| typeof currentApiKey === "string" && | ||
| currentApiKey.trim() && | ||
| !ENV_VAR_NAME_RE.test(currentApiKey.trim()) | ||
| ) { | ||
| const envVarName = resolveEnvApiKeyVarName(normalizedKey); | ||
| if (envVarName && process.env[envVarName] === currentApiKey) { | ||
| mutated = true; | ||
| normalizedProvider = { ...normalizedProvider, apiKey: envVarName }; | ||
| } | ||
| } |
There was a problem hiding this comment.
Reverse-lookup only covers canonical env vars
The fix correctly handles the case where resolveConfigEnvVars() has expanded a canonical env var reference (e.g. ${OPENAI_API_KEY} → the real key value). However, if the user referenced a non-canonical env var like ${MY_CUSTOM_OPENAI_KEY}, resolveEnvApiKeyVarName(normalizedKey) will return the canonical name (OPENAI_API_KEY), whose value won't match the resolved secret, so the plaintext will still be persisted.
This is worth documenting as a known limitation in the comment, so future maintainers understand the scope of the fix:
// Note: this reverse-lookup only fires for the *canonical* env var
// for the provider (e.g. OPENAI_API_KEY). Secrets resolved from
// non-canonical env vars (e.g. ${MY_CUSTOM_KEY}) are not recovered
// here and will still be written as plaintext.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/models-config.providers.ts
Line: 523-538
Comment:
**Reverse-lookup only covers canonical env vars**
The fix correctly handles the case where `resolveConfigEnvVars()` has expanded a canonical env var reference (e.g. `${OPENAI_API_KEY}` → the real key value). However, if the user referenced a non-canonical env var like `${MY_CUSTOM_OPENAI_KEY}`, `resolveEnvApiKeyVarName(normalizedKey)` will return the canonical name (`OPENAI_API_KEY`), whose value won't match the resolved secret, so the plaintext will still be persisted.
This is worth documenting as a known limitation in the comment, so future maintainers understand the scope of the fix:
```
// Note: this reverse-lookup only fires for the *canonical* env var
// for the provider (e.g. OPENAI_API_KEY). Secrets resolved from
// non-canonical env vars (e.g. ${MY_CUSTOM_KEY}) are not recovered
// here and will still be written as plaintext.
```
How can I resolve this? If you propose a fix, please make it concise.| if ( | ||
| typeof currentApiKey === "string" && | ||
| currentApiKey.trim() && | ||
| !ENV_VAR_NAME_RE.test(currentApiKey.trim()) | ||
| ) { | ||
| const envVarName = resolveEnvApiKeyVarName(normalizedKey); | ||
| if (envVarName && process.env[envVarName] === currentApiKey) { | ||
| mutated = true; | ||
| normalizedProvider = { ...normalizedProvider, apiKey: envVarName }; | ||
| } |
There was a problem hiding this comment.
Possible unintended mutation for explicitly-hardcoded keys
If a user has deliberately hardcoded apiKey: "sk-abc" in their config (i.e. it was never an env var reference), but OPENAI_API_KEY happens to be set to the same value in the environment, the reverse-lookup will silently rewrite the config entry to "OPENAI_API_KEY". This means that if the user later rotates the env var, the formerly-hardcoded key will start resolving to the new value — a potentially surprising behaviour change.
Adding a guard that skips the replacement when the config was loaded from a source that never contained ${…} syntax would eliminate this false-positive, but even a code comment noting this scenario would reduce surprise for future reviewers.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/models-config.providers.ts
Line: 528-537
Comment:
**Possible unintended mutation for explicitly-hardcoded keys**
If a user has deliberately hardcoded `apiKey: "sk-abc"` in their config (i.e. it was never an env var reference), but `OPENAI_API_KEY` happens to be set to the same value in the environment, the reverse-lookup will silently rewrite the config entry to `"OPENAI_API_KEY"`. This means that if the user later rotates the env var, the formerly-hardcoded key will start resolving to the new value — a potentially surprising behaviour change.
Adding a guard that skips the replacement when the config was loaded from a source that never contained `${…}` syntax would eliminate this false-positive, but even a code comment noting this scenario would reduce surprise for future reviewers.
How can I resolve this? If you propose a fix, please make it concise.
xkonjin
left a comment
There was a problem hiding this comment.
Quick review pass:
- Main risk area here is auth/session state and stale credential handling.
- Good to see test coverage move with the code; I’d still make sure it exercises the unhappy path around auth/session state and stale credential handling rather than only the happy path.
- Before merge, I’d smoke-test the behavior touched by models-config.fills-missing-provider-apikey-from-env-var.test.ts, models-config.providers.normalize-keys.test.ts, models-config.providers.ts (+2 more) with malformed input / retry / rollback cases, since that’s where this class of change usually breaks.
|
Thanks for the review @xkonjin, really appreciate the thorough pass. You raise a good point about unhappy-path coverage. The current tests focus on the happy path for both the schema change and the API key reverse-lookup, but we should definitely exercise edge cases like:
I'll add coverage for these edge cases before merge. The auth/session state concern is well-taken -- that's exactly the area where silent failures would be hardest to diagnose in production. |
7f2be15 to
e245163
Compare
e245163 to
3d2e325
Compare
|
Thanks for the feedback @xkonjin — I'll add unhappy-path test coverage for malformed input and stale credential handling. Will push an update shortly. |
|
Working on adding unhappy-path test coverage for malformed input, stale credentials, and retry/rollback scenarios. Will also address the two code comments from greptile. Update coming. |
altaywtf
left a comment
There was a problem hiding this comment.
Thanks for fixing the schema/type validation side of #20721. I found one remaining behavior gap before this is ready.
src/agents/model-catalog.ts:100 still filters configured model input down to text | image | document:
const normalized = input.filter(
(item): item is ModelInputType => item === "text" || item === "image" || item === "document",
);That means configured opt-in provider models still lose audio and video when they flow through loadModelCatalog(), even after this PR makes the config schema accept them.
I verified it with this repro:
node --import tsx --eval 'import { loadModelCatalog } from "./src/agents/model-catalog.ts"; const result = await loadModelCatalog({ useCache: false, config: { models: { providers: { kilocode: { baseUrl: "https://api.kilo.ai/api/gateway/", api: "openai-completions", models: [{ id: "google/gemini-3-pro-preview", name: "Gemini 3 Pro Preview", input: ["text", "image", "video", "audio"], reasoning: true, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, contextWindow: 1048576, maxTokens: 65536 }] } } } } }); console.log(JSON.stringify(result.filter((entry) => entry.provider === "kilocode"), null, 2));'Observed result:
[
{
"id": "google/gemini-3-pro-preview",
"name": "Gemini 3 Pro Preview",
"provider": "kilocode",
"contextWindow": 1048576,
"reasoning": true,
"input": ["text", "image"]
}
]Since models.list and other callers consume loadGatewayModelCatalog() / loadModelCatalog(), the new capabilities are still lost on that path.
I’d recommend widening ModelInputType and normalizeConfiguredModelInput() in src/agents/model-catalog.ts too, and adding a regression test in src/agents/model-catalog.test.ts that asserts configured provider models retain audio / video through loadModelCatalog().
|
Codex automated review: keeping this open. Keep open: current Best possible solution: Continue tracking this item until the missing behavior is implemented or a maintainer decides the product direction. What I checked:
Remaining risk / open question:
Codex Review notes: reviewed against 017252e4f8e7. |
3d2e325 to
70d6616
Compare
|
Thanks for the review. I fixed the remaining catalog path: configured provider catalog entries now preserve Verified locally:
|
ae23c25 to
ee15ca1
Compare
The models.providers[].models[].input zod schema only accepted 'text' and 'image', rejecting 'video' and 'audio' values. This blocked users from declaring Gemini native multimodal capabilities, even though the runtime already supports these media types (MAX_VIDEO_BYTES, MAX_AUDIO_BYTES, mediaKindFromMime). Align the config schema and TypeScript type with the existing MediaUnderstandingCapabilitiesSchema which already accepts all four media types. Fixes openclaw#20721 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ee15ca1 to
5ed830a
Compare
Summary
"video"and"audio"to themodels.providers[].models[].inputzod schema and TypeScript type"text"and"image"were accepted, blocking Gemini native multimodal capabilitiesMAX_VIDEO_BYTES,MAX_AUDIO_BYTES,mediaKindFromMime())MediaUnderstandingCapabilitiesSchemawhich already accepts all four typesFixes #20721
Test plan
"input": ["text", "image", "video"]passes validation"input": ["text", "image"]configs still workopenclaw doctorwith video/audio model input declarations🤖 Generated with Claude Code