fix: match Codex verbose tool logs to Pi#70966
Conversation
🔒 Aisle Security AnalysisWe found 5 potential security issue(s) in this PR:
1. 🟠 Verbose tool output forwarded to user-visible onToolResult without redaction (potential secret leakage)
DescriptionThe Codex app-server event projector now forwards full tool output to the This is risky because New behavior includes:
Unlike tool summary metadata (which goes through Vulnerable code (output forwarded verbatim): this.emitToolResultMessage({
itemId,
text: formatToolOutput(toolName, itemMeta(item), output),
output: true,
});and output sourcing: if (item.type === "commandExecution") {
return item.aggregatedOutput?.trim() || undefined;
}Data flow:
RecommendationApply secret redaction (and ideally stricter gating) to any tool output before emitting it to Suggested changes:
import { redactSensitiveText } from "../../logging/redact.js"; // adjust path
function formatToolOutput(toolName: string, meta: string | undefined, output: string): string {
const redacted = redactSensitiveText(output);
const trimmed = redacted.trim();
...
}
2. 🟡 Sensitive data exposure via JSON-stringified MCP tool error/result emitted to onToolResult
DescriptionThe event projector emits MCP tool call outputs to the This can unintentionally expose sensitive information to any consumer of
Vulnerable code (MCP path): if (item.type === "mcpToolCall") {
if (item.error) {
return stringifyJsonValue(item.error);
}
return item.result ? stringifyJsonValue(item.result) : undefined;
}and: return JSON.stringify(value, null, 2);Because this output is forwarded via RecommendationRedact or minimize MCP tool outputs before emitting them to Options (can be combined):
Example redaction approach: const SENSITIVE_KEYS = new Set([
"authorization","cookie","set-cookie",
"token","access_token","refresh_token",
"api_key","apikey","password","secret",
]);
function redact(value: unknown): unknown {
if (Array.isArray(value)) return value.map(redact);
if (value && typeof value === "object") {
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
out[k] = SENSITIVE_KEYS.has(k.toLowerCase()) ? "[REDACTED]" : redact(v);
}
return out;
}
return value;
}
function stringifyJsonValueSafe(value: unknown): string | undefined {
try {
return JSON.stringify(redact(value), null, 2);
} catch {
return undefined;
}
}Then use 3. 🟡 Terminal escape/control character injection via forwarded tool output in Codex event projector
DescriptionThe Codex app-server event projector forwards raw tool output (including arbitrary program output) into the This affects multiple paths:
If downstream consumers render this text in a terminal/log viewer that interprets ANSI/OSC sequences (or even in some rich UIs), an attacker-controlled tool/output can:
Vulnerable code (raw output embedded into a markdown code fence and emitted): this.emitToolResultMessage({
itemId,
text: formatToolOutput(toolName, undefined, delta),
output: true,
});and return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;RecommendationSanitize/neutralize control sequences before emitting tool output to Options (choose based on desired UX):
import { stripAnsi } from "src/terminal/ansi.js";
function sanitizeToolOutput(s: string): string {
// Remove ANSI escapes, then drop remaining C0/C1 control chars.
const noAnsi = stripAnsi(s);
return noAnsi.replace(/[\u0000-\u001F\u007F-\u009F]/g, "");
}
// usage
const safeDelta = sanitizeToolOutput(delta);
this.emitToolResultMessage({ itemId, text: formatToolOutput(toolName, undefined, safeDelta), output: true });(There is already
Additionally, consider escaping markdown code fences/backticks in 4. 🟡 Potential secret leakage via inferToolMetaFromArgs() tool-arg summarization (insufficient redaction)
Description
While As a result:
Vulnerable code (new behavior): export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
const display = resolveToolDisplay({ name: toolName, args });
return formatToolDetail(display);
}And its new usage for dynamic/MCP tools: return inferToolMetaFromArgs(toolName, item.arguments);RecommendationHarden tool-meta derivation so secrets in args cannot be surfaced:
Example mitigation (mask sensitive keys at the source): function maskSensitiveArgs(args: unknown): unknown {
if (!args || typeof args !== "object") return args;
const out: Record<string, unknown> = { ...(args as Record<string, unknown>) };
for (const k of Object.keys(out)) {
if (/^(apiKey|token|secret|password|passwd|accessToken|refreshToken|authorization)$/i.test(k)) {
out[k] = "***";
}
}
return out;
}
export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
const display = resolveToolDisplay({ name: toolName, args: maskSensitiveArgs(args) });
return formatToolDetail(display);
}This prevents secrets from ever reaching string formatting/redaction, reducing reliance on best-effort regex patterns. 5. 🟡 Markdown injection in tool result summaries/output forwarded via onToolResult
DescriptionThe Codex app-server projector formats tool meta and tool output as markdown and forwards it via
Vulnerable code: return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;RecommendationTreat tool meta/output as untrusted and escape/encode it before embedding in markdown. Options:
function safeInlineCode(value: string): string {
// Escape backticks by wrapping with a longer delimiter.
const ticks = (value.match(/`+/g) ?? []).map((s) => s.length);
const max = ticks.length ? Math.max(...ticks) : 0;
const delim = "`".repeat(Math.max(1, max + 1));
return `${delim}${value}${delim}`;
}
function safeFencedCodeBlock(text: string, lang = "txt"): string {
// Choose a fence longer than any run of backticks in the content.
const ticks = (text.match(/`+/g) ?? []).map((s) => s.length);
const max = ticks.length ? Math.max(...ticks) : 0;
const fence = "`".repeat(Math.max(3, max + 1));
return `${fence}${lang}\n${text}\n${fence}`;
}Then use these helpers when formatting meta and output. Also consider stripping/normalizing control characters and applying length limits to avoid UI abuse/DoS. Analyzed PR: #70966 at commit Last updated on: 2026-04-24T06:41:14Z |
Greptile SummaryThis PR aligns the Codex app-server's verbose tool logging with the Pi harness by emitting formatted tool summaries and full output through Confidence Score: 5/5Safe to merge; changes are well-encapsulated behind verbose flags and best-effort delivery. All new paths are guarded by shouldEmitToolResult/shouldEmitToolOutput checks, deduplication is correct, error handling is best-effort (no turn impact on failure), and the three new test cases cover the key behaviors. No files require special attention. Reviews (1): Last reviewed commit: "fix: match codex verbose tool logs to pi" | Re-trigger Greptile |
e7b16f3 to
52d28d5
Compare
52d28d5 to
7e96d85
Compare
|
Landed, plus a follow-up hardening commit after the PR merged.
Thanks @jalehman! |
* 'main' of https://github.com/openclaw/openclaw: (521 commits) perf: slim reset model selection imports perf: slim directive parse test imports ci(release): clarify npm telegram approval label test: relax live tail readiness checks ci(release): use release approval for npm telegram e2e Project Codex hook notifications into agent events (openclaw#70969) fix: match codex verbose tool logs to pi (openclaw#70966) feat(diagnostics): attach trace context to otel logs (openclaw#70961) docs(faq): tighten FAQ stub pointers and Related cards docs(help): rewrite help index to match new tab structure, drop redundant troubleshooting H1 refactor(release): distill npm telegram docker runner ci(release): gate npm telegram e2e by release team ci(release): harden npm telegram beta e2e test(release): address npm telegram e2e review test(release): avoid docker argv secret values ci(release): add manual npm telegram beta e2e test(release): support convex npm telegram credentials docs: clarify private ws node setup fix: filter gateway node list locally feat(plugins): move Bonjour discovery into bundled plugin ...
What
Match Codex app-server verbose tool logging to the Pi harness formatting path, including emoji/tool labels, markdown detail rendering, and
/verbose fulltool output delivery.Why
Codex harness verbose logs were plain and often displayed lifecycle statuses like
inProgressorcompletedinstead of useful tool details. This made Codex-backed channel logs noticeably worse than Pi-backed logs.Changes
/verbose fulltool outputTesting
mise x -- pnpm docs:listmise x -- pnpm exec oxfmt --check CHANGELOG.md docs/plugins/sdk-subpaths.md src/plugin-sdk/agent-harness-runtime.ts extensions/codex/src/app-server/event-projector.ts extensions/codex/src/app-server/event-projector.test.tsmise x -- pnpm test extensions/codex/src/app-server/event-projector.test.tsmise x -- pnpm plugin-sdk:api:checkmise x -- pnpm tsgo:allmise x -- pnpm lintmise x -- pnpm buildgit diff --check