Skip to content

fix: match Codex verbose tool logs to Pi#70966

Merged
jalehman merged 1 commit intoopenclaw:mainfrom
jalehman:josh/fix-codex-verbose-logging
Apr 24, 2026
Merged

fix: match Codex verbose tool logs to Pi#70966
jalehman merged 1 commit intoopenclaw:mainfrom
jalehman:josh/fix-codex-verbose-logging

Conversation

@jalehman
Copy link
Copy Markdown
Contributor

What

Match Codex app-server verbose tool logging to the Pi harness formatting path, including emoji/tool labels, markdown detail rendering, and /verbose full tool output delivery.

Why

Codex harness verbose logs were plain and often displayed lifecycle statuses like inProgress or completed instead of useful tool details. This made Codex-backed channel logs noticeably worse than Pi-backed logs.

Changes

  • Emit Codex verbose tool summaries
  • Emit /verbose full tool output
  • Use Pi-style tool formatting helpers
  • Infer detail from tool arguments
  • Stop displaying lifecycle status
  • Cover status-regression cases

Testing

  • mise x -- pnpm docs:list
  • mise x -- pnpm exec oxfmt --check CHANGELOG.md docs/plugins/sdk-subpaths.md src/plugin-sdk/agent-harness-runtime.ts extensions/codex/src/app-server/event-projector.ts extensions/codex/src/app-server/event-projector.test.ts
  • mise x -- pnpm test extensions/codex/src/app-server/event-projector.test.ts
  • mise x -- pnpm plugin-sdk:api:check
  • mise x -- pnpm tsgo:all
  • mise x -- pnpm lint
  • mise x -- pnpm build
  • git diff --check

@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot Bot commented Apr 24, 2026

🔒 Aisle Security Analysis

We found 5 potential security issue(s) in this PR:

# Severity Title
1 🟠 High Verbose tool output forwarded to user-visible onToolResult without redaction (potential secret leakage)
2 🟡 Medium Sensitive data exposure via JSON-stringified MCP tool error/result emitted to onToolResult
3 🟡 Medium Terminal escape/control character injection via forwarded tool output in Codex event projector
4 🟡 Medium Potential secret leakage via inferToolMetaFromArgs() tool-arg summarization (insufficient redaction)
5 🟡 Medium Markdown injection in tool result summaries/output forwarded via onToolResult
1. 🟠 Verbose tool output forwarded to user-visible onToolResult without redaction (potential secret leakage)
Property Value
Severity High
CWE CWE-200
Location extensions/codex/src/app-server/event-projector.ts:543-561

Description

The Codex app-server event projector now forwards full tool output to the onToolResult callback when verbose output is enabled (verboseLevel === "full" or shouldEmitToolOutput() returns true).

This is risky because onToolResult is wired to user-facing streaming/delivery in the agent runner (agent-runner-execution.ts), so these messages can end up in chat channels.

New behavior includes:

  • commandExecution.aggregatedOutput (shell stdout/stderr)
  • dynamicToolCall.contentItems[].text (e.g., file reads like read returning file contents)
  • mcpToolCall.result / mcpToolCall.error (can contain stack traces, credentials, internal data)
  • incremental output deltas from item/commandExecution/outputDelta and item/fileChange/outputDelta

Unlike tool summary metadata (which goes through redactToolDetail inside inferToolMetaFromArgs), tool output is inserted verbatim into a markdown code block and delivered. This can leak secrets such as API keys printed by commands, .env file contents, private keys, access tokens in error payloads, etc.

Vulnerable code (output forwarded verbatim):

this.emitToolResultMessage({
  itemId,
  text: formatToolOutput(toolName, itemMeta(item), output),
  output: true,
});

and output sourcing:

if (item.type === "commandExecution") {
  return item.aggregatedOutput?.trim() || undefined;
}

Data flow:

  • source: tool outputs from Codex items (aggregatedOutput, contentItems, mcp result/error, delta events)
  • sink: this.params.onToolResult?.({ text: ... }) (ultimately sent to end users via auto-reply tool streaming)
  • missing protection: no redaction/sanitization of secrets in outputs; only meta redaction exists

Recommendation

Apply secret redaction (and ideally stricter gating) to any tool output before emitting it to onToolResult.

Suggested changes:

  1. Redact sensitive patterns (reuse the existing redaction utilities used for tool details/logging) before formatting:
import { redactSensitiveText } from "../../logging/redact.js"; // adjust path

function formatToolOutput(toolName: string, meta: string | undefined, output: string): string {
  const redacted = redactSensitiveText(output);
  const trimmed = redacted.trim();
  ...
}
  1. Consider restricting verbose/full tool output to trusted contexts only (e.g., owner/admin sessions), not merely a request flag.

  2. Consider adding a hard size cap and/or allow-list of tools whose output may be streamed to users.

  3. Ensure downstream delivery (onToolResult) is treated as user-visible and is never fed raw secrets or file contents without explicit authorization.

2. 🟡 Sensitive data exposure via JSON-stringified MCP tool error/result emitted to onToolResult
Property Value
Severity Medium
CWE CWE-201
Location extensions/codex/src/app-server/event-projector.ts:880-916

Description

The event projector emits MCP tool call outputs to the onToolResult callback when verboseLevel === "full" (or when shouldEmitToolOutput() returns true). For MCP tool calls, itemOutputText() converts item.error or item.result to a pretty-printed JSON string using JSON.stringify(value, null, 2) and includes it in the emitted tool output message.

This can unintentionally expose sensitive information to any consumer of onToolResult (typically UI clients/logs):

  • item.error may include stack traces, internal file paths, internal endpoints, request metadata, etc.
  • item.result may include structured responses containing secrets (access tokens, API keys, session cookies, Authorization headers), depending on the MCP server/tool.
  • No redaction/allowlisting is applied before emission.

Vulnerable code (MCP path):

if (item.type === "mcpToolCall") {
  if (item.error) {
    return stringifyJsonValue(item.error);
  }
  return item.result ? stringifyJsonValue(item.result) : undefined;
}

and:

return JSON.stringify(value, null, 2);

Because this output is forwarded via onToolResult (best-effort) it can end up in logs or user-visible traces, increasing risk of credential leakage.

Recommendation

Redact or minimize MCP tool outputs before emitting them to onToolResult.

Options (can be combined):

  1. Allowlist fields to show (preferred): emit only high-level status and a short message.
  2. Redact known sensitive keys recursively before stringifying (e.g. authorization, cookie, set-cookie, token, access_token, refresh_token, api_key, password, secret).
  3. Truncate large outputs and avoid including stack traces by default.

Example redaction approach:

const SENSITIVE_KEYS = new Set([
  "authorization","cookie","set-cookie",
  "token","access_token","refresh_token",
  "api_key","apikey","password","secret",
]);

function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = SENSITIVE_KEYS.has(k.toLowerCase()) ? "[REDACTED]" : redact(v);
    }
    return out;
  }
  return value;
}

function stringifyJsonValueSafe(value: unknown): string | undefined {
  try {
    return JSON.stringify(redact(value), null, 2);
  } catch {
    return undefined;
  }
}

Then use stringifyJsonValueSafe() (or allowlisting) for item.error/item.result before emitting.

3. 🟡 Terminal escape/control character injection via forwarded tool output in Codex event projector
Property Value
Severity Medium
CWE CWE-150
Location extensions/codex/src/app-server/event-projector.ts:446-457

Description

The Codex app-server event projector forwards raw tool output (including arbitrary program output) into the onToolResult callback without neutralizing ANSI escape sequences or other control characters.

This affects multiple paths:

  • Streaming deltas from item/commandExecution/outputDelta and item/fileChange/outputDelta (delta)
  • Aggregated command output (item.aggregatedOutput)
  • Tool content text for dynamic tools (contentItems[].text)
  • Serialized MCP results/errors (JSON.stringify)

If downstream consumers render this text in a terminal/log viewer that interprets ANSI/OSC sequences (or even in some rich UIs), an attacker-controlled tool/output can:

  • Spoof or obscure log/UI content (e.g., clear screen, move cursor, rewrite previous lines)
  • Inject clickable terminal hyperlinks (OSC 8) or other escape-driven tricks
  • Potentially trigger terminal emulation vulnerabilities in some environments

Vulnerable code (raw output embedded into a markdown code fence and emitted):

this.emitToolResultMessage({
  itemId,
  text: formatToolOutput(toolName, undefined, delta),
  output: true,
});

and

return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;

Recommendation

Sanitize/neutralize control sequences before emitting tool output to onToolResult.

Options (choose based on desired UX):

  1. Strip ANSI/control characters for safety (recommended for logs/terminal rendering):
import { stripAnsi } from "src/terminal/ansi.js";

function sanitizeToolOutput(s: string): string {// Remove ANSI escapes, then drop remaining C0/C1 control chars.
  const noAnsi = stripAnsi(s);
  return noAnsi.replace(/[\u0000-\u001F\u007F-\u009F]/g, "");
}// usage
const safeDelta = sanitizeToolOutput(delta);
this.emitToolResultMessage({ itemId, text: formatToolOutput(toolName, undefined, safeDelta), output: true });

(There is already sanitizeTerminalText() in src/terminal/safe-text.ts that can be reused for terminal/log contexts.)

  1. If ANSI output is desired, encode it for display (e.g., replace \u001b with a visible token like ^[), or render with a dedicated ANSI-to-HTML renderer in the UI.

Additionally, consider escaping markdown code fences/backticks in formatToolOutput to prevent output from breaking out of the fenced block when rendered as markdown.

4. 🟡 Potential secret leakage via inferToolMetaFromArgs() tool-arg summarization (insufficient redaction)
Property Value
Severity Medium
CWE CWE-532
Location src/plugin-sdk/agent-harness-runtime.ts:95-98

Description

inferToolMetaFromArgs() derives a user-facing tool detail string from raw tool arguments and is now used by the Codex event projector for dynamicToolCall/mcpToolCall items.

While formatToolDetail() applies redactToolDetail(), that redaction is regex-based and does not reliably cover many common secret encodings, e.g. CLI flags of the form --token=<secret> or summaries that render as token <secret>.

As a result:

  • Untrusted or user-provided tool args can be converted into a progress/meta string and emitted to onToolResult (tool result summary), potentially exposing credentials to clients/transcripts.
  • This is a behavioral regression from previously returning item.status (often non-sensitive) for tool meta.

Vulnerable code (new behavior):

export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
  const display = resolveToolDisplay({ name: toolName, args });
  return formatToolDetail(display);
}

And its new usage for dynamic/MCP tools:

return inferToolMetaFromArgs(toolName, item.arguments);

Recommendation

Harden tool-meta derivation so secrets in args cannot be surfaced:

  1. Key-based masking: when summarizing args, automatically mask values for sensitive keys (case-insensitive), e.g. token, apiKey, secret, password, authorization, etc., before formatting into detail.
  2. Extend redaction patterns to cover additional common forms such as --token=<value> and token <value>.
  3. Consider a strict allowlist of keys permitted in meta for untrusted/dynamic tools, defaulting to undefined unless explicitly configured.

Example mitigation (mask sensitive keys at the source):

function maskSensitiveArgs(args: unknown): unknown {
  if (!args || typeof args !== "object") return args;
  const out: Record<string, unknown> = { ...(args as Record<string, unknown>) };
  for (const k of Object.keys(out)) {
    if (/^(apiKey|token|secret|password|passwd|accessToken|refreshToken|authorization)$/i.test(k)) {
      out[k] = "***";
    }
  }
  return out;
}

export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
  const display = resolveToolDisplay({ name: toolName, args: maskSensitiveArgs(args) });
  return formatToolDetail(display);
}

This prevents secrets from ever reaching string formatting/redaction, reducing reliance on best-effort regex patterns.

5. 🟡 Markdown injection in tool result summaries/output forwarded via onToolResult
Property Value
Severity Medium
CWE CWE-79
Location extensions/codex/src/app-server/event-projector.ts:921-934

Description

The Codex app-server projector formats tool meta and tool output as markdown and forwards it via onToolResult without escaping untrusted content.

  • Inputs: tool arguments (item.arguments via inferToolMetaFromArgs) and tool outputs (aggregatedOutput, contentItems[].text, MCP result/error) can contain attacker-controlled strings (e.g., user-provided prompts, file contents, remote tool output).
  • Issue: formatToolOutput() embeds raw output inside a fenced code block and formatToolAggregate(..., { markdown: true }) may include raw meta when it contains backticks.
  • Impact: A malicious tool output containing ``` (or crafted markdown) can break out of the intended code fence and inject arbitrary markdown into the UI/chat rendering surface (link spoofing/phishing, UI confusion). If any downstream renderer allows unsafe HTML, this can escalate to XSS.

Vulnerable code:

return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;

Recommendation

Treat tool meta/output as untrusted and escape/encode it before embedding in markdown.

Options:

  1. Safest: don’t emit markdown at all for tool result events; emit structured fields {toolName, meta, output} and let the UI render with proper escaping.

  2. If you must output markdown, escape backticks and fence-breakers:

function safeInlineCode(value: string): string {// Escape backticks by wrapping with a longer delimiter.
  const ticks = (value.match(/`+/g) ?? []).map((s) => s.length);
  const max = ticks.length ? Math.max(...ticks) : 0;
  const delim = "`".repeat(Math.max(1, max + 1));
  return `${delim}${value}${delim}`;
}

function safeFencedCodeBlock(text: string, lang = "txt"): string {// Choose a fence longer than any run of backticks in the content.
  const ticks = (text.match(/`+/g) ?? []).map((s) => s.length);
  const max = ticks.length ? Math.max(...ticks) : 0;
  const fence = "`".repeat(Math.max(3, max + 1));
  return `${fence}${lang}\n${text}\n${fence}`;
}

Then use these helpers when formatting meta and output.

Also consider stripping/normalizing control characters and applying length limits to avoid UI abuse/DoS.


Analyzed PR: #70966 at commit 7e96d85

Last updated on: 2026-04-24T06:41:14Z

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation extensions: codex size: M maintainer Maintainer-authored PR labels Apr 24, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR aligns the Codex app-server's verbose tool logging with the Pi harness by emitting formatted tool summaries and full output through onToolResult, using Pi-style helpers (formatToolAggregate, inferToolMetaFromArgs) and proper deduplication via two new Set trackers. It also stops surfacing raw lifecycle status strings (inProgress, completed) in tool metadata and adds streaming delta output for bash/file-change events under /verbose full.

Confidence Score: 5/5

Safe to merge; changes are well-encapsulated behind verbose flags and best-effort delivery.

All new paths are guarded by shouldEmitToolResult/shouldEmitToolOutput checks, deduplication is correct, error handling is best-effort (no turn impact on failure), and the three new test cases cover the key behaviors.

No files require special attention.

Reviews (1): Last reviewed commit: "fix: match codex verbose tool logs to pi" | Re-trigger Greptile

@jalehman jalehman force-pushed the josh/fix-codex-verbose-logging branch from e7b16f3 to 52d28d5 Compare April 24, 2026 06:31
@jalehman jalehman force-pushed the josh/fix-codex-verbose-logging branch from 52d28d5 to 7e96d85 Compare April 24, 2026 06:33
@jalehman jalehman merged commit 925d11d into openclaw:main Apr 24, 2026
64 checks passed
@jalehman jalehman deleted the josh/fix-codex-verbose-logging branch April 24, 2026 07:00
@steipete
Copy link
Copy Markdown
Contributor

steipete commented Apr 24, 2026

Landed, plus a follow-up hardening commit after the PR merged.

  • PR merge commit: 925d11d
  • Follow-up main commit: 50e3698
  • Local proof: focused Codex/tool-meta tests pass; full pnpm check:changed passed typecheck/lint/import-cycles and all tests except existing main-red extensions/zalo/src/monitor.lifecycle.test.ts (reproduced on current origin/main).
  • GitHub CI: main CI for 50e36983bb completed green.

Thanks @jalehman!

zhongmairen pushed a commit to agencyos-cn/openclaw-bad that referenced this pull request Apr 24, 2026
* 'main' of https://github.com/openclaw/openclaw: (521 commits)
  perf: slim reset model selection imports
  perf: slim directive parse test imports
  ci(release): clarify npm telegram approval label
  test: relax live tail readiness checks
  ci(release): use release approval for npm telegram e2e
  Project Codex hook notifications into agent events (openclaw#70969)
  fix: match codex verbose tool logs to pi (openclaw#70966)
  feat(diagnostics): attach trace context to otel logs (openclaw#70961)
  docs(faq): tighten FAQ stub pointers and Related cards
  docs(help): rewrite help index to match new tab structure, drop redundant troubleshooting H1
  refactor(release): distill npm telegram docker runner
  ci(release): gate npm telegram e2e by release team
  ci(release): harden npm telegram beta e2e
  test(release): address npm telegram e2e review
  test(release): avoid docker argv secret values
  ci(release): add manual npm telegram beta e2e
  test(release): support convex npm telegram credentials
  docs: clarify private ws node setup
  fix: filter gateway node list locally
  feat(plugins): move Bonjour discovery into bundled plugin
  ...
Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026
Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs Improvements or additions to documentation extensions: codex maintainer Maintainer-authored PR size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants