fix: match Codex verbose tool logs to Pi by jalehman · Pull Request #70966 · openclaw/openclaw

jalehman · 2026-04-24T06:22:18Z

What

Match Codex app-server verbose tool logging to the Pi harness formatting path, including emoji/tool labels, markdown detail rendering, and /verbose full tool output delivery.

Why

Codex harness verbose logs were plain and often displayed lifecycle statuses like inProgress or completed instead of useful tool details. This made Codex-backed channel logs noticeably worse than Pi-backed logs.

Changes

Emit Codex verbose tool summaries
Emit /verbose full tool output
Use Pi-style tool formatting helpers
Infer detail from tool arguments
Stop displaying lifecycle status
Cover status-regression cases

Testing

mise x -- pnpm docs:list
mise x -- pnpm exec oxfmt --check CHANGELOG.md docs/plugins/sdk-subpaths.md src/plugin-sdk/agent-harness-runtime.ts extensions/codex/src/app-server/event-projector.ts extensions/codex/src/app-server/event-projector.test.ts
mise x -- pnpm test extensions/codex/src/app-server/event-projector.test.ts
mise x -- pnpm plugin-sdk:api:check
mise x -- pnpm tsgo:all
mise x -- pnpm lint
mise x -- pnpm build
git diff --check

aisle-research-bot · 2026-04-24T06:22:24Z

🔒 Aisle Security Analysis

We found 5 potential security issue(s) in this PR:

#	Severity	Title
1	🟠 High	Verbose tool output forwarded to user-visible onToolResult without redaction (potential secret leakage)
2	🟡 Medium	Sensitive data exposure via JSON-stringified MCP tool error/result emitted to onToolResult
3	🟡 Medium	Terminal escape/control character injection via forwarded tool output in Codex event projector
4	🟡 Medium	Potential secret leakage via inferToolMetaFromArgs() tool-arg summarization (insufficient redaction)
5	🟡 Medium	Markdown injection in tool result summaries/output forwarded via onToolResult

1. 🟠 Verbose tool output forwarded to user-visible onToolResult without redaction (potential secret leakage)

Property	Value
Severity	High
CWE	CWE-200
Location	`extensions/codex/src/app-server/event-projector.ts:543-561`

Description

The Codex app-server event projector now forwards full tool output to the onToolResult callback when verbose output is enabled (verboseLevel === "full" or shouldEmitToolOutput() returns true).

This is risky because onToolResult is wired to user-facing streaming/delivery in the agent runner (agent-runner-execution.ts), so these messages can end up in chat channels.

New behavior includes:

commandExecution.aggregatedOutput (shell stdout/stderr)
dynamicToolCall.contentItems[].text (e.g., file reads like read returning file contents)
mcpToolCall.result / mcpToolCall.error (can contain stack traces, credentials, internal data)
incremental output deltas from item/commandExecution/outputDelta and item/fileChange/outputDelta

Unlike tool summary metadata (which goes through redactToolDetail inside inferToolMetaFromArgs), tool output is inserted verbatim into a markdown code block and delivered. This can leak secrets such as API keys printed by commands, .env file contents, private keys, access tokens in error payloads, etc.

Vulnerable code (output forwarded verbatim):

this.emitToolResultMessage({
  itemId,
  text: formatToolOutput(toolName, itemMeta(item), output),
  output: true,
});

and output sourcing:

if (item.type === "commandExecution") {
  return item.aggregatedOutput?.trim() || undefined;
}

Data flow:

source: tool outputs from Codex items (aggregatedOutput, contentItems, mcp result/error, delta events)
sink: this.params.onToolResult?.({ text: ... }) (ultimately sent to end users via auto-reply tool streaming)
missing protection: no redaction/sanitization of secrets in outputs; only meta redaction exists

Recommendation

Apply secret redaction (and ideally stricter gating) to any tool output before emitting it to onToolResult.

Suggested changes:

Redact sensitive patterns (reuse the existing redaction utilities used for tool details/logging) before formatting:

import { redactSensitiveText } from "../../logging/redact.js"; // adjust path

function formatToolOutput(toolName: string, meta: string | undefined, output: string): string {
  const redacted = redactSensitiveText(output);
  const trimmed = redacted.trim();
  ...
}

Consider restricting verbose/full tool output to trusted contexts only (e.g., owner/admin sessions), not merely a request flag.
Consider adding a hard size cap and/or allow-list of tools whose output may be streamed to users.
Ensure downstream delivery (onToolResult) is treated as user-visible and is never fed raw secrets or file contents without explicit authorization.

2. 🟡 Sensitive data exposure via JSON-stringified MCP tool error/result emitted to onToolResult

Property	Value
Severity	Medium
CWE	CWE-201
Location	`extensions/codex/src/app-server/event-projector.ts:880-916`

Description

The event projector emits MCP tool call outputs to the onToolResult callback when verboseLevel === "full" (or when shouldEmitToolOutput() returns true). For MCP tool calls, itemOutputText() converts item.error or item.result to a pretty-printed JSON string using JSON.stringify(value, null, 2) and includes it in the emitted tool output message.

This can unintentionally expose sensitive information to any consumer of onToolResult (typically UI clients/logs):

item.error may include stack traces, internal file paths, internal endpoints, request metadata, etc.
item.result may include structured responses containing secrets (access tokens, API keys, session cookies, Authorization headers), depending on the MCP server/tool.
No redaction/allowlisting is applied before emission.

Vulnerable code (MCP path):

if (item.type === "mcpToolCall") {
  if (item.error) {
    return stringifyJsonValue(item.error);
  }
  return item.result ? stringifyJsonValue(item.result) : undefined;
}

and:

return JSON.stringify(value, null, 2);

Because this output is forwarded via onToolResult (best-effort) it can end up in logs or user-visible traces, increasing risk of credential leakage.

Recommendation

Redact or minimize MCP tool outputs before emitting them to onToolResult.

Options (can be combined):

Allowlist fields to show (preferred): emit only high-level status and a short message.
Redact known sensitive keys recursively before stringifying (e.g. authorization, cookie, set-cookie, token, access_token, refresh_token, api_key, password, secret).
Truncate large outputs and avoid including stack traces by default.

Example redaction approach:

const SENSITIVE_KEYS = new Set([
  "authorization","cookie","set-cookie",
  "token","access_token","refresh_token",
  "api_key","apikey","password","secret",
]);

function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = SENSITIVE_KEYS.has(k.toLowerCase()) ? "[REDACTED]" : redact(v);
    }
    return out;
  }
  return value;
}

function stringifyJsonValueSafe(value: unknown): string | undefined {
  try {
    return JSON.stringify(redact(value), null, 2);
  } catch {
    return undefined;
  }
}

Then use stringifyJsonValueSafe() (or allowlisting) for item.error/item.result before emitting.

3. 🟡 Terminal escape/control character injection via forwarded tool output in Codex event projector

Property	Value
Severity	Medium
CWE	CWE-150
Location	`extensions/codex/src/app-server/event-projector.ts:446-457`

Description

The Codex app-server event projector forwards raw tool output (including arbitrary program output) into the onToolResult callback without neutralizing ANSI escape sequences or other control characters.

This affects multiple paths:

Streaming deltas from item/commandExecution/outputDelta and item/fileChange/outputDelta (delta)
Aggregated command output (item.aggregatedOutput)
Tool content text for dynamic tools (contentItems[].text)
Serialized MCP results/errors (JSON.stringify)

If downstream consumers render this text in a terminal/log viewer that interprets ANSI/OSC sequences (or even in some rich UIs), an attacker-controlled tool/output can:

Spoof or obscure log/UI content (e.g., clear screen, move cursor, rewrite previous lines)
Inject clickable terminal hyperlinks (OSC 8) or other escape-driven tricks
Potentially trigger terminal emulation vulnerabilities in some environments

Vulnerable code (raw output embedded into a markdown code fence and emitted):

this.emitToolResultMessage({
  itemId,
  text: formatToolOutput(toolName, undefined, delta),
  output: true,
});

and

return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;

Recommendation

Sanitize/neutralize control sequences before emitting tool output to onToolResult.

Options (choose based on desired UX):

Strip ANSI/control characters for safety (recommended for logs/terminal rendering):

import { stripAnsi } from "src/terminal/ansi.js";

function sanitizeToolOutput(s: string): string {
  // Remove ANSI escapes, then drop remaining C0/C1 control chars.
  const noAnsi = stripAnsi(s);
  return noAnsi.replace(/[\u0000-\u001F\u007F-\u009F]/g, "");
}

// usage
const safeDelta = sanitizeToolOutput(delta);
this.emitToolResultMessage({ itemId, text: formatToolOutput(toolName, undefined, safeDelta), output: true });

(There is already sanitizeTerminalText() in src/terminal/safe-text.ts that can be reused for terminal/log contexts.)

If ANSI output is desired, encode it for display (e.g., replace \u001b with a visible token like ^[), or render with a dedicated ANSI-to-HTML renderer in the UI.

Additionally, consider escaping markdown code fences/backticks in formatToolOutput to prevent output from breaking out of the fenced block when rendered as markdown.

4. 🟡 Potential secret leakage via inferToolMetaFromArgs() tool-arg summarization (insufficient redaction)

Property	Value
Severity	Medium
CWE	CWE-532
Location	`src/plugin-sdk/agent-harness-runtime.ts:95-98`

Description

inferToolMetaFromArgs() derives a user-facing tool detail string from raw tool arguments and is now used by the Codex event projector for dynamicToolCall/mcpToolCall items.

While formatToolDetail() applies redactToolDetail(), that redaction is regex-based and does not reliably cover many common secret encodings, e.g. CLI flags of the form --token=<secret> or summaries that render as token <secret>.

As a result:

Untrusted or user-provided tool args can be converted into a progress/meta string and emitted to onToolResult (tool result summary), potentially exposing credentials to clients/transcripts.
This is a behavioral regression from previously returning item.status (often non-sensitive) for tool meta.

Vulnerable code (new behavior):

export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
  const display = resolveToolDisplay({ name: toolName, args });
  return formatToolDetail(display);
}

And its new usage for dynamic/MCP tools:

return inferToolMetaFromArgs(toolName, item.arguments);

Recommendation

Harden tool-meta derivation so secrets in args cannot be surfaced:

Key-based masking: when summarizing args, automatically mask values for sensitive keys (case-insensitive), e.g. token, apiKey, secret, password, authorization, etc., before formatting into detail.
Extend redaction patterns to cover additional common forms such as --token=<value> and token <value>.
Consider a strict allowlist of keys permitted in meta for untrusted/dynamic tools, defaulting to undefined unless explicitly configured.

Example mitigation (mask sensitive keys at the source):

function maskSensitiveArgs(args: unknown): unknown {
  if (!args || typeof args !== "object") return args;
  const out: Record<string, unknown> = { ...(args as Record<string, unknown>) };
  for (const k of Object.keys(out)) {
    if (/^(apiKey|token|secret|password|passwd|accessToken|refreshToken|authorization)$/i.test(k)) {
      out[k] = "***";
    }
  }
  return out;
}

export function inferToolMetaFromArgs(toolName: string, args: unknown): string | undefined {
  const display = resolveToolDisplay({ name: toolName, args: maskSensitiveArgs(args) });
  return formatToolDetail(display);
}

This prevents secrets from ever reaching string formatting/redaction, reducing reliance on best-effort regex patterns.

5. 🟡 Markdown injection in tool result summaries/output forwarded via onToolResult

Property	Value
Severity	Medium
CWE	CWE-79
Location	`extensions/codex/src/app-server/event-projector.ts:921-934`

Description

The Codex app-server projector formats tool meta and tool output as markdown and forwards it via onToolResult without escaping untrusted content.

Inputs: tool arguments (item.arguments via inferToolMetaFromArgs) and tool outputs (aggregatedOutput, contentItems[].text, MCP result/error) can contain attacker-controlled strings (e.g., user-provided prompts, file contents, remote tool output).
Issue: formatToolOutput() embeds raw output inside a fenced code block and formatToolAggregate(..., { markdown: true }) may include raw meta when it contains backticks.
Impact: A malicious tool output containing ``` (or crafted markdown) can break out of the intended code fence and inject arbitrary markdown into the UI/chat rendering surface (link spoofing/phishing, UI confusion). If any downstream renderer allows unsafe HTML, this can escalate to XSS.

Vulnerable code:

return `${formatToolSummary(toolName, meta)}\n\`\`\`txt\n${trimmed}\n\`\`\``;

Recommendation

Treat tool meta/output as untrusted and escape/encode it before embedding in markdown.

Options:

Safest: don’t emit markdown at all for tool result events; emit structured fields {toolName, meta, output} and let the UI render with proper escaping.
If you must output markdown, escape backticks and fence-breakers:

function safeInlineCode(value: string): string {
  // Escape backticks by wrapping with a longer delimiter.
  const ticks = (value.match(/`+/g) ?? []).map((s) => s.length);
  const max = ticks.length ? Math.max(...ticks) : 0;
  const delim = "`".repeat(Math.max(1, max + 1));
  return `${delim}${value}${delim}`;
}

function safeFencedCodeBlock(text: string, lang = "txt"): string {
  // Choose a fence longer than any run of backticks in the content.
  const ticks = (text.match(/`+/g) ?? []).map((s) => s.length);
  const max = ticks.length ? Math.max(...ticks) : 0;
  const fence = "`".repeat(Math.max(3, max + 1));
  return `${fence}${lang}\n${text}\n${fence}`;
}

Then use these helpers when formatting meta and output.

Also consider stripping/normalizing control characters and applying length limits to avoid UI abuse/DoS.

Analyzed PR: #70966 at commit 7e96d85

_{Last updated on: 2026-04-24T06:41:14Z}

greptile-apps · 2026-04-24T06:24:13Z

Greptile Summary

This PR aligns the Codex app-server's verbose tool logging with the Pi harness by emitting formatted tool summaries and full output through onToolResult, using Pi-style helpers (formatToolAggregate, inferToolMetaFromArgs) and proper deduplication via two new Set trackers. It also stops surfacing raw lifecycle status strings (inProgress, completed) in tool metadata and adds streaming delta output for bash/file-change events under /verbose full.

Confidence Score: 5/5

Safe to merge; changes are well-encapsulated behind verbose flags and best-effort delivery.

All new paths are guarded by shouldEmitToolResult/shouldEmitToolOutput checks, deduplication is correct, error handling is best-effort (no turn impact on failure), and the three new test cases cover the key behaviors.

No files require special attention.

_{Reviews (1): Last reviewed commit: "fix: match codex verbose tool logs to pi" | Re-trigger Greptile}

steipete · 2026-04-24T07:14:04Z

Landed, plus a follow-up hardening commit after the PR merged.

PR merge commit: 925d11d
Follow-up main commit: 50e3698
Local proof: focused Codex/tool-meta tests pass; full pnpm check:changed passed typecheck/lint/import-cycles and all tests except existing main-red extensions/zalo/src/monitor.lifecycle.test.ts (reproduced on current origin/main).
GitHub CI: main CI for 50e36983bb completed green.

Thanks @jalehman!

* 'main' of https://github.com/openclaw/openclaw: (521 commits) perf: slim reset model selection imports perf: slim directive parse test imports ci(release): clarify npm telegram approval label test: relax live tail readiness checks ci(release): use release approval for npm telegram e2e Project Codex hook notifications into agent events (openclaw#70969) fix: match codex verbose tool logs to pi (openclaw#70966) feat(diagnostics): attach trace context to otel logs (openclaw#70961) docs(faq): tighten FAQ stub pointers and Related cards docs(help): rewrite help index to match new tab structure, drop redundant troubleshooting H1 refactor(release): distill npm telegram docker runner ci(release): gate npm telegram e2e by release team ci(release): harden npm telegram beta e2e test(release): address npm telegram e2e review test(release): avoid docker argv secret values ci(release): add manual npm telegram beta e2e test(release): support convex npm telegram credentials docs: clarify private ws node setup fix: filter gateway node list locally feat(plugins): move Bonjour discovery into bundled plugin ...

…ehman

…ehman)

…ehman

…ehman)

…ehman

…ehman)

openclaw-barnacle Bot added docs Improvements or additions to documentation extensions: codex size: M maintainer Maintainer-authored PR labels Apr 24, 2026

jalehman force-pushed the josh/fix-codex-verbose-logging branch from e7b16f3 to 52d28d5 Compare April 24, 2026 06:31

fix: match codex verbose tool logs to pi

7e96d85

jalehman force-pushed the josh/fix-codex-verbose-logging branch from 52d28d5 to 7e96d85 Compare April 24, 2026 06:33

jalehman merged commit 925d11d into openclaw:main Apr 24, 2026
64 checks passed

jalehman deleted the josh/fix-codex-verbose-logging branch April 24, 2026 07:00

steipete added a commit that referenced this pull request Apr 24, 2026

fix: harden codex verbose tool progress (#70966) (thanks @jalehman)

50e3698

github-actions Bot mentioned this pull request Apr 24, 2026

📡 Upstream Digest — 2026-04-24 07:50 UTC curtismercier/openclaw-mods#664

Open

Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026

fix: match codex verbose tool logs to pi (openclaw#70966)

cd5d96e

Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026

fix: harden codex verbose tool progress (openclaw#70966) (thanks @jal…

fe6a2bf

…ehman)

Disaster-Terminator mentioned this pull request Apr 28, 2026

fix(codex): bridge app-server tool stream events #73449

Closed

25 tasks

ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

fix: match codex verbose tool logs to pi (openclaw#70966)

6ae689d

ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026

fix: harden codex verbose tool progress (openclaw#70966) (thanks @jal…

8facf77

…ehman)

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

fix: match codex verbose tool logs to pi (openclaw#70966)

72125e6

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

fix: harden codex verbose tool progress (openclaw#70966) (thanks @jal…

bac0163

…ehman)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: match Codex verbose tool logs to Pi#70966

fix: match Codex verbose tool logs to Pi#70966
jalehman merged 1 commit intoopenclaw:mainfrom
jalehman:josh/fix-codex-verbose-logging

jalehman commented Apr 24, 2026

Uh oh!

aisle-research-bot Bot commented Apr 24, 2026 •

edited

Loading

Description

Recommendation

Description

Recommendation

Description

Recommendation

Description

Recommendation

Description

Recommendation

Uh oh!

greptile-apps Bot commented Apr 24, 2026

Uh oh!

Uh oh!

steipete commented Apr 24, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

jalehman commented Apr 24, 2026

What

Why

Changes

Testing

Uh oh!

aisle-research-bot Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Aisle Security Analysis

Description

Recommendation

Description

Recommendation

Description

Recommendation

Description

Recommendation

Description

Recommendation

Uh oh!

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

Confidence Score: 5/5

Uh oh!

Uh oh!

steipete commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aisle-research-bot Bot commented Apr 24, 2026 •

edited

Loading

steipete commented Apr 24, 2026 •

edited

Loading