fix(agents): normalize mangled tool names and IDs from OpenAI-compati…#39245
fix(agents): normalize mangled tool names and IDs from OpenAI-compati…#39245deepujain wants to merge 7 commits intoopenclaw:mainfrom
Conversation
Greptile SummaryThis PR fixes tool call ID and name mismatches caused by OpenAI-compatible providers (e.g. Kimi) that send Key changes:
Confidence Score: 4/5
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7de7ed80a6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const normalizedName = trimmedName | ||
| ? normalizePrefixedToolName(trimmedName) | ||
| : trimmedName; | ||
| if (rawName !== normalizedName && normalizedName) { | ||
| const renamed = { ...(block as object), name: normalizedName } as typeof block; |
There was a problem hiding this comment.
Redact prefixed sessions_spawn calls before renaming
When a provider emits a prefixed tool name like functions.sessions_spawn, the blockName check has already skipped the sessions_spawn redaction path, and this new branch only renames the tool to sessions_spawn. Because sanitizeToolCallBlock is never invoked here, input.attachments[].content is persisted unredacted for these prefixed calls, which regresses transcript privacy for OpenAI-compatible providers that mangle tool names.
Useful? React with 👍 / 👎.
|
Codex P1: redaction now runs for prefixed tool names (e.g. functions.sessions_spawn). |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9b488bcaaf
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| @@ -73,11 +86,11 @@ export function extractToolResultId( | |||
| ): string | null { | |||
| const toolCallId = (msg as { toolCallId?: unknown }).toolCallId; | |||
| if (typeof toolCallId === "string" && toolCallId) { | |||
| return toolCallId; | |||
| return normalizeMangledToolCallId(toolCallId); | |||
There was a problem hiding this comment.
Preserve raw tool result IDs for pending-state lookups
Normalizing IDs inside extractToolResultId makes downstream code see a different key than what was tracked from assistant tool calls. In installSessionToolResultGuard, pending calls are stored by raw call.id (session-tool-result-state.ts) but removed via extractToolResultId (session-tool-result-guard.ts), so a mangled ID like functions exec:0 now gets looked up as functions.exec:0 and is never deleted. When the next non-tool message arrives, the stale pending entry can trigger a synthetic "missing tool result" even though the real tool result was already written.
Useful? React with 👍 / 👎.
|
This pull request has been automatically marked as stale due to inactivity. |
|
Rebased this onto current main and tightened the normalization path for OpenAI-compatible function-prefixed tool names and IDs. That now covers the mangled functions exec:0 pairing case and keeps functions.sessions_spawn on the attachment-redaction path. Focused oxlint on the touched files is clean, and I also did a module import sanity check for the updated agent helpers. |
9b488bc to
3fb7cc6
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3fb7cc66db
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -87,11 +98,11 @@ export function extractToolResultId( | |||
| ): string | null { | |||
| const toolCallId = (msg as { toolCallId?: unknown }).toolCallId; | |||
| if (typeof toolCallId === "string" && toolCallId) { | |||
| return toolCallId; | |||
| return normalizeMangledToolCallId(toolCallId); | |||
There was a problem hiding this comment.
Keep tool-result IDs raw in extractToolResultId
Normalizing toolCallId here breaks pairing when repairToolUseResultPairing runs on transcripts whose assistant/toolResult IDs are still in the same mangled form (for example functions exec:0 on both sides): that repair pass builds toolCallIds from raw assistant IDs in src/agents/session-transcript-repair.ts, but now compares them to normalized IDs from extractToolResultId, so legitimate results are misclassified as orphans and replaced with synthetic missing results. This can happen in flows that call pairing before ID sanitization (e.g. truncation/compaction paths), so the transcript is corrupted even though a real tool result existed.
Useful? React with 👍 / 👎.
|
Refreshing this thread so stale does not close it out. This branch is rebased on current main, the normalization/redaction fix is in, and the current GitHub checks are green. |
|
Kept tool-result IDs raw in |
9a5b8fc to
c47eb1b
Compare
|
Rebased this onto current main and kept the tool-name and tool-ID normalization fixes intact. Focused tests still pass: pnpm test -- src/agents/tool-call-id.test.ts src/agents/session-transcript-repair.test.ts. The red checks I saw before the rebase were failing in untouched src/agents/model-auth.ts during the merge build, so this should be in a good spot for a fresh CI pass and another look. |
|
The unreachable |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45c6049a6c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (typeof id !== "string" || !id) { | ||
| return id; | ||
| } | ||
| return id.replace(/functions\s+/g, "functions."); |
There was a problem hiding this comment.
Normalize only leading functions-prefix in tool call IDs
normalizeMangledToolCallId currently applies id.replace(/functions\s+/g, "functions."), which rewrites every functions substring, not just the provider prefix this change is meant to repair. Because createOccurrenceAwareResolver uses that normalized string as the key for assistantOccurrences and pendingByRawId, distinct raw IDs can collapse to one key (for example xfunctions exec:0 and xfunctions.exec:0), causing tool results to be matched by queue order instead of true ID and potentially paired to the wrong call.
Useful? React with 👍 / 👎.
45c6049 to
41df72f
Compare
|
Rebased on current main to pick up the talk silence-timeout docs baseline that was failing the runtime-config shard. I verified the expected 700 ms/900 ms docs and help text are present on this head; GitHub checks are rerunning now. |
|
Tightened the tool-call ID normalization so only the expected functions-space prefix is rewritten, avoiding embedded-ID collisions. Focused tests pass locally: pnpm test -- src/agents/tool-call-id.test.ts src/agents/session-transcript-repair.test.ts. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a71a2245c6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const normalizedId = normalizeMangledToolCallId(id); | ||
| used.add(normalizedId); | ||
| const pending = pendingByRawId.get(normalizedId); | ||
| if (pending) { | ||
| pending.push(id); | ||
| pending.push(normalizedId); |
There was a problem hiding this comment.
Keep preserved replay-safe IDs in raw form
When preserveReplaySafeThinkingToolCallIds is enabled, replay-safe assistant turns are intentionally left byte-for-byte unchanged, but this branch now normalizes id to functions.* before storing it in pendingByRawId. If a preserved turn contains a mangled ID like functions exec:0, the assistant block keeps the raw ID while the matching toolResult is later rewritten to the normalized ID, so the pair no longer matches and downstream pairing repair can treat the real result as orphaned and inject a synthetic missing result.
Useful? React with 👍 / 👎.
|
Fixed the remaining docs-sync failures: added the MiniMax M2.7 troubleshooting entry to docs/help/faq.md and corrected the two broken internal docs links reported by check-docs. Verified docs-link-audit now reports broken_links=0; GitHub checks are rerunning on the pushed branch. |
13ed479 to
f17f6f1
Compare
|
Rebased this branch onto current main and tightened the replay-safe tool-call ID handling so preserved signed-thinking IDs stay in their raw provider form while paired tool results still line up. Local check: pnpm test -- src/agents/tool-call-id.test.ts passes. |
|
Fixed the live CI lint failure after the rebase by restoring the assembled-context diagnostic sanitizer case from current main. Local check: pnpm lint --threads=8 passes. |
|
Codex review: needs changes before merge. Summary Reproducibility: yes. source-reproducible. Current main exact-matches raw IDs in the sanitizer and does not strip Next step before merge Security Review findings
Review detailsBest possible solution: Land a narrow agent transcript/tool-call ID normalization with regression coverage and a changelog entry after removing the node exec approval-path change or moving it to a separate explicitly approved PR. Do we have a high-confidence way to reproduce the issue? Yes, source-reproducible. Current main exact-matches raw IDs in the sanitizer and does not strip Is this the best way to solve the issue? No. The normalization approach is the right narrow fix, but this PR is not the best merge shape while it also changes node exec approval routing and omits the required changelog entry. Full review comments:
Overall correctness: patch is incorrect Security concerns:
Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 605e89468ebf. |
9ec4a32 to
d262790
Compare
|
Rebased this branch onto current main and resolved the merge conflict. The assembled-context logging case is already present on main now, so the duplicate CI-fix commit was dropped during the rebase. Local checks: pnpm test -- src/agents/tool-call-id.test.ts and pnpm lint --threads=8 pass. |
|
Fixed the post-rebase CI failures from current main by preserving explicit node workdir through prepare and moving the shared node exec params into a leaf type module to remove the import cycle. Local checks: pnpm test -- src/agents/bash-tools.exec.approval-id.test.ts -t "preserves explicit workdir for node exec|does not forward the gateway default cwd", pnpm check:madge-import-cycles, pnpm test -- src/agents/tool-call-id.test.ts, and pnpm lint --threads=8 pass. |
|
Follow-up for CI lint: removed the stale node exec type imports after extracting the shared params type. Local check: pnpm lint --threads=8 passes. |
0f62969 to
723038b
Compare
|
Rebased once more onto the latest main after it moved again. Adapted the node exec CI fix to main's new .types.ts split, while keeping the tool-call ID fixes intact. Local checks pass: node workdir test, madge import-cycle check, tool-call ID tests, and pnpm lint --threads=8. |
|
Fixed the remaining node exec test failure by distinguishing explicit host=node workdirs from internal/default workdirs. This keeps full/off node exec on the fast path unless the tool call actually supplied a remote workdir. Local checks: targeted node exec tests, madge import-cycle check, tool-call ID tests, and pnpm lint --threads=8 pass. |
13b16e2 to
749e8bd
Compare
|
Rebased onto latest main again and fixed the contracts-plugins failure by declaring Lobster's new |
|
Fixed the remaining support-boundary failure from latest main by aligning the Docker release-check trust assertion with the current reusable workflow behavior. Local validation: |
010bb43 to
512853b
Compare
|
Rebased once more onto current main after the upstream release-validation follow-up landed, dropping the duplicate local CI-fix commits now present upstream. Local validation: agent/tool-id targeted tests plus |
|
Swept this PR again. Existing CI is green, and the inline Codex comments are addressed on the current head: prefixed |
Summary
functions.exec:0asfunctions exec:0(space instead of dot) and tool names likeexecasfunctions exec. Tool result matching fails, causing "Tool functions exec not found" errors.normalizeMangledToolCallId()intool-call-id.tsto rewritefunctions->functions.in IDs; use it inextractToolResultId()and as the key insanitizeToolCallIdsForCloudCodeAssist()so mangled and canonical IDs map to the same sanitized value. AddednormalizePrefixedToolName()insession-transcript-repair.tsto stripfunctions./functionsfrom tool names; applied in assistant block handling,hasToolCallName, andnormalizeToolResultName()so tool-result names are normalized and pairing works.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Tool-using flows with OpenAI-compatible providers that mangle tool IDs/names (e.g. Kimi) now match tool results to tool calls correctly; "Tool functions exec not found" is resolved.
Security Impact (required)
Repro + Verification
Environment
functions exec:0/functions execstyle IDs and namesSteps
Expected
Tool results pair with assistant tool calls; no spurious "tool not found" from ID/name mangling.
Actual
As expected after the fix.
Evidence
normalizeMangledToolCallIdandextractToolResultId/sanitizeToolCallIdsForCloudCodeAssisttests intool-call-id.test.ts.sanitizeToolUseResultPairingtest with mangled ID and prefixed name insession-transcript-repair.test.ts.Human Verification (required)
Compatibility / Migration
Failure Recovery (if this breaks)
src/agents/tool-call-id.ts,src/agents/session-transcript-repair.ts.Risks and Mitigations
None. Normalization is additive and only affects matching/sanitization; no new execution paths.