Skip to content

fix(codex): handle v0.136+ TUI footer and skip MCP tool-call markers …#274

Merged
haofeif merged 4 commits into
mainfrom
fix/codex-tui-footer-suggestion-hint
Jun 4, 2026
Merged

fix(codex): handle v0.136+ TUI footer and skip MCP tool-call markers …#274
haofeif merged 4 commits into
mainfrom
fix/codex-tui-footer-suggestion-hint

Conversation

@haofeif

@haofeif haofeif commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

…in extraction

Two bugs in CodexProvider, both surfaced by the e2e suite on codex 0.136+:

  1. TUI footer detection — codex 0.136 dropped the "N% left" segment; the footer is now just "model · path". TUI_FOOTER_PATTERN never matched, so get_status() couldn't distinguish the suggestion hint ("› Run /review on my current changes") from a real user message and pinned terminals at IDLE forever. Pattern extended to anchor on "·\s+[~/]".

  2. extract_last_message_from_script() anchored on the first "•" after the user prompt, which can be a "• Called <mcp_server>.(...)" tool-call marker. The lines that follow are tool output, not the model's reply, so when codex called load_skill before responding the extracted message leaked the cao-worker-protocols skill body (which mentions "[CAO Handoff]") into the test output. Now iterates "•" matches and skips MCP_TOOL_CALL_PATTERN. Also tightened ASSISTANT_PREFIX_PATTERN from "\s*•" to "[^\S\n]*•" so the match anchors on the bullet line itself, not a preceding blank line.

Verified end-to-end: 11/12 codex e2e pass (1 xfailed expected), 81/81 unit tests including 3 new regressions for the v0.136 footer and tool-call marker filtering.

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…in extraction

Two bugs in CodexProvider, both surfaced by the e2e suite on codex 0.136+:

1. TUI footer detection — codex 0.136 dropped the "N% left" segment; the
   footer is now just "model · path". TUI_FOOTER_PATTERN never matched, so
   get_status() couldn't distinguish the suggestion hint ("› Run /review on
   my current changes") from a real user message and pinned terminals at
   IDLE forever. Pattern extended to anchor on "·\s+[~/]".

2. extract_last_message_from_script() anchored on the first "•" after the
   user prompt, which can be a "• Called <mcp_server>.<tool>(...)" tool-call
   marker. The lines that follow are tool output, not the model's reply, so
   when codex called load_skill before responding the extracted message
   leaked the cao-worker-protocols skill body (which mentions
   "[CAO Handoff]") into the test output. Now iterates "•" matches and
   skips MCP_TOOL_CALL_PATTERN. Also tightened ASSISTANT_PREFIX_PATTERN
   from "\s*•" to "[^\S\n]*•" so the match anchors on the bullet line
   itself, not a preceding blank line.

Verified end-to-end: 11/12 codex e2e pass (1 xfailed expected), 81/81
unit tests including 3 new regressions for the v0.136 footer and tool-call
marker filtering.
Copilot AI review requested due to automatic review settings June 4, 2026 11:04
Run black on codex.py and the unit test file to satisfy the Code
Quality CI check on PR #274.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Codex provider’s parsing logic to handle Codex v0.136+ TUI output changes and to avoid treating MCP tool-call markers as assistant replies during last-message extraction, with accompanying unit test coverage and changelog updates.

Changes:

  • Extend Codex TUI footer detection to recognize v0.136+ model · path format so status detection doesn’t misclassify footer hint text as user input.
  • Update message extraction to skip • Called <mcp_server>.<tool>(...) tool-call markers so tool output (e.g., skill bodies) doesn’t leak into extracted assistant responses.
  • Add unit tests covering v0.136 footer behavior and tool-call marker filtering; update changelog entry date and fixed items.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
test/providers/test_codex_provider_unit.py Adds regression tests for v0.136 footer parsing and for skipping MCP tool-call markers during extraction.
src/cli_agent_orchestrator/providers/codex.py Updates regex patterns and extraction logic to handle v0.136 footer changes and avoid anchoring on MCP tool-call markers.
CHANGELOG.md Updates the 2.2.0 date and records fixes related to Codex footer detection and tool-call marker filtering.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/cli_agent_orchestrator/providers/codex.py Outdated
…tion

Address Copilot review on PR #274 — two related concerns:

1. MCP_TOOL_CALL_PATTERN was too loose: "^[^\S\n]*•\s+Called\s+\S+"
   matched any "• Called <word>" so a real model bullet like
   "• Called attention to the bug" would be filtered out as a tool call.
   Tightened to require the documented "<server>.<tool>(" shape:
   "^[^\S\n]*•\s+Called\s+[\w-]+\.[\w-]+\(".

2. get_status()'s COMPLETED check searched ASSISTANT_PREFIX_PATTERN
   without filtering tool-call markers. A "• Called <server>.<tool>(...)"
   bullet emitted before the model has actually replied could trip
   COMPLETED on its own. Factored the per-line tool-call filter into a
   _find_assistant_marker() helper and applied it in get_status()
   (both COMPLETED detection and assistant_after_last_user gating) and
   in extract_last_message_from_script().

Tests:
- test_extract_does_not_filter_called_as_english_word — guards against
  over-broad filtering of legitimate model bullets.
- test_get_status_idle_when_only_tool_call_after_user — IDLE when the
  only "•" after the user prompt is a tool-call marker.
- test_get_status_completed_when_real_reply_after_tool_call — COMPLETED
  when a real "•" reply follows a tool-call marker.

84 unit tests pass.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/cli_agent_orchestrator/providers/codex.py Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 11:21
@haofeif haofeif merged commit ccc8a92 into main Jun 4, 2026
22 checks passed
@haofeif haofeif deleted the fix/codex-tui-footer-suggestion-hint branch June 4, 2026 11:23

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

call-me-ram added a commit to call-me-ram/cli-agent-orchestrator that referenced this pull request Jun 4, 2026
Resolve the codex test conflict introduced by awslabs#274 (codex v0.136 TUI
footer + MCP tool-call marker filtering). awslabs#274 added its get_status
regressions against the old tmux-reading contract (mock_tmux.get_history
+ get_status()); adapt them to the event-driven get_status(buffer)
contract this branch introduces. codex.py auto-merged cleanly — the awslabs#274
regex/extraction fixes layer onto buffer-based get_status unchanged.

Full unit suite green: 1948 passed, 1 skipped.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants