fix: Handle CLI prompts with trailing text #61
Merged
Conversation
- Remove end-of-line anchors from idle prompt patterns to support trailing text like 'How can I help?' and 'What would you like to do next?'
- Fix response extraction to find first prompt AFTER last green arrow instead of using last prompt in output
- Update error messages to be more descriptive ('Incomplete response' vs 'Empty response')
- Add test coverage for trailing text scenarios
- Apply fixes to q_cli, kiro_cli, and codex providers
Fixes initialization timeout and handoff failures caused by new CLI prompt formats.
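To illustrate the first bullet, here is a minimal sketch of how a trailing `$` anchor rejects prompts that carry trailing text. The pattern body is an assumption made up for this example (the real provider patterns are elided in this PR); only the presence or absence of the anchor matters.

```python
import re


def idle_pattern(agent: str, anchored: bool) -> re.Pattern:
    # Hypothetical prompt shape: "[agent] 16% λ > " with optional usage and lambda.
    body = rf"\[{re.escape(agent)}\]\s*(?:\d+%\s*)?(?:λ\s*)?>\s*"
    # The anchored variant requires that nothing follows the ">" on the line.
    return re.compile(body + ("$" if anchored else ""), re.MULTILINE)


lines = [
    "[agent] > ",
    "[agent] 16% λ > How can I help?",
    "[agent] > What would you like to do next?",
]

old = idle_pattern("agent", anchored=True)
new = idle_pattern("agent", anchored=False)

old_hits = [bool(old.search(line)) for line in lines]
new_hits = [bool(new.search(line)) for line in lines]
print(old_hits, new_hits)
```

Under these assumptions the anchored pattern only recognises the bare prompt, while the unanchored one recognises all three.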
🤖 Assisted by Amazon Q Developer (https://aws.amazon.com/q/developer)
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #61 +/- ##
=======================================
Coverage ? 19.48%
=======================================
Files ? 30
Lines ? 1391
Branches ? 0
=======================================
Hits ? 271
Misses ? 1120
Partials ? 0
haofeif
approved these changes
Feb 5, 2026
haofeif
left a comment
Contributor
Great fix! The regex updates correctly handle the trailing text, and the logic change to find the prompt specifically after the last response marker is a solid
improvement to ensure we're extracting the right content. The new tests cover the edge cases well.
sriharshaarangi
added a commit
to sriharshaarangi/cli-agent-orchestrator
that referenced
this pull request
Feb 8, 2026
The permission_prompt_pattern falsely matched stale 'Allow this action? [y/n/t]:' text in terminal history, causing status to return WAITING_USER_ANSWER instead of IDLE. This blocked inbox message delivery.

Root cause: PR awslabs#61 removed the $ anchor from idle_prompt_pattern. The anchor previously prevented stale [y/n/t]: from matching because the adjacent idle prompt wasn't at end-of-string. Without it, \s* between ]: and the idle prompt bridged across newlines to match stale prompts.

Fix: Replace \s* with [ \t]* so the pattern only bridges spaces/tabs, not newlines. re.DOTALL is intentionally kept: the permission prompt text can wrap across lines in narrow terminals, so .* must span newlines to match [y/n/t]: on a different line than 'Allow this action?'.

Changes:
- kiro_cli.py: \s* -> [ \t]* in permission_prompt_pattern
- q_cli.py: same fix
- test_kiro_cli_unit.py: add test for stale permission prompt rejection
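The difference between `\s*` and `[ \t]*` described in this commit can be reproduced with a small sketch. Both patterns and the sample terminal text are hypothetical stand-ins, not the provider's actual `permission_prompt_pattern`.

```python
import re

# Hypothetical, unanchored idle prompt.
IDLE = r"\[agent\]\s*>"

# \s* may consume a newline between "[y/n/t]:" and the idle prompt.
loose = re.compile(r"Allow this action\?.*\[y/n/t\]:\s*" + IDLE, re.DOTALL)
# [ \t]* only bridges spaces and tabs on the same line.
tight = re.compile(r"Allow this action\?.*\[y/n/t\]:[ \t]*" + IDLE, re.DOTALL)

# An already answered prompt that is still visible in the terminal history.
stale = (
    "Allow this action? [y/n/t]:\n"
    "[agent] > y\n"
    "Running tool...\n"
    "done\n"
    "[agent] > "
)
same_line = "Allow this action? [y/n/t]: [agent] > "

print(bool(loose.search(stale)), bool(tight.search(stale)))
```

In this sketch the loose pattern matches the stale history, while the tight pattern matches only when the idle prompt sits on the same line as `[y/n/t]:`.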
haofeif
pushed a commit
that referenced
this pull request
Feb 8, 2026
The permission_prompt_pattern falsely matched stale 'Allow this action? [y/n/t]:' text in terminal history, causing status to return WAITING_USER_ANSWER instead of IDLE. This blocked inbox message delivery.

Root cause: PR #61 removed the $ anchor from idle_prompt_pattern. The anchor previously prevented stale [y/n/t]: from matching because the adjacent idle prompt wasn't at end-of-string. Without it, \s* between ]: and the idle prompt bridged across newlines to match stale prompts.

Fix: Replace \s* with [ \t]* so the pattern only bridges spaces/tabs, not newlines. re.DOTALL is intentionally kept: the permission prompt text can wrap across lines in narrow terminals, so .* must span newlines to match [y/n/t]: on a different line than 'Allow this action?'.

Changes:
- kiro_cli.py: \s* -> [ \t]* in permission_prompt_pattern
- q_cli.py: same fix
- test_kiro_cli_unit.py: add test for stale permission prompt rejection
sriharshaarangi
added a commit
to sriharshaarangi/cli-agent-orchestrator
that referenced
this pull request
Feb 10, 2026
The combined permission+idle regex pattern failed to detect active permission prompts because kiro-cli always renders the idle prompt on the next line after [y/n/t]:. PR awslabs#61 removed the $ anchor from the idle pattern (to support trailing text), which caused stale prompts to match. The workaround of restricting \s* to [ \t]* broke active prompt detection since the newline could no longer be bridged.

Replace the combined regex with line-based counting: find the last [y/n/t]: in output, split by \n, count the lines containing an idle prompt. If <=1, the prompt is active (WAITING_USER_ANSWER). If >1, the user answered and the agent continued (stale, fall through to normal detection).

Line-based counting handles \r (carriage return) redraws correctly: kiro-cli redraws the same prompt line multiple times without \n, which would create multiple regex matches but counts as 1 line.

Tested against all cases from 605 real terminal logs:
- P1: Empty prompt, unanswered
- P2: Trailing text on idle prompt, unanswered
- P3/P4: CAO injection delivered during active prompt
- P5: User answered y, agent continued
- P6: User typed long response instead of y/n/t
- P7: kiro-cli re-renders [y/n/t]: per keystroke (15x)
- P8: User typing partial text, no enter yet
- N1-N9: All non-permission states unaffected

Added 24 tests including 3 with real ANSI-coded terminal output reproducing exact byte sequences from production logs (00ce37f3.log, 0895b67b.log, 4d9d97cf.log).
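The line-based counting described in this commit might look roughly like the sketch below. The function name, the idle-prompt regex, and the sample outputs are assumptions for illustration; only the rule itself (last `[y/n/t]:`, split on `\n`, active if at most one idle-prompt line follows) comes from the commit message.

```python
import re

PERMISSION = re.compile(r"\[y/n/t\]:")
# Hypothetical idle prompt: "[agent]", optional status text, then ">".
IDLE = re.compile(r"\[agent\][^\n>]*>")


def permission_prompt_is_active(output: str) -> bool:
    """Return True if the last permission prompt still awaits an answer."""
    matches = list(PERMISSION.finditer(output))
    if not matches:
        return False  # no permission prompt at all
    # Only the text after the LAST [y/n/t]: matters (handles re-renders).
    tail = output[matches[-1].end():]
    # Count lines, not regex matches: \r redraws stay on a single line.
    idle_lines = sum(1 for line in tail.split("\n") if IDLE.search(line))
    return idle_lines <= 1


active = "Allow this action? [y/n/t]:\n[agent] > "
stale = "Allow this action? [y/n/t]:\n[agent] > y\nRunning...\n[agent] > "
redraw = "Allow this action? [y/n/t]:\n[agent] > \r[agent] > h\r[agent] > hi"

print(permission_prompt_is_active(active), permission_prompt_is_active(stale))
```

Counting lines rather than matches is what keeps the `redraw` case active: three prompt renders separated by `\r` still occupy one `\n`-delimited line.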
sriharshaarangi
added a commit
to sriharshaarangi/cli-agent-orchestrator
that referenced
this pull request
Feb 10, 2026
Decouple permission pattern from idle pattern in get_status(). Count idle prompt lines after last [y/n/t]: to distinguish active (<=1 line) from stale (>1 lines) permission prompts.

Previous approaches failed because:
- awslabs#69: [ \t]* can't bridge newlines, active prompts never detected
- awslabs#61: \s* bridges newlines but also matches stale prompts

Line-based counting handles \r redraws correctly (same line, no \n) and uses the LAST [y/n/t]: match to handle re-rendered prompts.

Applied to both kiro_cli.py and q_cli.py. Includes 24 unit tests with real terminal fixtures and 7 integration tests against real kiro-cli (opt-in live view via CAO_TEST_WATCH=1).
sriharshaarangi
added a commit
to sriharshaarangi/cli-agent-orchestrator
that referenced
this pull request
Feb 10, 2026
Decouple permission pattern from idle pattern in get_status(). Count idle prompt lines after last [y/n/t]: to distinguish active (<=1 line) from stale (>1 lines) permission prompts.

Previous approaches failed because:
- awslabs#69: [ \t]* can't bridge newlines, active prompts never detected
- awslabs#61: \s* bridges newlines but also matches stale prompts

Line-based counting handles \r redraws correctly (same line, no \n) and uses the LAST [y/n/t]: match to handle re-rendered prompts.

Applied to both kiro_cli.py and q_cli.py.

Unit tests (24 tests, test_permission_prompt_detection.py):
Category | Our Fix | awslabs#61 | awslabs#69
Active prompts (P1-P4,P8) | 5/5 | 5/5 | 0/5
Stale prompts (P5-P7) | 4/4 | 0/4 | 4/4*
Non-permission (N1-N9) | 8/8 | 8/8 | 8/8
Edge cases (ANSI, multi) | 7/7 | 4/7 | 4/7
Total | 24/24 | 18/24 | 15/24
* awslabs#69 stale tests pass only because prompts are never detected.

Integration tests (7 tests, test_kiro_cli_integration.py, real kiro-cli):
Test | Our Fix | awslabs#61 | awslabs#69
P1/P2 active | PASS | PASS | FAIL (IDLE)
P3/P4 injection | PASS | PASS | FAIL (IDLE)
P5/P6 stale | PASS | FAIL (WAITING_USER_ANSWER) | PASS*
P7 multiple prompts | PASS | FAIL (never completes) | FAIL (not detected)
N4/N5 processing | PASS | PASS | PASS
INIT smoke | PASS | PASS | PASS
QUERY smoke | PASS | PASS | PASS
Total | 7/7 | 5/7 | 4/7
* awslabs#69 P5/P6 passes for wrong reason (nothing ever matches).

Opt-in live terminal view via CAO_TEST_WATCH=1.
haofeif
pushed a commit
that referenced
this pull request
Feb 16, 2026
Decouple permission pattern from idle pattern in get_status(). Count idle prompt lines after last [y/n/t]: to distinguish active (<=1 line) from stale (>1 lines) permission prompts.

Previous approaches failed because:
- #69: [ \t]* can't bridge newlines, active prompts never detected
- #61: \s* bridges newlines but also matches stale prompts

Line-based counting handles \r redraws correctly (same line, no \n) and uses the LAST [y/n/t]: match to handle re-rendered prompts.

Applied to both kiro_cli.py and q_cli.py.

Unit tests (24 tests, test_permission_prompt_detection.py):
Category | Our Fix | #61 | #69
Active prompts (P1-P4,P8) | 5/5 | 5/5 | 0/5
Stale prompts (P5-P7) | 4/4 | 0/4 | 4/4*
Non-permission (N1-N9) | 8/8 | 8/8 | 8/8
Edge cases (ANSI, multi) | 7/7 | 4/7 | 4/7
Total | 24/24 | 18/24 | 15/24
* #69 stale tests pass only because prompts are never detected.

Integration tests (7 tests, test_kiro_cli_integration.py, real kiro-cli):
Test | Our Fix | #61 | #69
P1/P2 active | PASS | PASS | FAIL (IDLE)
P3/P4 injection | PASS | PASS | FAIL (IDLE)
P5/P6 stale | PASS | FAIL (WAITING_USER_ANSWER) | PASS*
P7 multiple prompts | PASS | FAIL (never completes) | FAIL (not detected)
N4/N5 processing | PASS | PASS | PASS
INIT smoke | PASS | PASS | PASS
QUERY smoke | PASS | PASS | PASS
Total | 7/7 | 5/7 | 4/7
* #69 P5/P6 passes for wrong reason (nothing ever matches).

Opt-in live terminal view via CAO_TEST_WATCH=1.
haofeif
added a commit
that referenced
this pull request
May 2, 2026
… rebuild
Previous hardening commit wrote sanitisers that CodeQL didn't recognise as
taint-kills because the checks sat *after* Path() construction and
requests.get() received the caller-controlled source string.
- _SAFE_URL_PATH_RE validates parsed.path *before* the fetch; the URL handed
to requests.get() is rebuilt as f"https://{safe_host}{parsed.path}" where
safe_host is pulled from the allowlist literal. Reject query/fragment/
userinfo which have no place in a static .md fetch.
- _FILE_PATH_RE validates the source string *before* Path(source).resolve()
and Path(source).exists() — the fullmatch regex sits on the data-flow
edge into each Path() sink.
- Add a CodeQL job to ci.yml (python + js/ts, security-and-quality suite)
so future SSRF/path-injection regressions fail CI instead of trickling
in as post-merge alerts.
- Add scripts/security-scan.sh for local trivy + codeql runs mirroring CI.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
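The "validate, then rebuild from the allowlist literal" idea in this commit can be sketched as follows. The regex, the function name, and the error messages are hypothetical; the two default hosts are the ones named elsewhere in this thread. The real `_SAFE_URL_PATH_RE` is not shown in the source.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = ("github.com", "raw.githubusercontent.com")
# Hypothetical path shape for a static .md file.
SAFE_URL_PATH_RE = re.compile(r"/[A-Za-z0-9._/-]+\.md")


def rebuild_safe_url(source: str) -> str:
    """Validate a caller-supplied URL and rebuild it from trusted parts."""
    parsed = urlparse(source)
    if parsed.scheme != "https":
        raise ValueError("only https:// sources are allowed")
    # Exact netloc comparison also rejects userinfo ("user@host") and ports.
    safe_host = next(
        (h for h in ALLOWED_HOSTS if h == parsed.netloc.lower()), None
    )
    if safe_host is None:
        raise ValueError("host is not on the allowlist")
    if parsed.query or parsed.fragment or parsed.params:
        raise ValueError("query, fragment and params are not allowed")
    if ".." in parsed.path or not SAFE_URL_PATH_RE.fullmatch(parsed.path):
        raise ValueError("unsafe URL path")
    # The host comes from the allowlist literal, not from the input string.
    return f"https://{safe_host}{parsed.path}"


ok = rebuild_safe_url("https://raw.githubusercontent.com/org/repo/main/agent.md")
print(ok)
```

The design point is that the string passed to the HTTP client is assembled from a literal host plus a validated path, so the original caller-controlled string never reaches the fetch.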
haofeif
added a commit
that referenced
this pull request
May 2, 2026
The previous regex used a character class that included `.` and `/`, so `../../etc/passwd.md` matched and passed into `Path(source).resolve()`. CodeQL was right to flag it; the sanitiser was weaker than advertised.
- Add a leading negative lookahead `(?!.*\.\.)` to the file-path regex so any `..` anywhere in the string rejects the source before Path() is constructed. Legitimate `./foo.md`, `/abs/foo.md`, `~/foo.md`, and `sub/dir/foo.md` all still work.
- Two new regression tests cover `../../etc/passwd.md` and embedded `/tmp/foo/../etc/passwd.md` traversal shapes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
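A minimal sketch of the negative-lookahead guard, assuming a hypothetical character class (the actual `_FILE_PATH_RE` is not shown in this commit message). Only the `(?!.*\.\.)` prefix and the accepted and rejected examples are taken from the text above.

```python
import re

# Hypothetical file-path regex: the lookahead rejects any ".." in the string,
# the character class then admits ordinary relative, absolute and ~ paths.
FILE_PATH_RE = re.compile(r"(?!.*\.\.)[A-Za-z0-9_.~/-]+\.md")


def is_safe_file_source(source: str) -> bool:
    # fullmatch so that the whole string, not just a prefix, is validated.
    return FILE_PATH_RE.fullmatch(source) is not None


accepted = ["./foo.md", "/abs/foo.md", "~/foo.md", "sub/dir/foo.md"]
rejected = ["../../etc/passwd.md", "/tmp/foo/../etc/passwd.md"]

print([is_safe_file_source(s) for s in accepted + rejected])
```

Note that a blanket `..` ban also rejects harmless names such as `foo..md`; the sketch accepts that trade-off for simplicity.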
haofeif
added a commit
that referenced
this pull request
May 2, 2026
#226)

* fix(install): harden agent-profile install against SSRF and path injection

Closes CodeQL py/full-ssrf and py/path-injection alerts on the install path added in #166.
- URL downloads restricted to https:// with a host allowlist (github.com, raw.githubusercontent.com by default; extend via CAO_PROFILE_ALLOWED_HOSTS env var).
- Redirects disabled; explicit is_redirect rejection.
- (5, 30)s connect/read timeout to bound worker exposure.
- Filename / profile-name regex [A-Za-z0-9_-]{1,64} on every sink.
- New allow_file_source kwarg on install_agent(); HTTP API and (transitively) ops-MCP install_profile pass False so remote callers cannot coerce the server into reading arbitrary local files. CLI behaviour unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #60/#61/#62 with pre-Path validation + URL rebuild

Previous hardening commit wrote sanitisers that CodeQL didn't recognise as taint-kills because the checks sat *after* Path() construction and requests.get() received the caller-controlled source string.
- _SAFE_URL_PATH_RE validates parsed.path *before* the fetch; the URL handed to requests.get() is rebuilt as f"https://{safe_host}{parsed.path}" where safe_host is pulled from the allowlist literal. Reject query/fragment/userinfo which have no place in a static .md fetch.
- _FILE_PATH_RE validates the source string *before* Path(source).resolve() and Path(source).exists(); the fullmatch regex sits on the data-flow edge into each Path() sink.
- Add a CodeQL job to ci.yml (python + js/ts, security-and-quality suite) so future SSRF/path-injection regressions fail CI instead of trickling in as post-merge alerts.
- Add scripts/security-scan.sh for local trivy + codeql runs mirroring CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #63 + drop conflicting workflow CodeQL job

Two follow-ups to the previous hardening commit:
1. Alert #63 (py/path-injection, install_service.py:235)
   The `elif allow_file_source and _FILE_PATH_RE.fullmatch(source) and Path(source).exists()` guard still tripped the scanner because CodeQL doesn't thread the regex sanitiser through the compound boolean into the Path() call. Fix: dispatch by pure string suffix (`source.endswith(".md")`), with no Path() in install_agent() at all. All path construction happens inside _download_agent(), which already regex-validates before `.resolve()`.
2. The workflow-based `codeql` job conflicted with the repo's existing default-setup CodeQL ("CodeQL analyses from advanced configurations cannot be processed when the default setup is enabled"). Dropped the job and left a comment in ci.yml explaining why; default setup already runs the Analyze (python) / Analyze (js-ts) checks on every PR.
3. SECURITY.md: documented CodeQL coverage, the host allowlist behaviour (`CAO_PROFILE_ALLOWED_HOSTS`), and the scripts/security-scan.sh wrapper.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): reject `..` segments in _FILE_PATH_RE (closes CodeQL #61)

The previous regex used a character class that included `.` and `/`, so `../../etc/passwd.md` matched and passed into `Path(source).resolve()`. CodeQL was right to flag it; the sanitiser was weaker than advertised.
- Add a leading negative lookahead `(?!.*\.\.)` to the file-path regex so any `..` anywhere in the string rejects the source before Path() is constructed. Legitimate `./foo.md`, `/abs/foo.md`, `~/foo.md`, and `sub/dir/foo.md` all still work.
- Two new regression tests cover `../../etc/passwd.md` and embedded `/tmp/foo/../etc/passwd.md` traversal shapes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): move file-path handling out of install_service (closes CodeQL #64)

Earlier rounds kept `Path(user_input)` reachable inside `install_service` behind a regex sanitiser. Every regex shape that still admitted a legitimate CLI path like `./foo.md` also admitted `../../etc/passwd.md` without an unacceptable normalise+prefix-check, so CodeQL kept correctly flagging the `.resolve()` sink.

Structural fix: the shared service doesn't need a file-path branch at all.
- `install_service.install_agent()` now accepts only a bare profile name (`_PROFILE_NAME_RE`) or an https:// URL on the host allowlist.
- `cli/commands/install.py` grows a `_copy_local_profile_to_store()` helper that does the file reading, stem validation, and copy-into-store itself, then calls the service with the bare validated stem.
- `api/main.py` drops the `allow_file_source=False` kwarg; the parameter is gone and the service refuses filesystem paths for every caller.
- Tests: remove the file-path branches from the service suite, move that coverage to the CLI suite (`TestCopyLocalProfileToStore` + integration tests on file-source `cao install` invocations).

Full test suite (`test/ --ignore=test/e2e -m "not integration"`): 1581/1581 pass. End-to-end smoke of `cao install /tmp/smoke-agent.md --provider kiro_cli` verified.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: black reformat test_install.py (extra blank line)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Fixes initialization timeout and handoff failures caused by CLI prompts now including trailing text like "How can I help?" and "What would you like to do next?".
Problem
Q CLI and Kiro CLI recently started adding helpful trailing text to prompts:
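[agent] 16% λ > How can I help?
[agent] > What would you like to do next?
This broke initialization (timeouts) and handoffs.
Root Cause
- Idle prompt patterns ended with an end-of-line anchor ($) that required nothing after >
- start_pos > end_pos, resulting in empty content extraction
Solution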
1. Remove end-of-line anchors from prompt patterns
   rf"\[{agent}\]...$" to rf"\[{agent}\]..."
2. Fix response extraction logic
3. Improve error messages
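Step 2 (take the first prompt after the last response marker) could be sketched as below. The marker and prompt regexes, the function name, and the exact error wording are assumptions for illustration; the providers' real patterns are not shown in this description.

```python
import re

# Hypothetical green-arrow response marker (ANSI green ">" followed by a space).
ARROW = re.compile(r"\x1b\[32m>\x1b\[0m ")
# Hypothetical unanchored idle prompt.
PROMPT = re.compile(r"\[agent\]\s*>")


def extract_last_response(output: str) -> str:
    arrows = list(ARROW.finditer(output))
    if not arrows:
        raise ValueError("No response marker found")
    start = arrows[-1].end()
    # Search forward from the marker: the FIRST prompt after it ends the reply.
    prompt = PROMPT.search(output, start)
    if prompt is None:
        raise ValueError("Incomplete response: no prompt after the last marker")
    return output[start:prompt.start()].strip()


sample = (
    "[agent] > hi\n"
    "\x1b[32m>\x1b[0m Hello there\n"
    "[agent] > How can I help?\n"
)
print(extract_last_response(sample))
```

Because the search starts at the marker's end, the computed end position can never precede the start position, which is the failure mode listed under Root Cause.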
Changes
- src/cli_agent_orchestrator/providers/q_cli.py - Pattern + extraction fix
- src/cli_agent_orchestrator/providers/kiro_cli.py - Pattern + extraction fix
- src/cli_agent_orchestrator/providers/codex.py - Pattern fix
- test/providers/test_q_cli_unit.py - Updated tests + new trailing text tests
- test/providers/test_kiro_cli_unit.py - Updated tests + new trailing text tests
Total: 5 files changed, 109 insertions(+), 11 deletions(-)
Testing
✅ All 86 tests passing
✅ Added 4 new tests for trailing text scenarios
✅ Updated 2 existing tests with correct error messages
✅ Code formatted (black) and type-checked (mypy)
Backward Compatibility
✅ Fully backward compatible - pattern is now MORE permissive
✅ Matches all previous prompt formats plus new ones with trailing text
🤖 Assisted by Amazon Q Developer