feat: Add OpenCode CLI provider support#193

Merged
haofeif merged 33 commits into
awslabs:mainfrom
patricka3125:feat/opencli-integration
Apr 24, 2026

Conversation

@patricka3125

@patricka3125 patricka3125 commented Apr 21, 2026

Collaborator

Overview

Adds OpenCode CLI as a new CAO provider, following the native-first integration philosophy established by the Kiro provider. CAO translates its agent profiles into OpenCode's native agent format at install time and launches terminals via opencode --agent <name>, leveraging OpenCode's built-in tool/permission/model/skill systems rather than reimplementing them in CAO.

Addresses issue #168.

Feature acceptance criteria:

  • New opencode_cli provider with full lifecycle (initialize / get_status / extract_last_message_from_script / exit_cli / cleanup)
  • cao install --provider opencode_cli emits OpenCode-native .md agent files with YAML frontmatter
  • Per-agent MCP gating via opencode.json agent.<name>.tools (default-deny at top level, explicit per-agent re-enable)
  • allowedTools → OpenCode permission: frontmatter translation (always allow/deny, never ask)
  • Config isolation from user's personal OpenCode via OPENCODE_CONFIG + OPENCODE_CONFIG_DIR pointed at ~/.aws/opencode/
  • Native skill delivery via symlink OPENCODE_CONFIG_DIR/skills → SKILLS_DIR (progressive loading, no baked catalog)
  • Model override forwarded via --model CLI flag at launch
  • TUI status detection for IDLE, PROCESSING, COMPLETED, WAITING_USER_ANSWER, ERROR verified against fixture captures
  • Stability env vars wired (OPENCODE_DISABLE_AUTOUPDATE, OPENCODE_DISABLE_MOUSE, OPENCODE_DISABLE_TERMINAL_TITLE, OPENCODE_CLIENT=cao, TERM=xterm-256color)
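
A minimal sketch of how the inline-env launch command could be assembled from the criteria above. The function name `build_launch_command` is hypothetical, and the value `"1"` for the `OPENCODE_DISABLE_*` variables is an assumption (the description lists only their names). The caller is expected to pass an already-expanded config directory, since quoting would prevent `~` expansion.

```python
import shlex
from typing import Optional

# Names come from the PR description; the "1" values are an assumption.
STABILITY_ENV = {
    "OPENCODE_DISABLE_AUTOUPDATE": "1",
    "OPENCODE_DISABLE_MOUSE": "1",
    "OPENCODE_DISABLE_TERMINAL_TITLE": "1",
    "OPENCODE_CLIENT": "cao",
    "TERM": "xterm-256color",
}


def build_launch_command(agent: str, config_dir: str, model: Optional[str] = None) -> str:
    """Build an `ENV=... opencode --agent <name>` shell line (hypothetical helper)."""
    env = {
        # Config isolation: point OpenCode at the CAO-owned directory.
        "OPENCODE_CONFIG": f"{config_dir}/opencode.json",
        "OPENCODE_CONFIG_DIR": config_dir,
        **STABILITY_ENV,
    }
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    argv = ["opencode", "--agent", agent]
    if model:
        # --model is appended only when the profile sets a model.
        argv += ["--model", model]
    return f"{prefix} {shlex.join(argv)}"
```

Calling `build_launch_command("developer", "/home/user/.aws/opencode")` yields a single shell line that can be sent to a tmux pane.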

Key Changes

| File | Change |
| --- | --- |
| src/cli_agent_orchestrator/models/provider.py | Register OPENCODE_CLI = "opencode_cli" enum value |
| src/cli_agent_orchestrator/constants.py | Add OPENCODE_CONFIG_DIR, OPENCODE_AGENTS_DIR, OPENCODE_CONFIG_FILE |
| src/cli_agent_orchestrator/models/opencode_agent.py | New OpenCodeAgentConfig Pydantic model for frontmatter serialization |
| src/cli_agent_orchestrator/utils/opencode_permissions.py | New CAO allowedTools → OpenCode permission: translator (two-step expansion + mapping) |
| src/cli_agent_orchestrator/utils/opencode_config.py | New atomic editor for shared opencode.json (MCP upsert, per-agent tool gating, skills symlink helper) |
| src/cli_agent_orchestrator/providers/opencode_cli.py | New OpenCodeCliProvider — TUI status detection, message extraction, /exit teardown |
| src/cli_agent_orchestrator/providers/manager.py | Register opencode_cli branch in provider factory |
| src/cli_agent_orchestrator/providers/base.py | Minor interface touch-up consumed by the new provider |
| src/cli_agent_orchestrator/services/terminal_service.py | Integration point for the new provider's terminal lifecycle |
| src/cli_agent_orchestrator/cli/commands/install.py | New elif branch: emit agent .md, ensure skills symlink, upsert opencode.json |
| src/cli_agent_orchestrator/cli/commands/launch.py | Add opencode_cli to PROVIDERS_REQUIRING_WORKSPACE_ACCESS |
| test/providers/test_opencode_cli_unit.py | Provider unit tests against TUI fixtures |
| test/providers/fixtures/opencode_cli_*.txt | Plain + ANSI probe captures for every status state |
| test/cli/commands/test_install_opencode.py | Install-branch unit tests (idempotency, MCP wiring, stale-grant eviction, slash-sanitized IDs) |
| test/utils/test_opencode_permissions.py | Permission translator unit tests |
| test/utils/test_opencode_config.py | opencode.json editor unit tests |
| test/models/test_opencode_agent.py | OpenCodeAgentConfig serialization tests |
| test/e2e/conftest.py, test/e2e/test_assign.py | require_opencode fixture + OpenCode variant of assign/handoff e2e |
| test/services/test_terminal_service_full.py, test/test_constants.py | Updates to cover new provider wiring |
| docs/opencode-cli.md | New provider docs (prereqs, launch examples, permission/MCP mapping, troubleshooting) |
| README.md | Add opencode_cli row to provider table |
| CHANGELOG.md | "Unreleased" entry announcing the provider |

Non-Goals

Scoped out intentionally:

  • opencode run (single-shot) integration — CAO requires a persistent REPL; TUI is the only fit.
  • opencode serve / attach / acp transports — do not fit CAO's tmux-centric architecture. Documented as possible future adapters.
  • Session resumption across CAO restarts (--continue / --session) — CAO's model is fresh-terminal-per-agent.
  • Project-local opencode.json override handling — OpenCode's merge precedence lets a project-local file override the CAO-owned config. Out of scope to detect or warn; documented as a known constraint.
  • Concurrent cao install --provider opencode_cli writers — the shared opencode.json can race on parallel installs. Sequential installs are safe; file locking is a ~5-line fix deferred until observed.
  • MCP server name collisions — two agents declaring the same MCP name with different commands: second install silently overwrites. Assumes users keep MCP names globally consistent across profiles, matching Kiro/Q behavior.
  • Skills symlink repair of user-owned state — if the target path exists as a non-symlink, the helper logs and no-ops rather than stomping user data.
  • Windows symlink support — Linux and macOS only; revisit if Windows target is added.
  • ask permission emission — CAO owns the permission decision at install time; OpenCode's △ Permission required UI is bypassed to keep automated flows from stalling in WAITING_USER_ANSWER.
  • Baked skill catalog in the system prompt — skills reach OpenCode agents via native skill tool discovery, not compose_agent_prompt.

Test Plan

  • uv run black src/ test/
  • uv run isort src/ test/
  • uv run mypy src/ (strict)
  • uv run pytest test/ --ignore=test/e2e --ignore=test/providers/test_q_cli_integration.py -v — full unit suite
  • uv run pytest test/providers/test_opencode_cli_unit.py -v — provider status detection against every fixture (IDLE splash, IDLE post-completion, PROCESSING, COMPLETED, WAITING_USER_ANSWER), stale esc interrupt → IDLE guard, Thinking: preamble stripping, /exit teardown, 120s initialize() timeout
  • uv run pytest test/cli/commands/test_install_opencode.py -v — install idempotency, MCP upsert correctness, agent-without-MCP path, preservation of user-authored opencode.json entries, stale agent.<id> eviction on reinstall, slash-sanitized agent IDs
  • uv run pytest test/utils/test_opencode_permissions.py -v — every translation-examples case (["*"], ["@builtin"], ["execute_bash", "fs_read"], ["fs_*", "@cao-mcp-server"]), hardcoded non-vocabulary policy (task/question/webfetch/websearch/codesearch deny, todowrite/skill allow)
  • uv run pytest test/utils/test_opencode_config.py -v — fresh-file creation, idempotent re-upsert, auto-mkdir, user-entry preservation, symlink helper (fresh/idempotent/non-symlink-directory)
  • uv run pytest test/models/test_opencode_agent.py -v — OpenCodeAgentConfig round-trip via frontmatter.dumps()
  • uv run pytest -m e2e test/e2e/test_assign.py -k opencode — full assign/handoff/send_message flow with examples/assign/ profiles (supervisor + 3 workers + reporter) against a real opencode binary
  • Manual smoke: cao install developer --provider opencode_cli && cao launch --agents developer --provider opencode_cli reaches IDLE and responds to a prompt
  • Manual verification: OPENCODE_CONFIG=~/.aws/opencode/opencode.json OPENCODE_CONFIG_DIR=~/.aws/opencode opencode agent list shows CAO-installed agents alongside built-ins
  • Manual verification: OpenCode agent's skill tool lists cao-supervisor-protocols and cao-worker-protocols via the symlinked skills/ directory

patricka3125 and others added 12 commits April 20, 2026 21:59
…ider

- Add ProviderType.OPENCODE_CLI = "opencode_cli" to the provider enum
- Add OPENCODE_CONFIG_DIR / OPENCODE_AGENTS_DIR / OPENCODE_CONFIG_FILE path
  constants pointing at ~/.aws/opencode_cli/
- New OpenCodeAgentConfig Pydantic model (description, mode, permission) that
  serializes to OpenCode-compatible YAML frontmatter via frontmatter.dumps()
- New cao_tools_to_opencode_permission() translator: two-step algorithm from §9
  of the design doc (shorthand expansion + CAO-category → OpenCode tool mapping +
  hardcoded non-vocabulary deny/allow policies)
- New opencode_config.py read-modify-write helper for the shared opencode.json
  (upsert_mcp_server, upsert_agent_tools, remove_agent_tools, read_config,
  write_config)
- Port 5 TUI probe captures into test/providers/fixtures/ (plain + ANSI variants
  for idle-splash, idle-post-completion, processing, completed, permission states)
- 54 new unit tests covering all Phase 1 modules; all 1368 tests pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ard-rail

Item 1: Replace opencode_cli_processing.ansi.txt with a genuine PROCESSING frame
        re-captured via tmux probe (md5 9cbe2723, distinct from completed frame).
        Add test/providers/fixtures/OPENCODE_FIXTURES.md documenting all fixture
        sources and the remaining idle_post_completion.ansi.txt reuse.

Item 2: Remove dead Pydantic v1 `class Config: exclude_none = True` block from
        OpenCodeAgentConfig — it is a no-op under Pydantic v2.

Item 3: Add inline comment to OpenCodeAgentConfig.permission documenting the
        deliberate Phase 1 type simplification and when to widen it.

Item 4: Replace unreachable `else: result[tool] = "deny"` in
        opencode_permissions.py with `raise AssertionError(...)` so any future
        tool added to ALL_OPENCODE_TOOLS without a policy update fails loudly.

Item 5: Add test_noop_on_completely_missing_file to TestRemoveAgentTools —
        exercises the read_config() skeleton-return path when opencode.json
        does not exist yet.

All 1369 tests pass; mypy/black/isort clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
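
The two-step translation and the fail-loud guard from Item 4 can be sketched roughly as follows. The hardcoded deny/allow sets are the ones named in the test plan; the `CAO_TO_OPENCODE` table, the shorthand handling, and the function names `expand` and `to_permission` are illustrative assumptions, not the PR's actual mapping.

```python
from fnmatch import fnmatch
from typing import Dict, List, Set

# HYPOTHETICAL category -> OpenCode tool table; the real one is not shown in this PR page.
CAO_TO_OPENCODE: Dict[str, Set[str]] = {
    "execute_bash": {"bash"},
    "fs_read": {"read", "grep", "glob", "list"},
    "fs_write": {"edit"},
}
# Non-vocabulary policy as listed in the test plan.
ALWAYS_DENY = {"task", "question", "webfetch", "websearch", "codesearch"}
ALWAYS_ALLOW = {"todowrite", "skill"}
MAPPED: Set[str] = set().union(*CAO_TO_OPENCODE.values())
ALL_OPENCODE_TOOLS = sorted(MAPPED | ALWAYS_DENY | ALWAYS_ALLOW)


def expand(allowed_tools: List[str]) -> Set[str]:
    """Step 1: expand shorthands into concrete CAO categories (assumed semantics)."""
    categories: Set[str] = set()
    for entry in allowed_tools:
        if entry in ("*", "@builtin"):
            categories |= set(CAO_TO_OPENCODE)
        elif entry.startswith("@"):
            continue  # MCP server reference: assumed to be gated in opencode.json instead
        else:
            categories |= {name for name in CAO_TO_OPENCODE if fnmatch(name, entry)}
    return categories


def to_permission(allowed_tools: List[str]) -> Dict[str, str]:
    """Step 2: map categories to tools, then apply the fixed policies."""
    granted: Set[str] = set()
    for category in expand(allowed_tools):
        granted |= CAO_TO_OPENCODE[category]
    result: Dict[str, str] = {}
    for tool in ALL_OPENCODE_TOOLS:
        if tool in ALWAYS_DENY:
            result[tool] = "deny"
        elif tool in ALWAYS_ALLOW:
            result[tool] = "allow"
        elif tool in MAPPED:
            result[tool] = "allow" if tool in granted else "deny"
        else:
            # A tool added to ALL_OPENCODE_TOOLS without a policy fails loudly.
            raise AssertionError(f"no permission policy for tool {tool!r}")
    return result
```

The output only ever contains `allow` or `deny`, which keeps automated flows from stalling on a permission prompt.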
Adds the `opencode_cli` branch to `cao install` (Phase 2):
- writes agent `<name>.md` with YAML frontmatter (description, mode, permission)
  using compose_agent_prompt for the body and cao_tools_to_opencode_permission
  for the per-tool allow/ask/deny map
- `--auto-approve` flag emits `allow` instead of `ask` for permitted tools;
  has no effect on other providers
- if the agent profile declares mcpServers, upserts top-level mcp/tools entries
  (default-deny) and per-agent tool re-enables into opencode.json
- full unit-test coverage in test/cli/commands/test_install_opencode.py
  (fresh install, idempotency, auto-approve, MCP wiring, config preservation,
  safe filename)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
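
A rough sketch of the default-deny plus per-agent re-enable pattern in the shared `opencode.json`, written as a read-modify-write with an atomic replace. The helper name `gate_mcp_for_agent` and the `<server>_*` tool-key glob are assumptions for illustration; the function bodies are not the PR's code.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def read_config(path: Path) -> Dict[str, Any]:
    """Return the parsed config, or an empty skeleton when the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def write_config(path: Path, config: Dict[str, Any]) -> None:
    """Write via a temp file in the same directory, then atomically replace."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    os.replace(tmp_name, path)


def gate_mcp_for_agent(
    path: Path, agent: str, server: str, server_config: Dict[str, Any]
) -> None:
    """Register an MCP server, deny its tools globally, re-enable them for one agent."""
    config = read_config(path)
    pattern = f"{server}_*"  # ASSUMED glob covering the server's tool names
    config.setdefault("mcp", {})[server] = server_config
    config.setdefault("tools", {})[pattern] = False  # default-deny at top level
    agent_tools = config.setdefault("agent", {}).setdefault(agent, {}).setdefault("tools", {})
    agent_tools[pattern] = True  # explicit per-agent re-enable
    write_config(path, config)
```

Because every step uses `setdefault`, entries that a user added by hand are preserved, and running the helper twice leaves the file unchanged. It does not take a lock, so parallel writers can still race, as noted under Non-Goals.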
- Rename agent_config_oc → agent_config in opencode_cli branch for
  consistency with Kiro/Q/Copilot sibling branches (Item 1)
- Strengthen test_agent_md_has_body: assert sentinel prompt text via
  profile.prompt frontmatter field instead of weak non-empty check (Item 2)
- Bump live smoke-test subprocess timeout 30s → 60s to survive cold-cache
  npm plugin installs on CI (Item 4)

Items 3 (MCP collision coverage already in Phase 1) and 5 (context-file
parent mkdir — out of Phase 2 scope) intentionally not addressed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the opencode_cli runtime provider per §8 of the design doc:
- OpenCodeCliProvider with full BaseProvider interface (initialize,
  get_status, extract_last_message_from_script, exit_cli, cleanup)
- 5-state detection (IDLE/PROCESSING/COMPLETED/WAITING_USER_ANSWER/ERROR)
  with line-level position guard against stale alt-screen esc-interrupt
  remnants (lesson #16)
- COMPLETED vs IDLE-post-completion distinguished by checking for a
  subsequent ▣ token after the last full completion marker
- 120s initialize() timeout for first-run npm install cold-start (§8.2)
- Inline-env launch command with all stability env vars (§5)
- --model flag included only when profile.model is set (§3.1 exception)
- Registered in ProviderManager; "opencode_cli" added to
  PROVIDERS_REQUIRING_WORKSPACE_ACCESS in launch.py
- 43 unit tests at 96% line coverage against Phase 1 fixtures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eport

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… dual-pattern comment

Item 1: development report corrected 125 → 332 lines for opencode_cli.py.
Item 4: inline comment at extract_last_message_from_script explains why the
unanchored r"┃\s{2}" is used instead of the module-level USER_MESSAGE_PATTERN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- test/e2e/conftest.py: add require_opencode fixture (skips if opencode not on PATH)
- test/e2e/test_assign.py: add TestOpenCodeCliAssign with data_analyst, report_generator,
  and assign_with_callback tests covering all four orchestration modes
- docs/opencode-cli.md: new provider doc covering prerequisites, quick start, config
  isolation, permission/tool mapping, MCP wiring, known limitations, troubleshooting
- README.md: add opencode_cli row to provider table + cao launch example
- CHANGELOG.md: add Unreleased entry announcing OpenCode CLI provider

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…json format (Phase 3 regression)

CAO profiles store MCP servers with {type: "stdio", command: str, args: list}.
OpenCode's opencode.json requires {type: "local", command: list, enabled: true}.
The install branch was passing raw CAO config directly, causing OpenCode to reject
the config with "Configuration is invalid: Invalid input mcp.cao-mcp-server".

Fix: add translate_mcp_server_config() to opencode_config.py and call it in the
opencode_cli install branch before upsert_mcp_server(). Also translates env→environment.
6 unit tests added for the translator.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
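
The shape change described in this fix can be sketched as below. The function name matches the one the commit mentions, but the body is an illustration only; in particular, folding `command` and `args` into one list is an assumption inferred from the two formats quoted above.

```python
from typing import Any, Dict


def translate_mcp_server_config(cao_config: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a CAO stdio MCP entry into the local-server form OpenCode accepts."""
    translated: Dict[str, Any] = {
        "type": "local",
        # Assumed: the executable and its arguments become a single argv list.
        "command": [cao_config["command"], *cao_config.get("args", [])],
        "enabled": True,
    }
    if cao_config.get("env"):
        # CAO's `env` key is renamed to `environment`.
        translated["environment"] = dict(cao_config["env"])
    return translated
```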
…ting (Phase 4 review polish)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ix notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@patricka3125 patricka3125 marked this pull request as draft April 21, 2026 06:27
Comment thread docs/opencode-cli.md Outdated
… symlink

At install time, create a skills → SKILLS_DIR symlink under OPENCODE_CONFIG_DIR
so OpenCode auto-discovers CAO skills through its native skill tool (§5.1). Uses
profile.system_prompt or profile.prompt as the lean agent body — the skill catalog
is no longer baked into the OpenCode system prompt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
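
A minimal sketch of a non-destructive symlink helper with the three behaviors this PR describes: create when absent, no-op when already correct, and log without touching anything when the path is user-owned or points elsewhere. The name `ensure_skills_symlink` and the boolean return value are assumptions.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def ensure_skills_symlink(config_dir: Path, skills_dir: Path) -> bool:
    """Return True if config_dir/skills is a symlink to skills_dir when we finish."""
    link = config_dir / "skills"
    if link.is_symlink():
        if link.resolve() == skills_dir.resolve():
            return True  # already correct: idempotent no-op
        logger.warning("%s points elsewhere; leaving it untouched", link)
        return False
    if link.exists():
        # A real file or directory is user-owned state: never overwrite it.
        logger.warning("%s exists and is not a symlink; leaving it untouched", link)
        return False
    config_dir.mkdir(parents=True, exist_ok=True)
    link.symlink_to(skills_dir, target_is_directory=True)
    return True
```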
@haofeif haofeif added the enhancement New feature or request label Apr 21, 2026
Two status/extraction bugs revealed by e2e runs with full system prompts:

1. COMPLETION_MARKER_PATTERN now matches the Nm Ns duration format that
   OpenCode emits for responses that take more than 60 seconds (e.g.
   "1m 8s"). The old pattern only matched the pure-seconds form, causing
   get_status() to stall at PROCESSING indefinitely for longer turns.

2. Add extraction_tail_lines property to BaseProvider (default None) and
   override to 2000 in OpenCodeCliProvider. terminal_service.get_output
   uses this value for the LAST-mode tmux capture so long responses don't
   push the user-message marker (┃  ) beyond the 200-line default window.
   Status-check captures are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
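
For illustration, a pattern that accepts both duration forms might look like the sketch below. It assumes the marker layout `▣ agent · model · <duration>` quoted elsewhere in this PR; the exact regex in `opencode_cli.py` is not shown here, and the optional decimal part on the seconds is an assumption.

```python
import re

# Matches "8s" as well as "1m 8s" at the end of a completion-marker line.
COMPLETION_MARKER = re.compile(
    r"▣\s+\S.*?·\s+(?:\d+m\s+)?\d+(?:\.\d+)?s\s*$",
    re.MULTILINE,
)


def is_completed(frame: str) -> bool:
    """True when the captured frame contains a full completion marker."""
    return COMPLETION_MARKER.search(frame) is not None
```

A marker line that has no duration yet, such as one rendered mid-turn, does not match, so the status would not flip to COMPLETED early.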
@patricka3125 patricka3125 force-pushed the feat/opencli-integration branch from c1356ca to 7c0c224 Compare April 21, 2026 07:03
patricka3125 and others added 6 commits April 21, 2026 00:11
…g-target symlink test

Item 2: Eliminate double capture-pane in get_output(mode=LAST). Previously
the function always captured at 200 lines then recaptured if the provider
declared extraction_tail_lines. Now FULL mode returns after a single capture
at the default depth; LAST mode resolves extract_lines from the provider once
and makes exactly one capture before the retry loop.

Item 1: Add test_warns_and_skips_when_symlink_points_elsewhere to
TestEnsureSkillsSymlink, covering the branch at opencode_config.py:37-42
where the target is a symlink that resolves to a different directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… to opencode

Change OPENCODE_CONFIG_DIR from ~/.aws/opencode_cli to ~/.aws/opencode in
constants.py; OPENCODE_AGENTS_DIR and OPENCODE_CONFIG_FILE update transitively.
Update all path string references in docs, CHANGELOG, and the constants unit test.
Provider identifier (ProviderType.OPENCODE_CLI.value == "opencode_cli") is unchanged.
Add CHANGELOG migration note for users who need to re-run cao install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ge scrolled off viewport

OpenCode renders in alt-screen mode so the tmux scrollback only holds the current
visible frame (~41 lines, history_size≈2). For long responses the user-message bar
(┃  ) scrolls off the top before extraction runs, causing "No user message found".

When no ┃  is found before the completion marker, scan for the first 5-space-indented
agent line as the left boundary instead of raising. The visible frame already contains
only the current turn's content, so multi-turn disambiguation is not needed here.

Adds unit test test_fallback_extracts_when_user_message_scrolled_off.
e2e: 3/3 PASSED in 161s on port 9888.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
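
The fallback boundary logic can be sketched as follows. This is a simplified illustration under the assumptions stated in the commit (single visible turn, `┃` plus two spaces for the user bar, five-space indentation for agent lines); the regexes and the function name `extract_last_message` are not the provider's actual code.

```python
import re

USER_BAR = re.compile(r"┃\s{2}")
MARKER = re.compile(r"▣\s+\S.*·\s+(?:\d+m\s+)?\d+(?:\.\d+)?s\s*$")
AGENT_LINE = re.compile(r"^ {5}\S")


def extract_last_message(frame: str) -> str:
    """Return the agent text between the user bar (or fallback) and the marker."""
    lines = frame.splitlines()
    marker_idx = max(
        (i for i, line in enumerate(lines) if MARKER.search(line)), default=None
    )
    if marker_idx is None:
        raise ValueError("No completion marker found")
    user_idx = max(
        (i for i in range(marker_idx) if USER_BAR.search(lines[i])), default=None
    )
    if user_idx is not None:
        start = user_idx + 1
    else:
        # Fallback: the user bar scrolled off, so start at the first agent line.
        start = next(
            (i for i in range(marker_idx) if AGENT_LINE.match(lines[i])), None
        )
        if start is None:
            raise ValueError("No user message or agent line found")
    body = [line.strip() for line in lines[start:marker_idx]]
    text = "\n".join(body).strip()
    if not text:
        raise ValueError("Empty response")
    return text
```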
…te skill-delivery comments

Cleanup 1: build_skill_catalog() now runs only when the provider is in
RUNTIME_SKILL_PROMPT_PROVIDERS, skipping the file reads/YAML parsing/Pydantic
validation for providers that deliver skills natively (OpenCode symlink, Kiro
skill:// resources) or via install-time baking (Q, Copilot). The skill_prompt
kwarg at the create_provider call site simplifies to skill_prompt=skill_prompt
since the guard now lives one line above.

Cleanup 2: update comments in the RUNTIME_SKILL_PROMPT_PROVIDERS block and
create_terminal Steps 3b/4 to reflect Phase 5's native OpenCode skill discovery.

Adds two new tests asserting the lazy-call invariant:
- test_build_skill_catalog_called_for_runtime_prompt_provider (call_count == 1)
- test_build_skill_catalog_not_called_for_native_or_baked_provider (parametrized
  over opencode_cli, kiro_cli, q_cli, copilot_cli; assert_not_called)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eanup commits

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread src/cli_agent_orchestrator/models/opencode_agent.py Outdated
patricka3125 and others added 3 commits April 21, 2026 01:02
…t + cleanup polish

Item 1: black reformats terminal_service.py line 155 from a three-line expression
to the single 100-char form black prefers.

Item 2: rewrite extraction_tail_lines docstring — the old text claimed responses
push ┃ beyond a 200-line window, which is wrong; OpenCode's alt-screen mode caps
history_size near 2 making the override a no-op. Docstring now accurately describes
the belt-and-braces rationale and cross-references the within-viewport fallback.

Item 3: add single-turn alt-screen assumption comment to the normal extraction path.

Item 4: CHANGELOG migration note gains a rm -rf cleanup hint for pre-release users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…allout

- Update design doc stability-env-vars entry: footer patterns (ctrl+p,
  esc interrupt) are pinned and scroll-safe; the completion marker
  (▣ agent · model · Ns) is conversation content and scrolls off,
  preventing COMPLETED detection if mouse reporting is enabled
- Add 'Scrolling enters tmux copy mode' Known Limitations entry in
  opencode-cli.md explaining the trade-off and how to work around it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread src/cli_agent_orchestrator/cli/commands/install.py Outdated
Comment thread src/cli_agent_orchestrator/providers/base.py Outdated
@patricka3125 patricka3125 changed the title feat: Add OpenCode CLI provider feat: Add OpenCode CLI provider support Apr 21, 2026
@patricka3125 patricka3125 marked this pull request as ready for review April 21, 2026 09:32
@patricka3125

patricka3125 commented Apr 21, 2026

Collaborator Author

side note - for skill discovery, the symlink approach works because the opencode CLI can merge config from the global/user config directory into a custom config directory. The available skills should resolve cleanly depending on whether opencode is run with or without the cao harness. See https://opencode.ai/docs/config/#locations

@patricka3125

Collaborator Author

side note # 2 - realize that web ui will need to add support for opencode cli where applicable, considering descoping this for now to future pr...

@codecov-commenter

codecov-commenter commented Apr 21, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@eda4da2). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #193   +/-   ##
=======================================
  Coverage        ?   92.45%           
=======================================
  Files           ?       60           
  Lines           ?     4932           
  Branches        ?        0           
=======================================
  Hits            ?     4560           
  Misses          ?      372           
  Partials        ?        0           
Flag Coverage Δ
unittests 92.45% <100.00%> (?)


@haofeif

haofeif commented Apr 21, 2026

Contributor

Codecov Report

❌ Patch coverage is 97.29730% with 7 lines in your changes missing coverage. Please review. ⚠️ Please upload report for BASE (main@3530084). Learn more about missing BASE report.
Files with missing lines Patch % Lines
...c/cli_agent_orchestrator/providers/opencode_cli.py 96.18% 5 Missing ⚠️
src/cli_agent_orchestrator/providers/base.py 66.66% 1 Missing ⚠️
...i_agent_orchestrator/utils/opencode_permissions.py 96.87% 1 Missing ⚠️

@patricka3125 can you fix this? Also, did you get a chance to run a full test on the Assign example to check that the OpenCode CLI is able to perform this end to end?

  • 1 supervisor agent assigns to 3x data analysts successfully
  • 1 supervisor agent handoff to 1× report agent successfully
  • 3x data analysts sent messages successfully to supervisor
  • supervisor receives all 3 messages from 3x data analysts
  • supervisor agent got the result from report agent
  • supervisor can complete the jobs successfully

@haofeif

haofeif commented Apr 21, 2026

Contributor

@patricka3125 Substantial, well-structured provider integration that follows the cao-provider skill closely.

One comment

  • CHANGELOG.md line 11 links to docs/feat-opencode-provider-design.md, which isn't in the PR. Either add the file or drop the sentence.

Design question

  • You questioned BaseProvider.extraction_tail_lines yourself — agreed. The docstring in opencode_cli.py admits it's currently a no-op. Either move it into OpenCodeCliProvider as an internal attribute, or add one test justifying its presence in base.

@patricka3125

patricka3125 commented Apr 22, 2026

Collaborator Author

E2E smoke test — examples/assign/ on OpenCode CLI

Setup: uv run cao-server on :9889, uv run cao launch --agents analysis_supervisor --provider opencode_cli --headless --auto-approve --session-name e2e-opencode, directive prompt sent via POST /terminals/{id}/input. Terminals observed via tmux capture-pane.

Haofeif's 6 scenarios

| # | Scenario | Result | Evidence |
| --- | --- | --- | --- |
| 1 | Supervisor assigns to 3× data analysts | | Windows data_analyst-7220/5d04/9f04 spawned within 4 s; 3 cao-mcp-server_assign tool calls visible in supervisor pane |
| 2 | Supervisor handoff to 1× report agent | | report_generator-5649 terminal created; supervisor logged "Hand off successful (took ~28.73 s); a draft template has been created" |
| 3 | 3× data analysts sent messages to supervisor | | Inbox received 4 messages (msg 114/115/116 from the 3 distinct analysts + msg 117 a duplicate from Dataset-C analyst) with sender-ID-injection tag [Message from terminal <id>...] |
| 4 | Supervisor receives all 3 messages | | All 3 unique messages transitioned pending → delivered (msg 117 dup also delivered afterwards) |
| 5 | Supervisor got report-agent result | | Supervisor's turn-1 summary references the returned template; the report agent completed its handoff in ~29 s |
| 6 | Supervisor completes the job successfully | | Final report produced with per-dataset Mean/Median/Population-StdDev + "Consolidated Observations" + "Conclusion" sections |

Observations worth noting

  1. Post-settle inbox delivery gap for opencode_cli (follow-up). After the supervisor's turn ends, inbox messages stay pending indefinitely. Verified with a DEBUG-level run + two isolation probes (raw DB pending row insert, 15 s poll-cycle wait, then manual mtime bump on the log file). Two compounding causes:

    • a) mtime doesn't advance post-settle. inbox_service.LogFileHandler is scheduled under watchdog.observers.polling.PollingObserver(timeout=INBOX_POLLING_INTERVAL=5), which scans TERMINAL_LOG_DIR every 5 s and emits on_modified only when a file's mtime changed since the previous scan. OpenCode's alt-screen TUI emits no pty bytes once the turn is idle, so tmux pipe-pane writes nothing, the log mtime freezes, and the handler never fires. Server debug log shows four consecutive Log file modified: 92a064b6.log entries at 5 s intervals during the turn, then a 62 s gap with no events, then an event immediately after a manual printf ' ' >> <log>.

    • b) Idle pattern is absent from the pipe-pane byte stream. Even when the handler does fire, _has_idle_pattern(tail) looks for ctrl+p\s+commands in the log tail and returns False. Inspecting the log tail confirms: the ctrl+p commands footer string is not present in the raw bytes; opencode's alt-screen TUI sends it once during the initial render (before pipe_pane() is attached — pipe_pane() is called after provider.initialize() returns) and thereafter emits only incremental cursor/character updates. So _has_idle_pattern → False (DEBUG log confirms: Terminal 92a064b6 not idle (no idle pattern in log tail), skipping) even when provider.get_status() == COMPLETED.

    Either cause alone blocks delivery; the two together make it deterministic. The fix is non-trivial (e.g., switch the opencode inbox-delivery trigger from log-mtime polling to tmux capture-pane-based status polling, or re-capture the idle frame into the log after initialize()) and should be deferred to a follow-up PR to keep this one focused on the provider integration itself. For this smoke test I drained the inbox by calling check_and_send_pending_messages() directly.

  2. Report-generator status = ERROR after handoff (expected). The handoff path exits the worker's opencode CLI process after returning its result, which lands the pane in a state that doesn't match any of the TUI's status regexes — hence ERROR. The handoff itself returned successfully and the supervisor received the template; the terminal's post-exit status is cosmetic only and is the expected behavior for the handoff lifecycle.
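
One possible shape for the status-polling alternative mentioned in observation 1 is sketched below. It is not the follow-up fix itself: the function name `poll_and_deliver`, the injected callables, and the choice of `IDLE`/`COMPLETED` as deliverable states are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional

# Assumed set of statuses in which it is safe to deliver inbox messages.
READY_STATUSES = {"IDLE", "COMPLETED"}


def poll_and_deliver(
    get_status: Callable[[], str],
    has_pending: Callable[[], bool],
    deliver: Callable[[], None],
    interval: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll provider status directly instead of waiting for a log mtime change."""
    deliveries = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Status comes from the provider (e.g. a capture-pane check), not the log tail.
        if has_pending() and get_status() in READY_STATUSES:
            deliver()
            deliveries += 1
        sleep(interval)
    return deliveries
```

Because the loop asks the provider for its status, it does not depend on the pty emitting bytes after the turn settles, which is the condition that freezes the log mtime in cause (a).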

patricka3125 and others added 4 commits April 21, 2026 17:56
Addresses review comment: the design doc link in the Unreleased entry
referred to a file that is not included in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The property was opencode-specific but lived on BaseProvider, which meant
every provider carried an attribute it had no use for. Remove it from the
base class, keep it as a provider-local property on OpenCodeCliProvider,
and have terminal_service.get_output read it via a getattr capability
check so the base class stays agnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
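
The getattr capability check described in this commit could look roughly like the following. The two classes are bare stand-ins for the real providers, and `resolve_tail_lines` is a hypothetical name; only the 200-line default and the 2000-line override are taken from this PR.

```python
from typing import Optional

DEFAULT_TAIL_LINES = 200  # default capture window mentioned in the PR


class BaseProvider:
    """Stand-in base class: deliberately has no extraction_tail_lines attribute."""


class OpenCodeCliProvider(BaseProvider):
    """Stand-in provider that opts in to a deeper capture window."""

    @property
    def extraction_tail_lines(self) -> int:
        return 2000


def resolve_tail_lines(provider: object) -> int:
    """Use the provider's override when it declares one, else the default."""
    override: Optional[int] = getattr(provider, "extraction_tail_lines", None)
    return override if override is not None else DEFAULT_TAIL_LINES
```

With this approach the base class needs no knowledge of the attribute, and providers that never define it fall through to the default.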
…nslator

Adds four tests to cover lines flagged as missing on the codecov report:

- get_status ERROR fallback for non-empty output with no recognized marker
- extract_last_message_from_script residual ``┃`` line + blank-line
  branches when raw_response contains leftover bar-prefixed lines
- extract_last_message_from_script empty-response ValueError
- cao_tools_to_opencode_permission AssertionError when a tool appears in
  ALL_OPENCODE_TOOLS without a matching policy

Brings opencode_cli.py and opencode_permissions.py to 100% patch coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@patricka3125

patricka3125 commented Apr 22, 2026

Collaborator Author

@haofeif Please lmk what you think of preferred action item for follow-up # 1 in the above comment.

It also seems somewhat related to #115 in terms of fix strategy

@haofeif

haofeif commented Apr 22, 2026

Contributor

@patricka3125 Thanks for the thorough RCA on the post-settle inbox gap and the live 6-scenario smoke test — exactly the diligence I'd want before shipping.

On the inbox-delivery bug: I verified your RCA against the code and it's correct. The one nuance worth adding: check_and_send_pending_messages at inbox_service.py:105 uses provider.get_status() directly (not the log tail), so the function itself is fine — the breakage is purely in the watchdog wake-up path. That shapes the fix.

I agree the fix belongs in a follow-up PR, not this one — 3290 lines is already plenty and an inbox-service change has different blast radius. But I think "silent deadlock on multi-agent flows" needs a louder signal than a buried Known Limitations bullet.

Two asks before merge:

  1. Status badge at the top of docs/opencode-cli.md and in the README provider table marking opencode as experimental — single-agent flows only. A user evaluating the provider should see the constraint before they invest, not after they hit the deadlock.
  2. Open the tracking issue with your RCA pasted in, link Event driven architecture #115, and describe the likely interim fix (~20-line provider-scoped polling fallback). Start the follow-up PR in draft. Doesn't have to merge today, just has to be in-flight so the "experimental" status has a visible path to "stable."

Plus open the Web UI opencode issue you flagged yourself in side note #2.

I suggest doing that before we merge this in. What do you think?

@patricka3125

Collaborator Author

@haofeif agree with the assessment

@haofeif

haofeif commented Apr 23, 2026

Contributor

> @haofeif agree with the assessment

@patricka3125 please let me know when you get a chance to work on the above comments and are ready for the final review

Add a warning badge to docs/opencode-cli.md and tag the README provider
table row, both linking the post-settle inbox-delivery deadlock tracked
in awslabs#203. Multi-agent flows are not yet reliable on opencode_cli; this
signals the constraint to evaluators before they hit it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@patricka3125

patricka3125 commented Apr 24, 2026

Collaborator Author

hi @haofeif thanks for the review. The action items you mentioned above have been addressed, will take them when i get the chance. I have not had the chance to try out the recently released open models with cao yet. Would love to know how the team and community experiences it :D

@haofeif haofeif left a comment

Contributor

@patricka3125 LGTM. also raised an issue due to the limitation #205

@haofeif haofeif merged commit 3f5bc0e into awslabs:main Apr 24, 2026
13 checks passed
fanhongy added a commit that referenced this pull request May 7, 2026
Pulls in PR #172 plugin system and PR #193 opencode_cli so later
phases can build plugins instead of provider hooks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>