feat: Build an MCP server for cao operations by patricka3125 · Pull Request #166 · awslabs/cli-agent-orchestrator

patricka3125 · 2026-04-10T09:36:36Z

Overview

Addresses #161

Introduces cao-ops-mcp, a new MCP server that exposes CAO management operations as structured tools for a user's primary agent. It enables end-to-end agent-driven CAO workflows — discovering and installing profiles, launching sessions, delivering prompts, monitoring progress, and shutting down — without leaving the agent interface or switching to a separate terminal.

Motivation

All CAO management operations currently require either the cao CLI or direct HTTP API calls in a separate terminal. The existing cao-mcp-server does not address this — its tools (handoff, assign, send_message) are scoped to inter-agent orchestration within active CAO sessions, not to managing CAO itself.

cao-ops-mcp fills this gap as the agentic control plane alongside the existing Web UI (visual control plane) and CLI (manual control plane). A user's primary agent can now manage the full CAO lifecycle within a single conversation — no terminal switching required.

Key Changes

| File | What changed and why |
| --- | --- |
| `src/.../ops_mcp_server/server.py` | New MCP server (`cao-ops-mcp`) with 8 tools across two groups: profile management (`list_profiles`, `get_profile_details`, `install_profile`) and session lifecycle (`launch_session`, `send_session_message`, `list_sessions`, `get_session_info`, `shutdown_session`) |
| `src/.../ops_mcp_server/models.py` | Pydantic response models for the ops MCP tools (`LaunchResult`, `InstallResult`, `ProfileListResult`, `SessionListResult`, `SendMessageResult`) |
| `src/.../ops_mcp_server/__init__.py` | Package init for the new `ops_mcp_server` module |
| `src/.../services/install_service.py` | New service layer extracted from the CLI `install` command: a pure `install_agent()` function returning a structured `InstallResult`, reusable by both the CLI and the new API endpoint |
| `src/.../api/main.py` | Two new API endpoints: `GET /agents/profiles/{name}` (full profile content) and `POST /agents/profiles/install` (install via service layer), both consumed by the ops MCP tools |
| `src/.../cli/commands/install.py` | Refactored to a thin wrapper over `install_service.install_agent()`; behavior is unchanged, and the install logic now lives in the service layer |
| `pyproject.toml` | Registers the `cao-ops-mcp-server` entry point |
| `README.md` | Adds a `CAO Ops MCP Server` section documenting setup, available tools, and the typical workflow |
| `test/ops_mcp_server/test_server.py` | Unit tests for all 8 MCP tools with mocked API calls |
| `test/api/test_api_profiles.py` | Unit tests for the two new API endpoints |
| `test/services/test_install_service.py` | Unit tests for the extracted install service (each source type, each provider path, env var injection, error cases) |
| `test/cli/commands/test_install.py` | Slimmed down to cover only CLI-level behavior; install logic tests moved to `test_install_service.py` |
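
The service-layer split described for `install_service.py` and `cli/commands/install.py` can be pictured roughly as follows. This is a minimal sketch and not the PR's code: the `InstallResult` fields, the `install_agent()` signature, the `KNOWN_PROVIDERS` set, and the `cli_install()` wrapper are assumptions made for illustration, and a dataclass stands in for the Pydantic model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstallResult:
    # Hypothetical fields; the real model lives in the PR's models module.
    success: bool
    message: str
    agent_name: Optional[str] = None


# Hypothetical provider set, for illustration only.
KNOWN_PROVIDERS = {"kiro_cli", "claude_code"}


def install_agent(source: str, provider: str = "kiro_cli") -> InstallResult:
    """Pure service function: no printing and no exit codes, only a result."""
    if provider not in KNOWN_PROVIDERS:
        return InstallResult(False, f"Unknown provider: {provider}")
    if not source:
        return InstallResult(False, "Empty source")
    # A real implementation would resolve and install the profile here.
    return InstallResult(True, f"Installed {source} for {provider}", agent_name=source)


def cli_install(source: str, provider: str = "kiro_cli") -> int:
    """Thin CLI wrapper: delegates to the service and only formats output."""
    result = install_agent(source, provider)
    print(result.message)
    return 0 if result.success else 1
```

Because the service returns data instead of printing, an HTTP endpoint or an MCP tool could call the same function and serialize the result without duplicating install logic.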

Test Plan

- `uv run pytest test/ --ignore=test/e2e -v`: full unit test suite passes with no regressions
- `uv run cao-ops-mcp-server`: server starts without errors (exits cleanly when stdin closes)
- Tested the implemented MCP tools manually with MCP Inspector: list_profiles → install_profile → launch_session → send_session_message → get_session_info → shutdown_session (`uvx --with-editable . fastmcp dev inspector src/cli_agent_orchestrator/ops_mcp_server/server.py`)

codecov-commenter · 2026-04-10T14:02:04Z

Codecov Report

❌ Patch coverage is 96.24277% with 13 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@f650aa2). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...rc/cli_agent_orchestrator/ops_mcp_server/server.py | 89.56% | 12 Missing ⚠️ |
| ...cli_agent_orchestrator/services/install_service.py | 99.22% | 1 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #166   +/-   ##
=======================================
  Coverage        ?   92.40%           
=======================================
  Files           ?       64           
  Lines           ?     5186           
  Branches        ?        0           
=======================================
  Hits            ?     4792           
  Misses          ?      394           
  Partials        ?        0

| Flag | Coverage Δ |
| --- | --- |
| unittests | `92.40% <96.24%> (?)` |


Add tests for invalid session-create responses, prompt delivery failures, and rejected non-.md install sources so the corresponding branches in the ops MCP server and install service are exercised. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move the hard-coded 120s terminal ready timeout into a shared TERMINAL_READY_TIMEOUT constant so both cao-mcp-server handoff and cao-ops-mcp launch consume the same value, and narrow the broad except clauses in install_agent and the ops MCP _request_json helper so real bugs propagate instead of being flattened into error strings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Wrap discover_profiles and list_sessions in ProfileListResult and SessionListResult envelopes so every tool returns a consistent shape with a success flag instead of mixing list and error dict payloads. Rewrite the tool docstrings in cao-mcp style and document the install_profile source-resolution order, including the CWD-collision gotcha where a bare agent name matching a file in the working directory routes to path resolution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
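
The envelope idea in this commit can be sketched as follows. This is only an illustration under assumptions: the field names `profiles` and `error`, the `list_profiles_tool()` helper, and the injected `fetch` callable are hypothetical, and a dataclass is used in place of the PR's Pydantic model so the snippet runs with the standard library alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class ProfileListResult:
    # Every call returns this shape, whether it succeeded or failed.
    success: bool
    profiles: List[str] = field(default_factory=list)  # hypothetical field name
    error: Optional[str] = None  # hypothetical field name


def list_profiles_tool(fetch: Callable[[], Iterable[str]]) -> ProfileListResult:
    """Wrap a fetch callable so callers never see a bare list or an error dict."""
    try:
        return ProfileListResult(success=True, profiles=list(fetch()))
    except OSError as exc:
        # Only I/O-style failures are converted; other exceptions propagate.
        return ProfileListResult(success=False, error=str(exc))
```

A caller can then branch on `result.success` alone instead of checking whether the payload is a list or a dict.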

Shorten TERMINAL_READY_TIMEOUT from 120s to 30s since it acts only as a fallback after the provider's own initialize() hook has already run. Rename the discover_profiles tool to list_profiles so the name matches its sibling list_sessions and more accurately describes the operation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove the prompt parameter from launch_session so it returns session identifiers immediately without waiting. Add send_session_message which delivers messages via the inbox API (POST /terminals/{id}/inbox/messages) using sender_id="cao-ops-mcp", consistent with how the cao-mcp server delivers messages but without requiring a CAO_TERMINAL_ID env var. Add SendMessageResult model and 4 new tests for the new tool. Remove 3 now-irrelevant prompt-flow tests from the launch path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
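
A rough sketch of what an inbox delivery request could look like is shown here. Only the path `POST /terminals/{id}/inbox/messages` and the `sender_id="cao-ops-mcp"` value come from the commit message; the base URL, the use of a JSON body, the `message` field name, and the `build_inbox_request()` helper are assumptions. The snippet builds the request without sending it.

```python
import json
from urllib import request
from urllib.parse import quote


def build_inbox_request(base_url: str, terminal_id: str, message: str) -> request.Request:
    """Build (but do not send) a POST to the terminal's inbox endpoint."""
    # Quote the terminal id so it cannot inject extra path segments.
    url = f"{base_url.rstrip('/')}/terminals/{quote(terminal_id, safe='')}/inbox/messages"
    # Assumed payload shape; the real API may use query parameters instead.
    payload = {"sender_id": "cao-ops-mcp", "message": message}
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Since the sender is a fixed string, a client built this way would not need a `CAO_TERMINAL_ID` environment variable to identify itself, which matches the intent stated in the commit.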

These changes were out of scope for the ops MCP PR. Remove the constant from constants.py and restore the hardcoded 120.0 timeout in mcp_server server.py to match the original upstream state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Documents the new cao-ops-mcp server — setup, available tools, and typical workflow — alongside the distinction from cao-mcp-server. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…y revert Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fanhongy

Looks good. There is a conflict, though (due to PR #145 being merged).

Please resolve.

Upstream added compose_agent_prompt baking for Q/Copilot and skill:// resource support for Kiro during install. The merge conflict resolution kept the feature branch's service-layer thin wrapper but left those behaviours out of install_service.py.

- Q: use compose_agent_prompt(profile) to bake skill catalog into prompt
- Kiro: add skill://<SKILLS_DIR>/**/SKILL.md resource; use raw prompt (None when empty) so Kiro's native progressive loader handles skills
- Copilot: use compose_agent_prompt(profile, base_prompt=...) to bake catalog into the .agent.md body

TestInstallSkillCatalogBaking ported from upstream test_install.py into test_install_service.py with patch targets corrected for the service layer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Audit of the upstream/main diff found two classes of upstream tests that were dropped during merge conflict resolution without replacement:

1. TestInstallCommandEnvFlags (8 tests): env var injection, context file secret isolation, unresolved-var detection, and env file lifecycle. Ported as TestInstallAgentEnvBehaviour in test_install_service.py against install_agent directly.
2. test_install_general_error: upstream CLI caught bare Exception; the service only caught (ValueError, OSError). Added except Exception fallback to install_agent and a matching test test_install_returns_failure_for_unexpected_errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1. Collapse dead except (ValueError, OSError) into except Exception in install_agent. Both produced the same message; the narrower branch was unreachable after the broad fallback was added.
2. Restore upstream UX in install.py CLI wrapper:
   - Unresolved-vars warning now reads "Unresolved env var(s) in profile: X. Set them with `cao env set` or pass --env KEY=VALUE."
   - "Set N env var(s)" now uses len(env_vars) (raw tuple) so duplicate --env flags count correctly instead of deduplicating via dict.
3. Fix env_vars round-trip between ops_mcp_server and API: switch from comma-separated KEY=VALUE (breaks on values containing ',') to JSON-encoded dict. _serialize_env_vars emits json.dumps, _parse_env_vars parses json.loads.
4. Return InstallResult directly from the FastAPI install endpoint with a typed return annotation instead of result.model_dump(), which gives FastAPI a proper OpenAPI schema.
5. Inline the one-line _install_result_from_error helper in ops_mcp_server/server.py.
6. Add parametrized coverage for the three previously-uncovered _resolve_named_source branches: flat provider dir layout, extra dir flat layout, extra dir nested layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
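
The env_vars round-trip problem from item 3 is easy to reproduce. The sketch below is illustrative only: `encode_csv`, `decode_csv`, `encode_json`, and `decode_json` are hypothetical stand-ins, not the PR's `_serialize_env_vars` and `_parse_env_vars`, and the sample values are invented.

```python
import json
from typing import Dict


def encode_csv(env: Dict[str, str]) -> str:
    """Old-style encoding: comma-separated KEY=VALUE pairs."""
    return ",".join(f"{key}={value}" for key, value in env.items())


def decode_csv(raw: str) -> Dict[str, str]:
    """Naive decoder: splits on every comma, including commas inside values."""
    out: Dict[str, str] = {}
    for part in raw.split(","):
        key, _, value = part.partition("=")
        out[key] = value
    return out


def encode_json(env: Dict[str, str]) -> str:
    return json.dumps(env)


def decode_json(raw: str) -> Dict[str, str]:
    return json.loads(raw)


env = {"HOSTS": "a.example,b.example", "TOKEN": "abc"}

# The comma inside HOSTS is mistaken for a pair separator.
csv_round_trip = decode_csv(encode_csv(env))

# JSON escapes and delimits values, so the dict survives unchanged.
json_round_trip = decode_json(encode_json(env))
```

With the comma-separated form, `HOSTS` is truncated to `a.example` and a spurious `b.example` key appears; the JSON form returns the original dict.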

parse_env_assignment was the only caller in _parse_env_vars; the JSON migration in f8ad731 removed that call site, leaving an F401 dead import flagged by Ruff. Also remove the `import json as _json` local shadow inside _parse_env_vars — json is already imported at module top level. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…lResult Upstream's inline install command printed "Downloaded agent from URL to local store" and "Copied agent from file to local store" before the success message — this was lost when install logic was extracted to the service layer. Adds a source_kind discriminator ('url'|'file'|'name') to InstallResult and wires it through the CLI output and tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

patricka3125 · 2026-04-16T07:19:00Z

hi @fanhongy would you mind taking another look when you have a chance? Appreciate the review!

anilkmr-a2z · 2026-04-16T19:36:08Z

Actually, I was thinking of this. I ended up implementing something like a command line to interact with cao to avoid introducing one more mcp server. Although your setup seems cleaner.

anilkmr-a2z · 2026-04-16T19:36:50Z

Also, similar to your other CRs. Mind squashing these commits. Not all of these need to be in the main repo.

patricka3125 · 2026-04-16T21:53:19Z

Hi @anilkmr-a2z I believe the commits would be automatically squashed by github before merge.

anilkmr-a2z · 2026-04-17T04:45:03Z

Ahh! I didn't know that. It still makes it very hard to review though. So, I could not go through all files in 25 commits. Is there any recommendation on how to review those ?

haofeif · 2026-04-17T07:06:41Z

> Ahh! I didn't know that. It still makes it very hard to review though. So, I could not go through all files in 25 commits. Is there any recommendation on how to review those ?

@anilkmr-a2z How about reviewing the latest commit with the files changed? Don't worry about how many commits there were before; when we merge, we will squash-merge into one commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

anilkmr-a2z · 2026-04-20T04:42:56Z

+async def install_agent_profile_endpoint(
+    source: str,
+    provider: str = DEFAULT_PROVIDER,
+    env_vars: Optional[str] = None,


I wonder if there is a better way. Env variables could be sensitive info like API keys. Wonder if it does only known ones.

Good catch. This should be a JSON body instead of query parameters; will change.

anilkmr-a2z

Mostly minor change. lgtm otherwise. Thanks for the awesome work!

- Move install endpoint to a JSON body model so env_vars (which may contain secrets like API keys) no longer travel as a URL query param. Update the cao-ops-mcp client to post the body and drop the local JSON-string serialization helper.
- Extract shared profile-source lookup into utils.agent_profiles._read_agent_profile_source and have both load_agent_profile and the install service use it, so the two call sites cannot drift.
- Drop the now-unused _parse_env_vars helper whose name clashed with the install service env parser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
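
To see why the move from query parameters to a JSON body matters, compare the two request shapes. This is a sketch under assumptions: the base URL, the sample secret, and the body field names (taken to mirror the `source`, `provider`, and `env_vars` parameters quoted in the review) are illustrative. Nothing is sent over the network.

```python
import json
from urllib import request
from urllib.parse import urlencode

BASE = "http://localhost:9889"  # hypothetical API address
env_vars = {"API_KEY": "s3cr3t"}  # invented secret for the demonstration

# Before: env_vars ride in the query string, so the secret is part of the URL
# and can end up in access logs, proxies, and shell history.
query_url = f"{BASE}/agents/profiles/install?" + urlencode(
    {"source": "developer", "provider": "kiro_cli", "env_vars": json.dumps(env_vars)}
)

# After: the same data travels in a JSON body and the URL carries no secret.
body_req = request.Request(
    f"{BASE}/agents/profiles/install",
    data=json.dumps(
        {"source": "developer", "provider": "kiro_cli", "env_vars": env_vars}
    ).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```

A JSON body also lets `env_vars` be a real object, which is consistent with dropping the client-side JSON-string serialization helper.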

patricka3125 · 2026-04-20T06:12:08Z

hi @anilkmr-a2z thanks for reviewing. I have made some changes addressing some of your feedback

anilkmr-a2z

Thanks for making the changes.

haofeif · 2026-04-30T07:43:20Z

@patricka3125 Is this still valid? If so, would you mind resolving the conflicts so we can do a final review before merge?

# Conflicts: # src/cli_agent_orchestrator/cli/commands/install.py

patricka3125 · 2026-05-01T10:12:26Z

Hi @haofeif, yes this is still valid. I resolved the conflicts, pushed the updated branch, and added a focused fix to reject invalid providers through the new install service/API path. Ready for final review.

#226)

* fix(install): harden agent-profile install against SSRF and path injection

  Closes CodeQL py/full-ssrf and py/path-injection alerts on the install path added in #166.

  - URL downloads restricted to https:// with a host allowlist (github.com, raw.githubusercontent.com by default; extend via CAO_PROFILE_ALLOWED_HOSTS env var).
  - Redirects disabled; explicit is_redirect rejection.
  - (5, 30)s connect/read timeout to bound worker exposure.
  - Filename / profile-name regex [A-Za-z0-9_-]{1,64} on every sink.
  - New allow_file_source kwarg on install_agent(); HTTP API and (transitively) ops-MCP install_profile pass False so remote callers cannot coerce the server into reading arbitrary local files. CLI behaviour unchanged.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #60/#61/#62 with pre-Path validation + URL rebuild

  Previous hardening commit wrote sanitisers that CodeQL didn't recognise as taint-kills because the checks sat *after* Path() construction and requests.get() received the caller-controlled source string.

  - _SAFE_URL_PATH_RE validates parsed.path *before* the fetch; the URL handed to requests.get() is rebuilt as f"https://{safe_host}{parsed.path}" where safe_host is pulled from the allowlist literal. Reject query/fragment/userinfo which have no place in a static .md fetch.
  - _FILE_PATH_RE validates the source string *before* Path(source).resolve() and Path(source).exists(): the fullmatch regex sits on the data-flow edge into each Path() sink.
  - Add a CodeQL job to ci.yml (python + js/ts, security-and-quality suite) so future SSRF/path-injection regressions fail CI instead of trickling in as post-merge alerts.
  - Add scripts/security-scan.sh for local trivy + codeql runs mirroring CI.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #63 + drop conflicting workflow CodeQL job

  Two follow-ups to the previous hardening commit:

  1. Alert #63 (py/path-injection, install_service.py:235). The `elif allow_file_source and _FILE_PATH_RE.fullmatch(source) and Path(source).exists()` guard still tripped the scanner because CodeQL doesn't thread the regex sanitiser through the compound boolean into the Path() call. Fix: dispatch by pure string suffix (`source.endswith(".md")`), with no Path() in install_agent() at all. All path construction happens inside _download_agent(), which already regex-validates before `.resolve()`.
  2. The workflow-based `codeql` job conflicted with the repo's existing default-setup CodeQL ("CodeQL analyses from advanced configurations cannot be processed when the default setup is enabled"). Dropped the job and left a comment in ci.yml explaining why; default setup already runs the Analyze (python) / Analyze (js-ts) checks on every PR.
  3. SECURITY.md: documented CodeQL coverage, the host allowlist behaviour (`CAO_PROFILE_ALLOWED_HOSTS`), and the scripts/security-scan.sh wrapper.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): reject `..` segments in _FILE_PATH_RE (closes CodeQL #61)

  The previous regex used a character class that included `.` and `/`, so `../../etc/passwd.md` matched and passed into `Path(source).resolve()`. CodeQL was right to flag it: the sanitiser was weaker than advertised.

  - Add a leading negative lookahead `(?!.*\.\.)` to the file-path regex so any `..` anywhere in the string rejects the source before Path() is constructed. Legitimate `./foo.md`, `/abs/foo.md`, `~/foo.md`, and `sub/dir/foo.md` all still work.
  - Two new regression tests cover `../../etc/passwd.md` and embedded `/tmp/foo/../etc/passwd.md` traversal shapes.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): move file-path handling out of install_service (closes CodeQL #64)

  Earlier rounds kept `Path(user_input)` reachable inside `install_service` behind a regex sanitiser. Every regex shape that still admitted a legitimate CLI path like `./foo.md` also admitted `../../etc/passwd.md` without an unacceptable normalise+prefix-check, so CodeQL kept correctly flagging the `.resolve()` sink.

  Structural fix: the shared service doesn't need a file-path branch at all.

  - `install_service.install_agent()` now accepts only a bare profile name (`_PROFILE_NAME_RE`) or an https:// URL on the host allowlist.
  - `cli/commands/install.py` grows a `_copy_local_profile_to_store()` helper that does the file reading, stem validation, and copy-into-store itself, then calls the service with the bare validated stem.
  - `api/main.py` drops the `allow_file_source=False` kwarg: the parameter is gone; the service refuses filesystem paths for every caller.
  - Tests: remove the file-path branches from the service suite, move that coverage to the CLI suite (`TestCopyLocalProfileToStore` + integration tests on file-source `cao install` invocations).

  Full test suite (`test/ --ignore=test/e2e -m "not integration"`): 1581/1581 pass. End-to-end smoke of `cao install /tmp/smoke-agent.md --provider kiro_cli` verified.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: black reformat test_install.py (extra blank line)

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
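
The end state described in this squashed commit (a bare profile name or an allowlisted https:// URL, with the fetched URL rebuilt from trusted parts) can be sketched roughly as below. This is not the PR's implementation: `classify_source()` and the exact shape of `SAFE_URL_PATH_RE` are assumptions, and the redirect and timeout handling mentioned in the commit is not shown because the sketch never performs a fetch.

```python
import re
from typing import Tuple
from urllib.parse import urlsplit

# Default hosts named in the commit message.
ALLOWED_HOSTS = ("github.com", "raw.githubusercontent.com")
# Profile-name pattern quoted in the commit message.
PROFILE_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
# Hypothetical path pattern; the real _SAFE_URL_PATH_RE may differ.
SAFE_URL_PATH_RE = re.compile(r"/[A-Za-z0-9_./-]+\.md")


def classify_source(source: str) -> Tuple[str, str]:
    """Return ("name", name) or ("url", rebuilt_url); raise ValueError otherwise."""
    if PROFILE_NAME_RE.fullmatch(source):
        return ("name", source)

    parsed = urlsplit(source)
    if parsed.scheme != "https":
        raise ValueError("only bare profile names or https:// URLs are accepted")
    if parsed.query or parsed.fragment or parsed.username or parsed.password:
        raise ValueError("query, fragment, and userinfo are not allowed")
    if parsed.port is not None:
        raise ValueError("explicit ports are not allowed")

    # Take the host from the allowlist literal, not from the caller's string.
    safe_host = next((host for host in ALLOWED_HOSTS if host == parsed.hostname), None)
    if safe_host is None:
        raise ValueError("host is not on the allowlist")

    if ".." in parsed.path or not SAFE_URL_PATH_RE.fullmatch(parsed.path):
        raise ValueError("URL path is not a plain .md path")

    # Rebuild the URL from validated pieces before any fetch.
    return ("url", f"https://{safe_host}{parsed.path}")
```

Filesystem paths such as `./foo.md` or `../../etc/passwd.md` match neither branch and are rejected, which mirrors the commit's decision to keep local-file handling in the CLI only.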

…228)

Every filesystem lookup in _read_agent_profile_source now routes through a new _safe_join helper that normalises with resolve() and verifies containment via relative_to(). This closes the 8 CodeQL py/path-injection alerts remaining after #226 (which hardened only the install flow). The taint source is the /agents/profiles/{name} HTTP endpoint added by #166; the 8 sinks are the flat/nested/exists/read_text calls in the provider-dirs and extra-dirs loops.

The helper also handles symlinks that resolve outside the configured store, something the existing _validate_agent_name string check cannot see.

Rewrote 9 pre-existing tests that used Path mocks (which silently bypassed the new guard) to use real tmp_path + monkeypatch so the containment check is exercised end-to-end, plus one new test asserting ../escaped is rejected.

Verified green:

- test/utils unit tests: 43/43 passed
- full unit suite: 1586 passed, 1 skipped
- e2e ClaudeCode: 12/12 passed
- e2e KiroCli: 11/11 passed
- e2e GeminiCli: 12/12 passed

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
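
The resolve() plus relative_to() containment check described above can be sketched with `pathlib` alone. The function below is a hypothetical stand-in named `safe_join`; the signature and error message of the PR's `_safe_join` are not shown in this thread and may differ.

```python
from pathlib import Path


def safe_join(base: Path, *parts: str) -> Path:
    """Join parts onto base and refuse any result that escapes base."""
    base_resolved = base.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below compares real locations rather than raw strings.
    candidate = base_resolved.joinpath(*parts).resolve()
    try:
        candidate.relative_to(base_resolved)
    except ValueError:
        raise ValueError(f"path escapes {base_resolved}") from None
    return candidate
```

For example, `safe_join(Path("store"), "agent.md")` returns a path inside the store, while `safe_join(Path("store"), "../escaped")` raises `ValueError`. Because the comparison happens after resolution, a symlink inside the store that points elsewhere is rejected as well, which a string-only name check cannot detect.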

patricka3125 added 5 commits April 10, 2026 01:32

Extract profile install service for API and CLI

c934cdc

Refine profile lookup and follow-up tests

aa7c72c

Add CAO operations MCP server tools

5ff92f6

Refine ops MCP server follow-ups

0482743

Register ops MCP entry point and pass validation

c8b7b86

patricka3125 marked this pull request as draft April 10, 2026 09:36

Revert out-of-scope validation cleanup changes

ac65014

haofeif added the enhancement New feature or request label Apr 10, 2026

patricka3125 and others added 9 commits April 11, 2026 20:59

Merge branch 'main' into feat/cao-ops-mcp

34deb7d

Add CAO Ops MCP Server section to README

b8673ad


Restore mcp_server comments and error message inadvertently dropped b…

03b7424

…y revert Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

patricka3125 marked this pull request as ready for review April 12, 2026 05:31

haofeif requested review from a team and tuanknguyen April 12, 2026 08:47

fanhongy reviewed Apr 13, 2026


patricka3125 and others added 5 commits April 13, 2026 00:44

Merge branch 'upstream' into feat/cao-ops-mcp

c7bf444

patricka3125 marked this pull request as draft April 13, 2026 08:26

patricka3125 marked this pull request as ready for review April 13, 2026 08:37

Merge branch 'main' into feat/cao-ops-mcp

e235aa4

Merge branch 'main' into feat/cao-ops-mcp

9f1b689

patricka3125 mentioned this pull request Apr 16, 2026

feat: Build support for external plugins #172

Merged

8 tasks

Merge branch 'main' into feat/cao-ops-mcp

653bc19

patricka3125 and others added 2 commits April 17, 2026 00:21

style(test): reformat test_install_service.py with black

fc96131

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge branch 'main' into feat/cao-ops-mcp

3d560e0

anilkmr-a2z reviewed Apr 20, 2026


patricka3125 and others added 2 commits April 19, 2026 23:07

Merge branch 'main' into feat/cao-ops-mcp

ae88d81

anilkmr-a2z approved these changes Apr 20, 2026


Merge branch 'main' into feat/cao-ops-mcp

3817cca

Merge remote-tracking branch 'upstream/main' into feat/cao-ops-mcp

d3aa992


patricka3125 force-pushed the feat/cao-ops-mcp branch from 39015a2 to d3aa992 Compare May 1, 2026 10:05

Validate install provider in service

6328f35

haofeif merged commit e17d713 into awslabs:main May 1, 2026
29 checks passed

This was referenced May 1, 2026

[Docs]Reorganize README, split detail into topic docs, and add control-plane overview #225

Merged

fix(install): harden agent-profile install against SSRF and path inje… #226

Merged

haofeif mentioned this pull request May 3, 2026

fix(agent_profiles): guard agent-name path lookups against traversal (resolves 8 CodeQL path-injection alerts) #228

Merged

6 tasks

5 participants
