Skip to content

feat: Build an MCP server for cao operations#166

Merged
haofeif merged 32 commits into
awslabs:mainfrom
patricka3125:feat/cao-ops-mcp
May 1, 2026
Merged

feat: Build an MCP server for cao operations#166
haofeif merged 32 commits into
awslabs:mainfrom
patricka3125:feat/cao-ops-mcp

Conversation

@patricka3125

@patricka3125 patricka3125 commented Apr 10, 2026

Copy link
Copy Markdown
Collaborator

Overview

Addresses #161

Introduces cao-ops-mcp, a new MCP server that exposes CAO management operations as structured tools for a user's primary agent. It enables end-to-end agent-driven CAO workflows — discovering and installing profiles, launching sessions, delivering prompts, monitoring progress, and shutting down — without leaving the agent interface or switching to a separate terminal.

Motivation

All CAO management operations currently require either the cao CLI or direct HTTP API calls in a separate terminal. The existing cao-mcp-server does not address this — its tools (handoff, assign, send_message) are scoped to inter-agent orchestration within active CAO sessions, not to managing CAO itself.

cao-ops-mcp fills this gap as the agentic control plane alongside the existing Web UI (visual control plane) and CLI (manual control plane). A user's primary agent can now manage the full CAO lifecycle within a single conversation — no terminal switching required.

Key Changes

File What changed and why
src/.../ops_mcp_server/server.py New MCP server (cao-ops-mcp) with 8 tools across two groups: profile management (list_profiles, get_profile_details, install_profile) and session lifecycle (launch_session, send_session_message, list_sessions, get_session_info, shutdown_session)
src/.../ops_mcp_server/models.py Pydantic response models for the ops MCP tools (LaunchResult, InstallResult, ProfileListResult, SessionListResult, SendMessageResult)
src/.../ops_mcp_server/__init__.py Package init for the new ops_mcp_server module
src/.../services/install_service.py New service layer extracted from the CLI install command — pure install_agent() function returning a structured InstallResult, reusable by both the CLI and the new API endpoint
src/.../api/main.py Two new API endpoints: GET /agents/profiles/{name} (full profile content) and POST /agents/profiles/install (install via service layer), both consumed by the ops MCP tools
src/.../cli/commands/install.py Refactored to a thin wrapper over install_service.install_agent() — behavior is unchanged, install logic now lives in the service layer
pyproject.toml Registers the cao-ops-mcp-server entry point
README.md Adds a CAO Ops MCP Server section documenting setup, available tools, and the typical workflow
test/ops_mcp_server/test_server.py Unit tests for all 8 MCP tools with mocked API calls
test/api/test_api_profiles.py Unit tests for the two new API endpoints
test/services/test_install_service.py Unit tests for the extracted install service (each source type, each provider path, env var injection, error cases)
test/cli/commands/test_install.py Slimmed down to cover only CLI-level behavior; install logic tests moved to test_install_service.py

Test Plan

  • uv run pytest test/ --ignore=test/e2e -v — full unit test suite passes with no regressions
  • uv run cao-ops-mcp-server — server starts without errors (exits cleanly when stdin closes)
  • Test implemented MCP tools manually with MCP inspector: list_profilesinstall_profilelaunch_sessionsend_session_messageget_session_infoshutdown_session
    (uvx --with-editable . fastmcp dev inspector src/cli_agent_orchestrator/ops_mcp_server/server.py)

@patricka3125 patricka3125 marked this pull request as draft April 10, 2026 09:36
@codecov-commenter

codecov-commenter commented Apr 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.24277% with 13 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@f650aa2). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...rc/cli_agent_orchestrator/ops_mcp_server/server.py 89.56% 12 Missing ⚠️
...cli_agent_orchestrator/services/install_service.py 99.22% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #166   +/-   ##
=======================================
  Coverage        ?   92.40%           
=======================================
  Files           ?       64           
  Lines           ?     5186           
  Branches        ?        0           
=======================================
  Hits            ?     4792           
  Misses          ?      394           
  Partials        ?        0           
Flag Coverage Δ
unittests 92.40% <96.24%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@haofeif haofeif added the enhancement New feature or request label Apr 10, 2026
patricka3125 and others added 9 commits April 11, 2026 20:59
Add tests for invalid session-create responses, prompt delivery
failures, and rejected non-.md install sources so the corresponding
branches in the ops MCP server and install service are exercised.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the hard-coded 120s terminal ready timeout into a shared
TERMINAL_READY_TIMEOUT constant so both cao-mcp-server handoff and
cao-ops-mcp launch consume the same value, and narrow the broad
except clauses in install_agent and the ops MCP _request_json helper
so real bugs propagate instead of being flattened into error strings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wrap discover_profiles and list_sessions in ProfileListResult and
SessionListResult envelopes so every tool returns a consistent shape
with a success flag instead of mixing list and error dict payloads.
Rewrite the tool docstrings in cao-mcp style and document the
install_profile source-resolution order, including the CWD-collision
gotcha where a bare agent name matching a file in the working
directory routes to path resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shorten TERMINAL_READY_TIMEOUT from 120s to 30s since it acts only as
a fallback after the provider's own initialize() hook has already
run. Rename the discover_profiles tool to list_profiles so the name
matches its sibling list_sessions and more accurately describes the
operation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the prompt parameter from launch_session so it returns session
identifiers immediately without waiting. Add send_session_message which
delivers messages via the inbox API (POST /terminals/{id}/inbox/messages)
using sender_id="cao-ops-mcp", consistent with how the cao-mcp server
delivers messages but without requiring a CAO_TERMINAL_ID env var.

Add SendMessageResult model and 4 new tests for the new tool. Remove 3
now-irrelevant prompt-flow tests from the launch path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These changes were out of scope for the ops MCP PR. Remove the constant
from constants.py and restore the hardcoded 120.0 timeout in mcp_server
server.py to match the original upstream state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the new cao-ops-mcp server — setup, available tools, and
typical workflow — alongside the distinction from cao-mcp-server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…y revert

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@patricka3125 patricka3125 marked this pull request as ready for review April 12, 2026 05:31
@haofeif haofeif requested review from a team and tuanknguyen April 12, 2026 08:47

@fanhongy fanhongy left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good. conflict though (due to PR #145 merged)

Please resolve.

patricka3125 and others added 5 commits April 13, 2026 00:44
Upstream added compose_agent_prompt baking for Q/Copilot and skill://
resource support for Kiro during install. The merge conflict resolution
kept the feature branch's service-layer thin wrapper but left those
behaviours out of install_service.py.

- Q: use compose_agent_prompt(profile) to bake skill catalog into prompt
- Kiro: add skill://<SKILLS_DIR>/**/SKILL.md resource; use raw prompt
  (None when empty) so Kiro's native progressive loader handles skills
- Copilot: use compose_agent_prompt(profile, base_prompt=...) to bake
  catalog into the .agent.md body

TestInstallSkillCatalogBaking ported from upstream test_install.py into
test_install_service.py with patch targets corrected for the service layer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Audit of the upstream/main diff found two classes of upstream tests
that were dropped during merge conflict resolution without replacement:

1. TestInstallCommandEnvFlags (8 tests) — env var injection, context
   file secret isolation, unresolved-var detection, and env file
   lifecycle. Ported as TestInstallAgentEnvBehaviour in
   test_install_service.py against install_agent directly.

2. test_install_general_error — upstream CLI caught bare Exception;
   the service only caught (ValueError, OSError). Added except Exception
   fallback to install_agent and a matching test
   test_install_returns_failure_for_unexpected_errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. Collapse dead except (ValueError, OSError) into except Exception in
   install_agent — both produced the same message; narrower branch was
   unreachable after the broad fallback was added.

2. Restore upstream UX in install.py CLI wrapper:
   - Unresolved-vars warning now reads "Unresolved env var(s) in
     profile: X. Set them with \`cao env set\` or pass --env KEY=VALUE."
   - "Set N env var(s)" now uses len(env_vars) (raw tuple) so duplicate
     --env flags count correctly instead of deduplicating via dict.

3. Fix env_vars round-trip between ops_mcp_server and API: switch from
   comma-separated KEY=VALUE (breaks on values containing ',') to
   JSON-encoded dict. _serialize_env_vars emits json.dumps, _parse_env_vars
   parses json.loads.

4. Return InstallResult directly from the FastAPI install endpoint with
   a typed return annotation instead of result.model_dump() — gives
   FastAPI a proper OpenAPI schema.

5. Inline the one-line _install_result_from_error helper in
   ops_mcp_server/server.py.

6. Add parametrized coverage for the three previously-uncovered
   _resolve_named_source branches: flat provider dir layout, extra dir
   flat layout, extra dir nested layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parse_env_assignment was the only caller in _parse_env_vars; the JSON
migration in f8ad731 removed that call site, leaving an F401 dead
import flagged by Ruff.

Also remove the `import json as _json` local shadow inside
_parse_env_vars — json is already imported at module top level.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@patricka3125 patricka3125 marked this pull request as draft April 13, 2026 08:26
…lResult

Upstream's inline install command printed "Downloaded agent from URL to
local store" and "Copied agent from file to local store" before the
success message — this was lost when install logic was extracted to the
service layer.  Adds a source_kind discriminator ('url'|'file'|'name')
to InstallResult and wires it through the CLI output and tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@patricka3125 patricka3125 marked this pull request as ready for review April 13, 2026 08:37
@patricka3125

Copy link
Copy Markdown
Collaborator Author

hi @fanhongy would you mind taking another look when you have a chance? Appreciate the review!

@anilkmr-a2z

Copy link
Copy Markdown
Contributor

Actually, I was thinking of this. I ended up implementing something like a command line to interact with cao to avoid introducing one more mcp server. Although your setup seems cleaner.

@anilkmr-a2z

Copy link
Copy Markdown
Contributor

Also, similar to your other CRs. Mind squashing these commits. Not all of these need to be in the main repo.

@patricka3125

Copy link
Copy Markdown
Collaborator Author

Hi @anilkmr-a2z I believe the commits would be automatically squashed by github before merge.

@anilkmr-a2z

Copy link
Copy Markdown
Contributor

Ahh! I didn't know that. It still makes it very hard to review though. So, I could not go through all files in 25 commits. Is there any recommendation on how to review those ?

@haofeif

haofeif commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Ahh! I didn't know that. It still makes it very hard to review though. So, I could not go through all files in 25 commits. Is there any recommendation on how to review those ?

@anilkmr-a2z how about reviewing the latest commit with the files changed ? dont worry how many commits were there before ? as when we merge we will squash merge into one commit.

patricka3125 and others added 2 commits April 17, 2026 00:21
Comment thread src/cli_agent_orchestrator/api/main.py Outdated
async def install_agent_profile_endpoint(
source: str,
provider: str = DEFAULT_PROVIDER,
env_vars: Optional[str] = None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a better way. Env variables could be sensitive info like API keys. Wonder if it does only known ones.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this should be a JSON body instead of query parameters, will change

@anilkmr-a2z anilkmr-a2z left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly minor change. lgtm otherwise. Thanks for the awesome work!

Comment thread src/cli_agent_orchestrator/services/install_service.py Outdated
Comment thread src/cli_agent_orchestrator/api/main.py Outdated
Comment thread src/cli_agent_orchestrator/api/main.py Outdated
Comment thread src/cli_agent_orchestrator/api/main.py Outdated
Comment thread src/cli_agent_orchestrator/api/main.py
patricka3125 and others added 2 commits April 19, 2026 23:07
- Move install endpoint to a JSON body model so env_vars (which may
  contain secrets like API keys) no longer travel as a URL query param.
  Update the cao-ops-mcp client to post the body and drop the local
  JSON-string serialization helper.
- Extract shared profile-source lookup into
  utils.agent_profiles._read_agent_profile_source and have both
  load_agent_profile and the install service use it, so the two call
  sites cannot drift.
- Drop the now-unused _parse_env_vars helper whose name clashed with the
  install service env parser.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@patricka3125

Copy link
Copy Markdown
Collaborator Author

hi @anilkmr-a2z thanks for reviewing. I have made some changes addressing some of your feedback

@anilkmr-a2z anilkmr-a2z left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes.

@haofeif

haofeif commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

@patricka3125 is this still valid ? if so mind resolving conflicts and we might do a final review before merge ?

# Conflicts:
#	src/cli_agent_orchestrator/cli/commands/install.py
@patricka3125

Copy link
Copy Markdown
Collaborator Author

Hi @haofeif, yes this is still valid. I resolved the conflicts, pushed the updated branch, and added a focused fix to reject invalid providers through the new install service/API path. Ready for final review.

@haofeif haofeif merged commit e17d713 into awslabs:main May 1, 2026
29 checks passed
haofeif added a commit that referenced this pull request May 2, 2026
#226)

* fix(install): harden agent-profile install against SSRF and path injection

Closes CodeQL py/full-ssrf and py/path-injection alerts on the install
path added in #166.

- URL downloads restricted to https:// with a host allowlist
  (github.com, raw.githubusercontent.com by default; extend via
  CAO_PROFILE_ALLOWED_HOSTS env var).
- Redirects disabled; explicit is_redirect rejection.
- (5, 30)s connect/read timeout to bound worker exposure.
- Filename / profile-name regex [A-Za-z0-9_-]{1,64} on every sink.
- New allow_file_source kwarg on install_agent(); HTTP API and
  (transitively) ops-MCP install_profile pass False so remote callers
  cannot coerce the server into reading arbitrary local files.

CLI behaviour unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #60/#61/#62 with pre-Path validation + URL rebuild

Previous hardening commit wrote sanitisers that CodeQL didn't recognise as
taint-kills because the checks sat *after* Path() construction and
requests.get() received the caller-controlled source string.

- _SAFE_URL_PATH_RE validates parsed.path *before* the fetch; the URL handed
  to requests.get() is rebuilt as f"https://{safe_host}{parsed.path}" where
  safe_host is pulled from the allowlist literal. Reject query/fragment/
  userinfo which have no place in a static .md fetch.
- _FILE_PATH_RE validates the source string *before* Path(source).resolve()
  and Path(source).exists() — the fullmatch regex sits on the data-flow
  edge into each Path() sink.
- Add a CodeQL job to ci.yml (python + js/ts, security-and-quality suite)
  so future SSRF/path-injection regressions fail CI instead of trickling
  in as post-merge alerts.
- Add scripts/security-scan.sh for local trivy + codeql runs mirroring CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): close CodeQL #63 + drop conflicting workflow CodeQL job

Two follow-ups to the previous hardening commit:

1. Alert #63 (py/path-injection, install_service.py:235)
   The `elif allow_file_source and _FILE_PATH_RE.fullmatch(source)
   and Path(source).exists()` guard still tripped the scanner because
   CodeQL doesn't thread the regex sanitiser through the compound
   boolean into the Path() call. Fix: dispatch by pure string suffix
   (`source.endswith(".md")`) — no Path() in install_agent() at all.
   All path construction happens inside _download_agent(), which already
   regex-validates before `.resolve()`.

2. The workflow-based `codeql` job conflicted with the repo's existing
   default-setup CodeQL ("CodeQL analyses from advanced configurations
   cannot be processed when the default setup is enabled"). Dropped the
   job and left a comment in ci.yml explaining why; default setup already
   runs the Analyze (python) / Analyze (js-ts) checks on every PR.

3. SECURITY.md — documented CodeQL coverage, the host allowlist behaviour
   (`CAO_PROFILE_ALLOWED_HOSTS`), and the scripts/security-scan.sh wrapper.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): reject `..` segments in _FILE_PATH_RE (closes CodeQL #61)

The previous regex used a character class that included `.` and `/`, so
`../../etc/passwd.md` matched and passed into `Path(source).resolve()`.
CodeQL was right to flag it — the sanitiser was weaker than advertised.

- Add a leading negative lookahead `(?!.*\.\.)` to the file-path regex so
  any `..` anywhere in the string rejects the source before Path() is
  constructed. Legitimate `./foo.md`, `/abs/foo.md`, `~/foo.md`, and
  `sub/dir/foo.md` all still work.
- Two new regression tests cover `../../etc/passwd.md` and embedded
  `/tmp/foo/../etc/passwd.md` traversal shapes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(install): move file-path handling out of install_service (closes CodeQL #64)

Earlier rounds kept `Path(user_input)` reachable inside `install_service` behind
a regex sanitiser. Every regex shape that still admitted a legitimate CLI path
like `./foo.md` also admitted `../../etc/passwd.md` without an unacceptable
normalise+prefix-check — so CodeQL kept correctly flagging the `.resolve()`
sink.

Structural fix: the shared service doesn't need a file-path branch at all.

- `install_service.install_agent()` now accepts only a bare profile name
  (`_PROFILE_NAME_RE`) or an https:// URL on the host allowlist.
- `cli/commands/install.py` grows a `_copy_local_profile_to_store()` helper
  that does the file reading, stem validation, and copy-into-store itself,
  then calls the service with the bare validated stem.
- `api/main.py` drops the `allow_file_source=False` kwarg — the parameter
  is gone; the service refuses filesystem paths for every caller.
- Tests: remove the file-path branches from the service suite, move that
  coverage to the CLI suite (`TestCopyLocalProfileToStore` + integration
  tests on file-source `cao install` invocations).

Full test suite (`test/ --ignore=test/e2e -m "not integration"`): 1581/1581
pass. End-to-end smoke of `cao install /tmp/smoke-agent.md --provider kiro_cli`
verified.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: black reformat test_install.py (extra blank line)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
haofeif added a commit that referenced this pull request May 5, 2026
…228)

Every filesystem lookup in _read_agent_profile_source now routes
through a new _safe_join helper that normalises with resolve() and
verifies containment via relative_to(). This closes the 8 CodeQL
py/path-injection alerts remaining after #226 (which hardened only
the install flow).

The taint source is the /agents/profiles/{name} HTTP endpoint added
by #166; the 8 sinks are the flat/nested/exists/read_text calls in
the provider-dirs and extra-dirs loops. The helper also handles
symlinks that resolve outside the configured store — something the
existing _validate_agent_name string check cannot see.

Rewrote 9 pre-existing tests that used Path mocks (which silently
bypassed the new guard) to use real tmp_path + monkeypatch so the
containment check is exercised end-to-end, plus one new test
asserting ../escaped is rejected.

Verified green:
- test/utils unit tests: 43/43 passed
- full unit suite: 1586 passed, 1 skipped
- e2e ClaudeCode: 12/12 passed
- e2e KiroCli: 11/11 passed
- e2e GeminiCli: 12/12 passed

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants