feat(launch): forward env vars to supervisor and child agents #259

Merged
haofeif merged 4 commits into awslabs:main from call-me-ram:fix/issue-248-env-forwarding
May 26, 2026

Conversation

@call-me-ram call-me-ram commented May 24, 2026

Contributor

Fixes #248.

Summary

  • cao launch --env KEY=VALUE (repeatable) now forwards to both the supervisor terminal AND every worker spawned later in the same session via assign / handoff / the web UI.
  • Validation runs at the CLI boundary (POSIX-name keys, blocked prefixes CLAUDE/CODEX_/__MISE_ with the six CLAUDE_CODE_USE_* / CLAUDE_CODE_SKIP_* auth flags allowlisted, 2048-byte value cap) so a bad entry fails fast instead of getting silently dropped server-side.
  • Values travel in the JSON body of POST /sessions, not the URL — keeps anything resembling a secret out of cao-server's HTTP access log.
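
The validation rules listed above could be sketched roughly as follows. This is a minimal illustration, not the PR's `_parse_env_pairs`: the function name `parse_env_pairs`, the constant names, and the `allowed_exact` parameter are assumptions, and the six allowlisted flag names are not spelled out in this description, so the allowlist is left as an input.

```python
import re

# Assumed constant names; only the rules themselves come from the PR description.
KEY_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
BLOCKED_PREFIXES = ("CLAUDE", "CODEX_", "__MISE_")
MAX_VALUE_BYTES = 2048


def parse_env_pairs(pairs, allowed_exact=frozenset()):
    """Turn ["KEY=VALUE", ...] into a dict, raising ValueError on the first bad entry."""
    env = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"--env expects KEY=VALUE (got {pair!r})")
        if not KEY_PATTERN.fullmatch(key):
            raise ValueError(f"--env key must match [A-Za-z_][A-Za-z0-9_]* (got {key!r})")
        # Allowlisted keys are compared exactly, not by prefix.
        if key not in allowed_exact and key.startswith(BLOCKED_PREFIXES):
            raise ValueError(f"--env key {key!r} uses a blocked prefix")
        # The description rejects values of 2048 bytes or more.
        if len(value.encode("utf-8")) >= MAX_VALUE_BYTES:
            raise ValueError(f"--env value for {key!r} exceeds {MAX_VALUE_BYTES} bytes")
        env[key] = value
    return env
```

Raising on the first bad entry is what makes the failure visible at the CLI instead of being silently dropped later.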

Why this matters

The strict tmux env allowlist from #246 keeps the tmux new-session -e argv under the kernel limit but also drops anything outside CAO_/KIRO_/MISE_/AWS_ prefixes. Operators reported that arbitrary deployment context they wanted on their agents (e.g. MNEMOSYNE_DIR, ISAAC_CHANNEL=room:engineering) was silently disappearing. --env is the explicit opt-out for that allowlist on a per-session basis.

The naive approach — only honour --env on the first agent — would have broken the multi-agent case where a supervisor's assign-spawned analysts still need the same context. This PR persists the mapping on a per-session in-memory record so create_window picks it up on every spawn automatically.

Architecture

  • services/session_env.py: small thread-safe in-memory store (set / get / clear), held in cao-server's process. No schema migration; restarts wipe it.
  • clients/tmux.py: create_session and create_window accept extra_env. A new _merge_extra_env classmethod mirrors the same blocked-prefix / size-cap checks as the inherited-env filter, defending callers that bypass the CLI (cao-mcp-server, direct HTTP). Pre-existing constants are extracted onto the class so both filters share them.
  • services/terminal_service.py: create_terminal writes env_vars on new_session=True and looks up the persisted mapping on every spawn.
  • services/session_service.py: create_session accepts env_vars; delete_session clears the mapping.
  • api/main.py: POST /sessions accepts env_vars as an embedded JSON body field ({"env_vars": {...}}). Optional, so existing callers that send only query params remain compatible.
  • cli/commands/launch.py: --env KEY=VALUE Click option and _parse_env_pairs validator; sends a body only when at least one --env was provided.
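
For readers unfamiliar with the pattern, a thread-safe per-session store with set / get / clear might look like the sketch below. The class name `SessionEnvStore` and its internals are assumptions for illustration; the actual services/session_env.py may be organised differently (for example as module-level functions).

```python
import threading
from typing import Dict, Mapping


class SessionEnvStore:
    """Process-local mapping of session name -> forwarded env vars (hypothetical sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_session: Dict[str, Dict[str, str]] = {}

    def set(self, session_name: str, env_vars: Mapping[str, str]) -> None:
        # Store a copy so later mutation by the caller cannot change the record.
        with self._lock:
            self._by_session[session_name] = dict(env_vars)

    def get(self, session_name: str) -> Dict[str, str]:
        # Return a copy; an unknown session yields an empty mapping.
        with self._lock:
            return dict(self._by_session.get(session_name, {}))

    def clear(self, session_name: str) -> None:
        # Clearing an unknown session is a no-op.
        with self._lock:
            self._by_session.pop(session_name, None)
```

Because the data lives only in this object, nothing survives a process restart, which matches the "no schema migration; restarts wipe it" trade-off.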

Verification

$ cao launch --agents code_supervisor \
    --env MNEMOSYNE_DIR=/root/mnemosyne \
    --env ISAAC_CHANNEL=room:engineering
# supervisor terminal sees both vars ✓
# assign(data_analyst) → worker sees both vars ✓
# handoff(report_gen)  → worker sees both vars ✓

Bad inputs rejected at the CLI:

  • --env CLAUDE_SESSION_ID=abc → --env key 'CLAUDE_SESSION_ID' uses a blocked prefix...
  • --env 1FOO=bar → --env key must match [A-Za-z_][A-Za-z0-9_]* (got '1FOO')
  • --env BIG=... (≥2048 bytes) → --env value for 'BIG' exceeds 2048 bytes (tmux argv limit, PR #246)

Test plan

  • Run unit tests (uv run pytest test/cli/commands/test_launch.py test/services/test_session_env.py test/clients/test_tmux_merge_extra_env.py) — 58 passed locally.
  • Black + isort clean on touched files.
  • Manual launch with --env, verify the var appears in the supervisor's shell (env | grep … inside tmux attach).
  • Manual assign from that supervisor to a worker; verify the worker's shell also sees the var.
  • Manual launch with no --env to confirm POST /sessions still sends no body (regression guard for backward-compat callers).
  • Verify a deliberately blocked prefix (--env CLAUDE_SESSION_ID=abc) errors at the CLI before any HTTP call is made.
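
The body-only-when-needed behaviour exercised by the test plan can be illustrated with a small sketch. The helper name `build_session_request` and the `agent` query parameter are hypothetical; the `params` / `json` keys mirror the keyword arguments accepted by common HTTP clients such as `requests.post`, which this sketch assumes without claiming it is what the CLI uses.

```python
from typing import Any, Dict, Mapping, Optional


def build_session_request(
    params: Mapping[str, str], env_vars: Optional[Mapping[str, str]]
) -> Dict[str, Any]:
    """Build keyword arguments for a POST /sessions call (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"params": dict(params)}
    # Only attach a JSON body when at least one --env pair was supplied,
    # so callers that never use --env keep sending query params alone.
    if env_vars:
        kwargs["json"] = {"env_vars": dict(env_vars)}
    return kwargs


# Values appear only under "json", never under "params" (the URL).
with_env = build_session_request({"agent": "code_supervisor"}, {"MNEMOSYNE_DIR": "/root/mnemosyne"})
without_env = build_session_request({"agent": "code_supervisor"}, None)
```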

Notes

Forwarded vars are process-local on cao-server and dropped on session delete; restarting cao-server wipes them. No on-disk format, no migration, and the issue explicitly accepts that scope.

…s#248)

`cao launch --env KEY=VALUE` (repeatable) now reaches both the supervisor
terminal and every worker spawned later in the same session via
`assign` / `handoff` / the web UI. Previously the strict tmux env
allowlist (PR awslabs#246) silently dropped anything outside
`CAO_/KIRO_/MISE_/AWS_` prefixes, so operators could not forward
arbitrary deployment context (e.g. `MNEMOSYNE_DIR`,
`ISAAC_CHANNEL=room:engineering`) to their agents.

Wiring:

  CLI parses --env at the boundary, rejecting bad keys (POSIX names
  only), blocked prefixes (`CLAUDE`/`CODEX_`/`__MISE_`, with the six
  `CLAUDE_CODE_USE_*` / `CLAUDE_CODE_SKIP_*` auth flags explicitly
  allowlisted), and >=2048-byte values. Values travel in the JSON body
  of `POST /sessions`, not the URL, so secrets stay out of access logs.
  Server persists them in a per-session in-memory store; `create_window`
  reads from it on every spawn so the fanout to workers is automatic.
  `delete_session` drops the mapping. `TmuxClient._merge_extra_env`
  mirrors the same blocked-prefix / size-cap checks as a defensive
  second layer for callers that bypass the CLI (cao-mcp-server, direct
  HTTP).
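
  A rough sketch of such a defensive second layer is shown below. It is not
  the real `TmuxClient._merge_extra_env`: the function name, its parameters,
  the choice to drop bad entries instead of raising, and the choice to let
  forwarded values override inherited ones are all assumptions.

```python
from typing import Dict, FrozenSet, Mapping, Optional, Tuple


def merge_extra_env(
    base_env: Mapping[str, str],
    extra_env: Optional[Mapping[str, str]],
    blocked_prefixes: Tuple[str, ...] = ("CLAUDE", "CODEX_", "__MISE_"),
    allowed_exact: FrozenSet[str] = frozenset(),
    max_value_bytes: int = 2048,
) -> Dict[str, str]:
    """Merge forwarded vars into the filtered env, skipping entries that fail the checks."""
    merged = dict(base_env)
    for key, value in (extra_env or {}).items():
        # Same blocked-prefix rule as the CLI, with exact-match exceptions.
        if key not in allowed_exact and key.startswith(blocked_prefixes):
            continue
        # Same size cap as the CLI: values of max_value_bytes or more are skipped.
        if len(value.encode("utf-8")) >= max_value_bytes:
            continue
        merged[key] = value  # assumption: forwarded values win over inherited ones
    return merged
```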

Tests cover the parser branches, the body-not-URL shape, fail-fast on
blocked prefixes, the per-session store roundtrip, and the merge
helper's prefix/cap behaviour.

Copilot AI left a comment

Contributor

Pull request overview

This PR adds explicit forwarding of operator-provided environment variables from cao launch --env KEY=VALUE into the tmux supervisor and all subsequently spawned worker agents within the same session, routing values via the POST /sessions JSON body to avoid leaking them in URL/query logs.

Changes:

  • Add --env KEY=VALUE (repeatable) to cao launch, including CLI-side validation and sending env_vars in the request body only when provided.
  • Persist per-session forwarded env vars in an in-memory server-side store and apply them automatically on later worker spawns.
  • Extend TmuxClient session/window creation to accept extra_env with server-side defensive filtering; add tests and documentation.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/cli_agent_orchestrator/cli/commands/launch.py | Adds --env option, validation, and conditional JSON body for POST /sessions. |
| src/cli_agent_orchestrator/api/main.py | Accepts optional env_vars in the embedded JSON body for session creation. |
| src/cli_agent_orchestrator/services/session_env.py | New in-memory per-session forwarded-env store. |
| src/cli_agent_orchestrator/services/session_service.py | Threads env_vars into session creation and clears stored mapping on delete. |
| src/cli_agent_orchestrator/services/terminal_service.py | Persists forwarded env on session creation and injects stored env on window spawns. |
| src/cli_agent_orchestrator/clients/tmux.py | Adds extra_env support to tmux session/window creation and merges with safety checks. |
| test/cli/commands/test_launch.py | Adds unit tests for --env parsing and request-body behavior. |
| test/services/test_session_env.py | Adds tests for per-session env store behavior. |
| test/clients/test_tmux_merge_extra_env.py | Adds tests for server-side merging/filtering of forwarded env. |
| docs/tmux.md | Documents env forwarding behavior, constraints, and lifecycle. |

Comment thread src/cli_agent_orchestrator/services/terminal_service.py Outdated
Comment thread test/clients/test_tmux_merge_extra_env.py Outdated
If tmux session creation failed, the forwarded env mapping was already
stored and would leak in memory plus risk being inherited by a later
session that reused the same name with no --env. Now we clear any
stale mapping up front, persist only after create_session returns, and
clear again in the exception cleanup path when we kill the session.
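
The ordering described in this fix can be sketched as follows. This is an
illustration only: the function name, the plain-dict store, and the injected
`create_session` / `finish_setup` / `kill_session` callables are hypothetical
stand-ins, not the real terminal_service code.

```python
from typing import Callable, Dict, Mapping, Optional


def create_terminal_sketch(
    name: str,
    env_vars: Optional[Mapping[str, str]],
    store: Dict[str, Dict[str, str]],
    create_session: Callable[..., None],
    finish_setup: Callable[[str], None],
    kill_session: Callable[[str], None],
) -> None:
    # 1. Clear any stale mapping left by an earlier session with the same name.
    store.pop(name, None)
    session_created = False
    try:
        create_session(name, extra_env=env_vars)
        session_created = True
        # 2. Persist only after the tmux session actually exists.
        if env_vars:
            store[name] = dict(env_vars)
        finish_setup(name)
    except Exception:
        # 3. On failure, kill the session (if it was created) and clear again.
        if session_created:
            kill_session(name)
        store.pop(name, None)
        raise
```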

Also reword a misleading test comment: the blocked-prefix allowlist
matches exact keys, not arbitrary prefixes.

The pre-existing test_create_session_success used assert_called_once_with,
which is strict: adding the optional env_vars kwarg in awslabs#248 made the
expected call signature diverge by one item even when the value is None.
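
The strictness in question is standard `unittest.mock` behaviour and can be
reproduced in isolation. The mock name and arguments below are illustrative,
not taken from the real test.

```python
from unittest.mock import MagicMock

# Stand-in for the mocked dependency; "demo" is an arbitrary session name.
create_session = MagicMock()
create_session("demo", env_vars=None)

try:
    # Fails: the recorded call carries an extra keyword argument,
    # even though its value is None.
    create_session.assert_called_once_with("demo")
    old_assertion_passed = True
except AssertionError:
    old_assertion_passed = False

# Passes once the expected signature lists the new kwarg explicitly.
create_session.assert_called_once_with("demo", env_vars=None)
```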
@call-me-ram call-me-ram force-pushed the fix/issue-248-env-forwarding branch from 73bfb0b to 1d89253 Compare May 26, 2026 13:14

@haofeif haofeif left a comment

Contributor

LGTM

@haofeif haofeif merged commit 7906e83 into awslabs:main May 26, 2026
8 checks passed
codecov-commenter commented May 26, 2026


Codecov Report

❌ Patch coverage is 98.57143% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@45898e8). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...li_agent_orchestrator/services/terminal_service.py 83.33% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #259   +/-   ##
=======================================
  Coverage        ?   92.46%           
=======================================
  Files           ?       69           
  Lines           ?     6260           
  Branches        ?        0           
=======================================
  Hits            ?     5788           
  Misses          ?      472           
  Partials        ?        0           
Flag Coverage Δ
unittests 92.46% <98.57%> (?)

call-me-ram added a commit to call-me-ram/cli-agent-orchestrator that referenced this pull request Jun 3, 2026
Bring the event-driven architecture branch up to date with main (98
commits) and reconcile the rewrite with features that landed after it
forked: eager inbox delivery (awslabs#251), the OpenCode poller, env-var
forwarding (awslabs#259), memory curation (awslabs#254/awslabs#262), CORS auto-derive (awslabs#261),
DNS host validation (awslabs#124), and the self-send guard (awslabs#24).

Highlights:
- Providers adopt the async initialize() + get_status(buffer) contract;
  copilot_cli/opencode_cli converted; kiro keeps colour-only ANSI
  stripping so carriage-return-redraw permission prompts aren't misread
  as idle.
- Event-driven InboxService.deliver_pending with the awslabs#251 eager gate and
  message-sender attribution; OpenCode poller retained as a status-driven
  method; the watchdog (PollingObserver/LogFileHandler) is removed.
- terminal_service.create_terminal is async (FIFO + StatusMonitor wiring);
  session_service.create_session, flow_service.execute_flow, the API
  endpoints, and `cao flow run` updated to await.
- memory_service curated path and the flow CLI fixed to the new contract.

Full unit suite green (1908 passed); black + isort clean.

Development

Successfully merging this pull request may close these issues.

feat: env var forwarding to spawned agent tmux sessions

4 participants