fix(tmux): filter environment to prevent 'command too long' errors#246
Conversation
Restrict env vars passed to tmux new-session to essential keys (HOME, PATH, SHELL, etc.) plus CAO_*, KIRO_*, and MISE_* prefixes. Block __MISE_ internal vars and skip any value >= 2048 bytes. This prevents 'new-session: command too long' failures in environments with many or large environment variables. Fixes awslabs#242
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #246 +/- ##
=======================================
Coverage ? 92.87%
=======================================
Files ? 65
Lines ? 5498
Branches ? 0
=======================================
Hits ? 5106
Misses ? 392
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR updates the tmux session creation logic to prevent tmux new-session: command too long failures by aggressively reducing the environment passed to libtmux.Server.new_session(), addressing issue #242.
Changes:
- Whitelist a small set of essential environment keys and allow only
CAO_*,KIRO_*, andMISE_*-prefixed variables (plus specificCLAUDE_CODE_*auth variables). - Block additional internal prefixes (
__MISE_) and drop environment variables with very large values (intended cutoff:>= 2048bytes). - Keep
CAO_TERMINAL_IDinjected into the tmux session environment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Only pass essential env vars to avoid tmux "command too long" | ||
| essential_keys = { | ||
| "HOME", | ||
| "PATH", | ||
| "SHELL", | ||
| "USER", | ||
| "LANG", | ||
| "TERM", | ||
| "SSH_AUTH_SOCK", | ||
| "DISPLAY", | ||
| "XDG_RUNTIME_DIR", | ||
| "AWS_PROFILE", | ||
| "AWS_REGION", | ||
| "AWS_DEFAULT_REGION", | ||
| "DO_NOT_TRACK", | ||
| } | ||
| environment = { | ||
| k: v | ||
| for k, v in os.environ.items() | ||
| if k in allowed_vars or not any(k.startswith(p) for p in blocked_prefixes) | ||
| if ( | ||
| k in essential_keys | ||
| or k in allowed_vars | ||
| or ( | ||
| not any(k.startswith(p) for p in blocked_prefixes) | ||
| and k.startswith(("CAO_", "KIRO_", "MISE_")) | ||
| ) | ||
| ) | ||
| and len(v) < 2048 | ||
| } |
haofeif
left a comment
There was a problem hiding this comment.
@gutosantos82 thanks for the quick fixes, added some of comments too
- Move len() check inside prefix branch only so essential keys (PATH)
are never dropped regardless of size
- Use len(v.encode('utf-8')) for byte-accurate comparison matching tmux's
actual argument limit
- Add LC_ALL and LC_CTYPE to essential_keys for proper locale propagation
- Replace explicit AWS_PROFILE/AWS_REGION/AWS_DEFAULT_REGION with AWS_
prefix to pass all AWS env vars through
- Add unit tests for env filtering logic
haofeif
left a comment
There was a problem hiding this comment.
Thanks @gutosantos82 LGTM
* feat(launch): forward env vars to supervisor and child agents (#248) `cao launch --env KEY=VALUE` (repeatable) now reaches both the supervisor terminal and every worker spawned later in the same session via `assign` / `handoff` / the web UI. Previously the strict tmux env allowlist (PR #246) silently dropped anything outside `CAO_/KIRO_/MISE_/AWS_` prefixes, so operators could not forward arbitrary deployment context (e.g. `MNEMOSYNE_DIR`, `ISAAC_CHANNEL=room:engineering`) to their agents. Wiring: CLI parses --env at the boundary, rejecting bad keys (POSIX names only), blocked prefixes (`CLAUDE`/`CODEX_`/`__MISE_`, with the six `CLAUDE_CODE_USE_*` / `CLAUDE_CODE_SKIP_*` auth flags explicitly allowlisted), and >=2048-byte values. Values travel in the JSON body of `POST /sessions`, not the URL, so secrets stay out of access logs. Server persists them in a per-session in-memory store; `create_window` reads from it on every spawn so the fanout to workers is automatic. `delete_session` drops the mapping. `TmuxClient._merge_extra_env` mirrors the same blocked-prefix / size-cap checks as a defensive second layer for callers that bypass the CLI (cao-mcp-server, direct HTTP). Tests cover the parser branches, the body-not-URL shape, fail-fast on blocked prefixes, the per-session store roundtrip, and the merge helper's prefix/cap behaviour. * fix(launch): persist forwarded env only after tmux create succeeds If tmux session creation failed, the forwarded env mapping was already stored and would leak in memory plus risk being inherited by a later session that reused the same name with no --env. Now we clear any stale mapping up front, persist only after create_session returns, and clear again in the exception cleanup path when we kill the session. Also reword a misleading test comment: the blocked-prefix allowlist matches exact keys, not arbitrary prefixes. * test(api): include env_vars in create_session assertion Pre-existing test_create_session_success used assert_called_once_with which is strict — adding the optional env_vars kwarg in #248 made the expected call signature diverge by one item even when the value is None.
Summary
Restrict env vars passed to tmux
new-sessionto essential keys (HOME,PATH,SHELL, etc.) plusCAO_*,KIRO_*, andMISE_*prefixes. Block__MISE_internal vars and skip any value >= 2048 bytes.Problem
cao launchfails withnew-session: command too longwhen the host environment contains many or large environment variables (e.g., mise/asdf internals, large system prompts injected via env).Fix
__MISE_to blocked prefixesCAO_*/KIRO_*/MISE_*prefixesTesting
test/clients/test_tmux_client.pytests passFixes #242