
Change codex command option to '--yolo'#105

Closed
reyoung wants to merge 3 commits into awslabs:main from reyoung:main

Conversation

@reyoung

@reyoung reyoung commented Mar 10, 2026


Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copilot AI review requested due to automatic review settings March 10, 2026 03:48

Copilot AI left a comment

Contributor


Pull request overview

This PR updates the _build_codex_command method in the Codex provider to include the --yolo flag when constructing the codex CLI command.

Changes:

  • Added --yolo flag to the codex command construction


@haofeif

haofeif commented Mar 10, 2026

Contributor

Thanks for the awesome PR @reyoung. This is a valid gap. Claude Code already hardcodes --dangerously-skip-permissions and Gemini CLI hardcodes --yolo for the same reason: approval prompts block handoff/assign flows in non-interactive tmux sessions. Codex was the odd one out.

A couple asks before we can merge:

  1. Add a comment explaining why --yolo is needed, similar to the existing comment in claude_code.py (lines 63-66):
# --yolo (alias for --dangerously-bypass-approvals-and-sandbox):
# bypass approval prompts and sandboxing. CAO agents run in
# non-interactive tmux sessions where interactive approval prompts
# block handoff/assign flows. This mirrors Claude Code's
# --dangerously-skip-permissions and Gemini CLI's --yolo flags.
  2. Update the unit tests. Five assertions in test/providers/test_codex_provider_unit.py check the exact base command string "codex --no-alt-screen --disable shell_snapshot" and will fail once --yolo is added (probably why the unit tests failed). They need to be updated to include --yolo:

    • Line 39: "codex --no-alt-screen --disable shell_snapshot" (initialize test)
    • Line 71: assert command == "codex --no-alt-screen --disable shell_snapshot" (no profile)
    • Line 84: assert "codex --no-alt-screen --disable shell_snapshot" in command (with profile)
    • Line 196: assert command == "codex --no-alt-screen --disable shell_snapshot" (empty prompt)
    • Line 209: assert command == "codex --no-alt-screen --disable shell_snapshot" (None prompt)
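A minimal sketch of the two asks above, assuming the flag is placed directly after `codex`. The function name `build_codex_command`, the `extra_args` parameter, and the flag position are illustrative assumptions, not the provider's actual `_build_codex_command` code; only the base flags and the comment text come from this thread.

```python
import shlex

# Base flags taken from the assertions listed above; the position of
# --yolo within them is an assumption made for this sketch.
BASE_ARGS = ["codex", "--yolo", "--no-alt-screen", "--disable", "shell_snapshot"]


def build_codex_command(extra_args=None):
    """Hypothetical stand-in for the provider's _build_codex_command."""
    # --yolo (alias for --dangerously-bypass-approvals-and-sandbox):
    # bypass approval prompts and sandboxing. CAO agents run in
    # non-interactive tmux sessions where interactive approval prompts
    # block handoff/assign flows. This mirrors Claude Code's
    # --dangerously-skip-permissions and Gemini CLI's --yolo flags.
    args = list(BASE_ARGS)
    if extra_args:
        # Hypothetical: any profile- or prompt-specific arguments go last.
        args.extend(extra_args)
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(args)


def test_build_command_includes_yolo():
    # Shape of the updated exact-match assertion (no profile case).
    command = build_codex_command()
    assert command == "codex --yolo --no-alt-screen --disable shell_snapshot"
```

With the flag in this position the old base string is no longer a substring of the command, so both the equality checks and the `in` check would need the new string.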

@haofeif haofeif requested review from a team and haofeif March 10, 2026 06:51
@haofeif haofeif added the bug Something isn't working label Mar 10, 2026

@haofeif haofeif left a comment

Contributor


Please address the PR feedback, many thanks for all your help!

Copilot AI review requested due to automatic review settings March 13, 2026 10:43

Copilot AI left a comment

Contributor


Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

@haofeif

haofeif commented Mar 17, 2026

Contributor

Please address the PR feedback, many thanks for all your help!

Hi @reyoung, have you had a chance to address the feedback? Many thanks.

haofeif added a commit that referenced this pull request Mar 21, 2026
Merges the intent of PR #105. CAO agents run in non-interactive tmux
sessions where Codex's interactive approval prompts block handoff/assign
flows. The --yolo flag (alias for --dangerously-bypass-approvals-and-sandbox)
mirrors Claude Code's --dangerously-skip-permissions and Gemini CLI's
--yolo flags.

E2E results: 5/8 passed. 3 failures are pre-existing Codex model behavior
issues (timeouts, supervisor not calling MCP tools) unrelated to this change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
haofeif added a commit that referenced this pull request Mar 31, 2026
#125)

* feat(security): add allowedTools — universal tool restriction across all providers

Add role-based tool restrictions that translate CAO's unified tool vocabulary
(execute_bash, fs_read, fs_write, fs_*, @cao-mcp-server) to each provider's
native enforcement mechanism:

- Q CLI / Kiro CLI: allowedTools in agent JSON (install time)
- Claude Code: --disallowedTools flags alongside --dangerously-skip-permissions
- Copilot CLI: --deny-tool flags override --allow-all
- Gemini CLI: Policy Engine TOML deny rules in ~/.gemini/policies/
- Kimi CLI / Codex: Security system prompt (soft enforcement)
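As a rough illustration of the translation described above, here is a sketch for the Claude Code case only. The mapping table, the function names, and the one-flag-per-tool layout are assumptions made for this example; they are not taken from utils/tool_mapping.py. `@cao-mcp-server` is skipped because it is not translated to native flags.

```python
# Hypothetical mapping from CAO's tool vocabulary to Claude Code tool names.
CAO_TO_NATIVE = {
    "execute_bash": {"Bash"},
    "fs_read": {"Read"},
    "fs_write": {"Write", "Edit"},
}


def denied_native_tools(allowed_tools):
    """Return the native tool names that are NOT covered by allowed_tools."""
    if "*" in allowed_tools:
        return []  # unrestricted: nothing to deny
    allowed = set()
    for tool in allowed_tools:
        if tool == "fs_*":
            # Wildcard: expand to every filesystem entry in the table.
            allowed |= CAO_TO_NATIVE["fs_read"] | CAO_TO_NATIVE["fs_write"]
        else:
            # Unknown entries such as "@cao-mcp-server" map to nothing.
            allowed |= CAO_TO_NATIVE.get(tool, set())
    every_tool = set().union(*CAO_TO_NATIVE.values())
    return sorted(every_tool - allowed)


def disallowed_flags(allowed_tools):
    """Build deny flags; emitting one flag per tool is an assumption."""
    flags = []
    for name in denied_native_tools(allowed_tools):
        flags += ["--disallowedTools", name]
    return flags
```

For example, `disallowed_flags(["fs_*", "@cao-mcp-server"])` denies only the shell tool in this sketch, while `["*"]` produces no deny flags at all.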

Key changes:
- AgentProfile: add role field (supervisor/developer/reviewer)
- Constants: ROLE_TOOL_DEFAULTS with per-role defaults
- launch.py: --allowed-tools and --yolo CLI flags, confirmation prompt
- New utils/tool_mapping.py: CAO-to-native tool name translation
- All 7 providers: native restriction flags in command builders
- Database: allowed_tools column for cross-provider inheritance
- MCP server: allowed_tools inheritance for handoff/assign
- Built-in profiles: role + allowedTools frontmatter + security constraints
- SECURITY.md: full documentation of tool restriction system
- E2E tests: file-based bash execution proof across all providers

Gemini CLI enforcement uses Policy Engine deny rules (not deprecated
excludeTools) which work even in --yolo mode by completely excluding
denied tools from the model's memory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add tool restrictions documentation

- README.md: add Tool Restrictions section with role defaults, usage
  examples, and provider enforcement table
- docs/tool-restrictions.md: comprehensive guide covering unified CLI
  (--allowed-tools, --yolo), role-based defaults, CAO tool vocabulary,
  resolution order, per-provider behavior matrix, cross-provider
  inheritance, and security recommendations
- docs/agent-profile.md: add role field, allowedTools documentation,
  tool vocabulary reference, and resolution order
- docs/gemini-cli.md: add Tool Restrictions section explaining Policy
  Engine TOML deny rules and why excludeTools was replaced

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply black formatting to pass CI code quality checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix isort import ordering in test_allowed_tools.py

Move `from pathlib import Path` to stdlib import group.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(codex): add --yolo flag to bypass approval prompts in tmux sessions

Merges the intent of PR #105. CAO agents run in non-interactive tmux
sessions where Codex's interactive approval prompts block handoff/assign
flows. The --yolo flag (alias for --dangerously-bypass-approvals-and-sandbox)
mirrors Claude Code's --dangerously-skip-permissions and Gemini CLI's
--yolo flags.

E2E results: 5/8 passed. 3 failures are pre-existing Codex model behavior
issues (timeouts, supervisor not calling MCP tools) unrelated to this change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): pre-warm uvx cache to prevent MCP startup timeouts

Agent profiles launch cao-mcp-server via uvx --from git+..., which
downloads ~80 packages on a cold cache (~20s). This exceeds Codex's
default 10s MCP startup timeout, causing flaky test failures. Add a
session-scoped fixture that runs uvx once before tests to populate
the cache (<3s on subsequent runs).
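The pre-warm step can be sketched as a plain helper that runs uvx once and reports whether it succeeded. The function name, its parameters, and the injectable `runner` are assumptions for illustration; the real source URL and entry-point arguments are not given here, so the caller supplies them.

```python
import subprocess


def prewarm_uvx_cache(source, entry_point, extra_args=(),
                      runner=subprocess.run, timeout=120):
    """Run `uvx --from <source> <entry_point>` once to populate the cache.

    Returns True if the command exited with status 0, False otherwise.
    A timeout or a missing uvx binary also counts as a failure here.
    """
    cmd = ["uvx", "--from", source, entry_point, *extra_args]
    try:
        result = runner(cmd, capture_output=True, timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

In a test suite this helper would typically be called from a fixture declared with `@pytest.fixture(scope="session", autouse=True)` so that it runs once before any test starts an agent.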

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tool-restrictions): redesign to opt-in roles + allowedTools system

- Role is a high-level abstraction (named bundle of allowedTools)
- allowedTools is the low-level fine-grained control
- No role + no allowedTools = unrestricted (backward compatible)
- Support custom roles via settings.json "roles" key
- Fix supervisor defaults: add fs_read and fs_list
- Remove DEFAULT_ROLE (no role = unrestricted, not developer)
- Rewrite docs/tool-restrictions.md with clear concept hierarchy
- Update tests for new defaults and opt-in behavior
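The opt-in behavior listed above can be sketched as a small resolver. The precedence (explicit allowedTools, then role, then unrestricted) and the developer/reviewer bundles are assumptions for this example; only the supervisor entries fs_read and fs_list and the "no role + no allowedTools = unrestricted" rule come from the commit message.

```python
# Hypothetical built-in role bundles. Only the supervisor's fs_read and
# fs_list entries are confirmed by the commit message above.
BUILTIN_ROLES = {
    "supervisor": ["@cao-mcp-server", "fs_read", "fs_list"],
    "developer": ["@cao-mcp-server", "fs_*", "execute_bash"],
    "reviewer": ["@cao-mcp-server", "fs_read", "fs_list"],
}


def resolve_allowed_tools(role=None, allowed_tools=None, custom_roles=None):
    """Resolve the effective tool list for a profile.

    Assumed precedence: explicit allowedTools, then role, then unrestricted.
    """
    if allowed_tools is not None:
        return list(allowed_tools)  # fine-grained control wins
    if role is None:
        return ["*"]  # no role + no allowedTools = unrestricted
    # Custom roles (e.g. from a settings.json "roles" key) extend or
    # override the built-in bundles.
    roles = {**BUILTIN_ROLES, **(custom_roles or {})}
    if role not in roles:
        raise ValueError(f"unknown role: {role}")
    return list(roles[role])
```

A profile with neither field resolves to `["*"]` in this sketch, and a custom role such as `{"auditor": ["fs_read"]}` can be passed through `custom_roles`.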

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tool-restrictions): improve launch prompts, default to developer, deduplicate docs

- Change default when no role/allowedTools set from unrestricted ["*"] to
  developer defaults — secure by default while preserving backward compat
- Add --yolo warning prompt (visible but non-blocking)
- Add "To grant all permissions, re-run with --yolo" hint to normal prompt
- Add "no role set" reminder with docs link when profile lacks role/allowedTools
- Change confirmation text from "Do you trust all the actions in this folder?"
  to "Proceed?" — clearer that it confirms the shown restrictions, not blanket trust
- Deduplicate agent-profile.md: replace duplicated roles/vocabulary/resolution
  tables with brief summary linking to tool-restrictions.md
- Add allowedTools-only example (no role needed) to tool-restrictions.md
- Add "Launch Confirmation Prompt" section with Confirmation vs --yolo comparison
- Add "Example Profiles" section linking to examples directory
- Fix stale supervisor defaults in README.md (add fs_read, fs_list)
- Add "Tool restrictions" bullet to Key Features in README.md
- Add role/allowedTools to all example profiles missing them
- Add Codex allowed_tools e2e tests with xfail for soft enforcement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): fix black formatting and update launch prompt assertions

- Run black on launch.py to fix formatting
- Update test_launch.py assertions: old "Do you trust all the actions
  in this folder?" → new "Proceed?" prompt text
- Add WARNING assertion for --yolo test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(tool-restrictions): clarify @cao-mcp-server is pass-through, not enforced

Address PR #125 feedback: make explicit that @cao-mcp-server in
allowedTools is a declarative marker, not translated to native
enforcement flags. MCP tools remain always available. Also add
WebFetch reference link.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(examples): use role only, remove redundant allowedTools

Replace dual role + allowedTools with role-only in all 14 example
profiles. Add inline comment showing the role's default permissions
and pointing to docs/tool-restrictions.md for fine-grained control.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(agent_store): use role only, remove redundant allowedTools

Same cleanup as examples/ — built-in profiles now use role-only with
inline comment showing default permissions. Also fixes stale supervisor
defaults (was missing fs_read, fs_list).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(README): simplify Tool Restrictions section, link to full reference

Replace detailed tables and examples with a brief summary and link
to docs/tool-restrictions.md for the comprehensive reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@haofeif

haofeif commented Mar 31, 2026

Contributor

closing the PR as it is included in #125

@haofeif haofeif closed this Mar 31, 2026


3 participants