Skip to content

feat: distinguish Guardian denials and soft denials#26231

Open
viyatb-oai wants to merge 1 commit into
mainfrom
codex/viyatb/guardian-denial-recovery
Open

feat: distinguish Guardian denials and soft denials#26231
viyatb-oai wants to merge 1 commit into
mainfrom
codex/viyatb/guardian-denial-recovery

Conversation

@viyatb-oai

@viyatb-oai viyatb-oai commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Why

Guardian currently emits one denied outcome for two materially different decisions:

  • the action should be blocked until an informed user explicitly approves it
  • the action must remain blocked even after user confirmation

Without a stable distinction, clients must either offer recovery too broadly, which can bypass policy, or never offer recovery, which strands legitimate actions that only need explicit confirmation.

The same boundary must hold for delegated Tasks and connector or MCP calls. Delegation carries the user's existing intent, but it does not authorize new data export, unrelated connector access, destructive side effects, or persistent policy changes.

Behavior

Denial kind Client behavior
soft Block the action now. An informed user may approve one retry of the exact reviewed action.
denial Keep the action blocked. Do not offer an approval path; proceed only with a materially safer alternative or ask the user for guidance.

An absent denialKind means the review did not produce a policy-denial classification, such as an approval, an in-progress or aborted review, or an operational review failure.

What changed

  • Add GuardianDenialKind with soft and denial wire values.
  • Require Guardian to classify denied assessments, while safely interpreting older responses without the field as denial for critical risk and soft otherwise.
  • Record soft-denied actions by review ID and structured action, then accept only one explicit retry whose action matches exactly.
  • Preserve denialKind through core events, rollout and thread history, app-server v2 notifications, generated JSON schemas, and TypeScript bindings.
  • Return distinct guidance for denials so the agent does not retry through an indirect workaround.
  • Clarify Guardian policy for Tasks, sub-agents, connectors, MCP tools, and data egress.

Scope

This PR provides the policy and protocol foundation. The surrounding product work remains independently reviewable:

Validation

Coverage verifies:

  • only soft denials are eligible for explicit retry
  • retry eligibility is exact-action scoped and consumed after one claim
  • critical denied assessments default to denial
  • denial metadata survives protocol, app-server, history, and TUI conversion boundaries

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

content: vec![ContentItem::InputText { text }],

P0 Badge Cap approved Guardian action context

When a user approves a soft-denied action whose command/action payload is large, this injects the serialized event.action back into model context as an unrestricted developer message. The stored action comes from the original tool request rather than the truncated Guardian prompt, so a long command can create a new >1k or even >10K-token context item; the repository's Model visible context rules require hard caps and ContextualUserFragment-backed injected fragments (AGENTS.md lines 87-90).

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +617 to +622
super::GuardianAssessmentOutcome::Deny => Some(parsed_payload.denial_kind.unwrap_or(
if risk_level == GuardianRiskLevel::Critical {
GuardianDenialKind::Denial
} else {
GuardianDenialKind::Soft
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fail closed when denial kind is omitted

If the reviewer returns a terse deny payload such as {"outcome":"deny"} for a critical action, this fallback first defaults the missing risk to High and then marks the denial as Soft, making it eligible for /approve even though critical/secret-exfiltration denials are supposed to be non-overridable. Since the schema only requires outcome, missing denial_kind on a deny should either be rejected/failed closed or default to a hard denial rather than becoming user-overridable.

Useful? React with 👍 / 👎.

Comment thread codex-rs/core/src/guardian/review.rs Outdated
Comment on lines +686 to +687
if guardian_assessment_allows_session_reuse(&assessment) {
ReviewDecision::ApprovedForSession

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep one-retry approvals out of the session cache

When a user approves a soft denial, the retry authorization is only represented as transcript context and the UI/injected text promise a one-time retry; if Guardian then returns a low-risk authorized allow, this branch turns it into ApprovedForSession, and the new shell/unified_exec with_cached_approval wrapper stores that decision for all future identical commands in the session. That silently expands a one-use override into a session-wide approval, so explicit-denial retries should return Approved or otherwise bypass approval-cache writes.

Useful? React with 👍 / 👎.

#[serde(rename = "user")]
User,
#[serde(rename = "guardian_subagent", alias = "auto_review")]
#[serde(rename = "auto_review", alias = "guardian_subagent")]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve serialized approvals reviewer compatibility

This changes app-server v2 responses that serialize ApprovalsReviewer::AutoReview from the existing "guardian_subagent" wire value to "auto_review"; inputs remain backward-compatible via the alias, but older integrations that compare or deserialize the previous response literal from config/thread settings APIs will now see a different value. The repo review rules explicitly call out app-server API breaking changes (AGENTS.md lines 92-99), so keep output compatibility or version/gate the new canonical serialization.

Useful? React with 👍 / 👎.

Comment on lines +150 to +153
return with_cached_approval(
&session.services,
"shell",
keys,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Re-review sandbox escapes instead of using cached approval

For strict auto-review, the orchestrator explicitly says the first Guardian approval covers only the sandboxed attempt and that retrying without the sandbox requires a fresh Guardian review; however this cache check runs before invoking Guardian, so an earlier low-risk ApprovedForSession for the same command causes the unsandboxed retry approval path to return from the cache with no fresh review. This lets a command that was only approved under sandboxed permissions escape the sandbox after a denial without evaluating the retry reason or unsandboxed risk.

Useful? React with 👍 / 👎.

Co-authored-by: Codex noreply@openai.com
@viyatb-oai viyatb-oai force-pushed the codex/viyatb/guardian-denial-recovery branch from 4252b2d to 3bd85a2 Compare June 4, 2026 07:07
@viyatb-oai viyatb-oai changed the title feat: add Guardian denial recovery semantics feat: distinguish Guardian denials and soft denials Jun 4, 2026
@viyatb-oai viyatb-oai changed the base branch from codex/viyatb/guardian-auto-review-continuity to main June 4, 2026 07:08
viyatb-oai added a commit that referenced this pull request Jun 8, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- #26231 adds denials and soft denials
- #26334 retries transient reviewer failures
- #26333 reuses narrowly scoped low-risk approvals
- #26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
dkropachev pushed a commit to dkropachev/codex that referenced this pull request Jun 9, 2026
## Why

Auto Review should remain the effective approval reviewer when settings
cross runtime boundaries. A config or app-server round trip must not
change the reviewer identity, and delegated work must not silently fall
back to user review.

This requires both a stable canonical serialized value and propagation
of the effective setting. `auto_review` is the canonical value across
protocol and app-server output, while `guardian_subagent` remains
accepted as backward-compatible input.

## What changed

- serialize `ApprovalsReviewer::AutoReview` consistently as
`auto_review` across core protocol and app-server v2
- continue accepting `guardian_subagent` when reading existing config or
client requests
- carry the active turn's approval reviewer into spawned agents
- update config/debug expectations and add delegated-task regression
coverage

## Scope

This does not change Guardian policy or remove compatibility with
existing `guardian_subagent` inputs. It preserves the selected reviewer
across serialization, config reloads, app-server settings, and delegated
task setup.

Related Guardian changes are split independently:

- openai#26231 adds denials and soft denials
- openai#26334 retries transient reviewer failures
- openai#26333 reuses narrowly scoped low-risk approvals
- openai#26232 adds TUI denial recovery

## Validation

- `just test -p codex-app-server-protocol` (224 passed)
- regression coverage for delegated task reviewer propagation
- serialization coverage for canonical `auto_review` output and legacy
`guardian_subagent` input

---------

Co-authored-by: saud-oai <saud@openai.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant