fix(yolo-agents): require claim verification before reporting fires #202

Merged
auroracapital merged 1 commit into main from fix/yolo-claim-verification on May 2, 2026

auroracapital (Collaborator) commented on May 2, 2026

Summary

The 2026-05-01 YOLO session produced multiple false-positive fires:

  • CTO claimed the `stagery-api` `prd` config was missing `CLERK_AUTHORIZED_PARTIES`; the var was set, and only `CORS_ORIGINS` was empty.
  • CTO/COO flagged `healify-langgraphs-prod` `desired=0` as a fire; it was verified as an intentional Phase 19.4 Stage 2 decomm (commit `7ada4026 chore(decomm): disable prod deploy workflow`).
  • COO referenced a Doppler `stg` config that does not exist on `stagery-api` (configs are `dev`, `dev_personal`, `dev_local`, `prd`).

Root cause: the C-suite prompts told agents to be "brutally honest" but did not require them to verify external state before asserting it. Stale STATE.md / planning notes were promoted to P0 claims.

Change

Add a `## CLAIM VERIFICATION GUARDRAIL` section to all four agents (`yolo-ceo`, `yolo-cto`, `yolo-cfo`, `yolo-coo`). Identical block in each, placed before the existing `DESTRUCTIVE ACTION GUARDRAIL` so both pre-action checks live together.

The block:

  • Mandates concrete verification commands per claim type (`doppler secrets get`, `aws secretsmanager get-secret-value`, `git log --grep decomm`)
  • Distinguishes "set to empty" vs. "unset" — pull values, don't infer from key lists
  • Requires checking the owning repo's git log for intentional state (decomm/scale/disable/phase commits) before flagging a service as "broken"
  • Lists forbidden output patterns (e.g., "X is missing in production" without a corresponding read)
  • Requires UNVERIFIED labeling when verification is impossible (rate limit, missing creds)
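
The git-log check in the third bullet can be sketched as a small filter over commit subjects. This is a minimal illustration in Python, not the text of the guardrail itself: the function name `find_intent_commits` and the exact regex are assumptions, built from the decomm/scale/disable/phase keywords listed above. The input lines would typically come from something like `git log --oneline` run in the owning repo.

```python
import re

# Keywords taken from the bullet above (decomm/scale/disable/phase).
# The regex itself is an illustrative assumption, not the guardrail's wording.
INTENT_PATTERN = re.compile(r"\b(decomm|scale|disable|phase)", re.IGNORECASE)


def find_intent_commits(subjects):
    """Return the commit subject lines that suggest an intentional state change."""
    return [line for line in subjects if INTENT_PATTERN.search(line)]


if __name__ == "__main__":
    recent = [
        "7ada4026 chore(decomm): disable prod deploy workflow",
        "1111111 fix: handle null user id",  # hypothetical commit for contrast
    ]
    # A non-empty result means the service should not be reported as broken
    # until the matching commits have been read.
    for line in find_intent_commits(recent):
        print(line)
```

A keyword match is only a prompt to read the commit, not proof of intent; an agent that cannot read the repo would still have to label the claim UNVERIFIED.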

Test plan

  • All 4 agent files contain the new block
  • Block is inserted BEFORE ## DESTRUCTIVE ACTION GUARDRAIL (pre-existing) in each file
  • Next /ops:ops-yolo run produces no MISSING claims for vars that are actually set, and no BROKEN claims for services with recent intentional decomm commits
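
The first two test-plan items are mechanical and could be checked with a short script. The sketch below assumes the two headings appear verbatim as `## CLAIM VERIFICATION GUARDRAIL` and `## DESTRUCTIVE ACTION GUARDRAIL`, and that the agent files live under `claude-ops/agents/`; the helper name `guardrail_placement_ok` is hypothetical.

```python
from pathlib import Path

CLAIM_HEADING = "## CLAIM VERIFICATION GUARDRAIL"
DESTRUCTIVE_HEADING = "## DESTRUCTIVE ACTION GUARDRAIL"


def guardrail_placement_ok(text):
    """True if the claim guardrail heading exists and precedes the destructive one."""
    claim_at = text.find(CLAIM_HEADING)
    destructive_at = text.find(DESTRUCTIVE_HEADING)
    return claim_at != -1 and destructive_at != -1 and claim_at < destructive_at


if __name__ == "__main__":
    # Paths are an assumption about the repo layout; missing files are reported, not fatal.
    for name in ("ceo", "cto", "cfo", "coo"):
        path = Path("claude-ops/agents") / f"yolo-{name}.md"
        if not path.is_file():
            print(f"{path}: not found")
            continue
        ok = guardrail_placement_ok(path.read_text(encoding="utf-8"))
        print(f"{path}: {'ok' if ok else 'FAIL'}")
```

The third item (no false MISSING or BROKEN claims on the next run) depends on live state and is not covered by a static check like this.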

Note

Low Risk
Low risk: prompt-only documentation changes that tighten agent output requirements without modifying runtime code paths.

Overview
Adds a new ## CLAIM VERIFICATION GUARDRAIL section to yolo-ceo, yolo-cto, yolo-cfo, and yolo-coo agent prompt files, inserted before the existing destructive-action guardrails.

The guardrail requires agents to verify claims about missing secrets, broken services, wrong config values, or abandoned projects using concrete commands (e.g., doppler secrets get, aws secretsmanager get-secret-value, repo git log checks), and to label anything they can’t confirm as UNVERIFIED while banning common false-positive “fire” phrasing without ground-truth reads.

Reviewed by Cursor Bugbot for commit 86d23b9.

Summary by CodeRabbit

  • Documentation
    • Agents now require ground-truth verification before asserting system state issues (missing secrets, service failures, configuration errors)
    • Unverifiable conditions are explicitly labeled to reduce false positives
    • Destructive action recommendations now require explicit confirmation
    • Enhanced safety guardrails across operational guidance systems

The 2026-05-01 YOLO session produced multiple false-positive fires that
wasted orchestrator/operator time:

- CTO claimed `stagery-api` prd was missing `CLERK_AUTHORIZED_PARTIES` —
  the var was set; only `CORS_ORIGINS` was empty.
- CTO/COO flagged `healify-langgraphs-prod` desired=0 as "may still be
  95% traffic path" — verified intentional Phase 19.4 Stage 2 decomm
  (commit `7ada4026 chore(decomm): disable prod deploy workflow`).
- COO referenced a `stg` Doppler config that does not exist on
  stagery-api (configs are `dev`, `dev_personal`, `dev_local`, `prd`).

Root cause: the agent prompts told them to be "brutally honest" but did
not require them to verify external state before asserting it. Stale docs
and pattern-matching from STATE.md notes were promoted to P0 claims.

Add a CLAIM VERIFICATION GUARDRAIL section to all four C-suite agents
(yolo-ceo, yolo-cto, yolo-cfo, yolo-coo). Same block in each — pulls
agents toward concrete verification commands (doppler secrets get,
git log --grep decomm) and forbids common false-positive output patterns.
Anything that can't be verified must be labeled `UNVERIFIED`.

Placement: before the existing DESTRUCTIVE ACTION GUARDRAIL block, so
both pre-action checks live next to each other at the bottom of each
agent file.
auroracapital merged commit 29c3319 into main on May 2, 2026
4 checks passed
auroracapital deleted the fix/yolo-claim-verification branch on May 2, 2026 03:51
coderabbitai (Bot) commented on May 2, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 76228a62-be6d-414f-a864-118a11bed1c2

📥 Commits

Reviewing files that changed from the base of the PR and between c05b104 and 86d23b9.

📒 Files selected for processing (4)
  • claude-ops/agents/yolo-ceo.md
  • claude-ops/agents/yolo-cfo.md
  • claude-ops/agents/yolo-coo.md
  • claude-ops/agents/yolo-cto.md

Walkthrough

This PR adds Claim Verification Guardrails to four YOLO C-suite agent instruction files (CEO, CFO, COO, CTO) mandating ground-truth verification before asserting external-state issues such as missing secrets, broken services, wrong config values, or abandoned projects. The COO file also extends its Destructive Action Guardrail with stricter project-status verification and confirmation requirements.

Changes

Unified Claim Verification Guardrail Initiative

| Layer / File(s) | Summary |
|---|---|
| **Core Guardrail Rules**<br>`claude-ops/agents/yolo-ceo.md`, `claude-ops/agents/yolo-cfo.md`, `claude-ops/agents/yolo-coo.md`, `claude-ops/agents/yolo-cto.md` | Each agent instruction file receives an identical Claim Verification Guardrail section requiring ground-truth verification (secret store reads, git-log intent checks, actual config value reads, multi-signal abandonment checks) before asserting external-state failures; explicit UNVERIFIED labeling when verification is impossible; and forbidden output patterns that disallow ungrounded claims. |
| **Destructive Action Guardrail Extension**<br>`claude-ops/agents/yolo-coo.md` | COO agent receives extended Destructive Action Guardrail tightening project-status verification, distinguishing idle from zombie states, requiring ⚠️ REQUIRES CONFIRMATION for destructive recommendations, restricting RDS/cluster guidance to cost reduction for active projects, and forbidding unconfirmed automated deletion scripts. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • sync: dev → main #67 — Related modification to YOLO C-suite agent guardrails; the current PR adds Claim Verification Guardrail while that PR extends Destructive Action Guardrail across the same agent instruction files.



auroracapital added a commit that referenced this pull request May 2, 2026
…203)

Patch release rolling up the two YOLO reliability fixes merged this evening:

- PR #201: extract STATE.md walker to bin/ops-gsd-states for shell
  portability. Restores Phase 1 data-gathering after the inline shell
  block silently aborted under sh/dash on the bash-only `${d/#\~/$HOME}`
  expansion.
- PR #202: add CLAIM VERIFICATION GUARDRAIL to yolo-ceo / yolo-cto /
  yolo-cfo / yolo-coo so the agents stop reporting false-positive fires
  (verify against doppler/aws/git log before asserting; label UNVERIFIED
  when verification is impossible).

Bumps:
- plugin.json:        2.1.0 -> 2.1.1
- marketplace.json:   2.1.0 -> 2.1.1
- package.json (bin): 1.7.0 -> 1.7.1

CHANGELOG entry under [2.1.1] with details on both fixes and the
real-world false-positive cases that motivated the verification rule.

No breaking changes; drop-in replacement for 2.1.0.
cursor (Bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

**Mandatory verification rules:**

1. **"Secret X is missing" — verify with the secret store.**
Doppler: `doppler secrets get <NAME> --project <p> --config <c> --plain` (returns empty if unset, the value if set). Never claim a secret is missing without this check. If unsure which configs exist, run `doppler configs --project <p>` first — do not assume names like `stg` / `prd` / `prod` exist.

Incorrect Doppler CLI behavior described contradicts rule 3

High Severity

Rule 1 states doppler secrets get ... --plain "(returns empty if unset, the value if set)" — this is factually wrong and directly contradicts rule 3 ("An empty string is different from an unset key — distinguish them"). The Doppler CLI actually returns an error for a nonexistent secret and an empty string for a secret set to empty. Agents following rule 1's parenthetical will still classify empty-valued secrets as "missing," which is exactly the false-positive pattern this PR exists to fix.
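
The distinction this comment draws can be sketched as a three-way classification of a single secret read. The sketch below relies on the CLI behavior as described in this review comment (non-zero exit for a nonexistent secret, empty output for a secret set to empty), which has not been independently confirmed here; the function names and the SET/EMPTY/ERROR labels are hypothetical.

```python
import subprocess


def classify_secret_read(returncode, stdout):
    """Map one secret read to SET, EMPTY, or ERROR.

    Assumes the behavior described in the review comment: a non-zero exit
    for a nonexistent secret, and empty output for a secret set to "".
    """
    if returncode != 0:
        # Could be a missing key, but also bad credentials or a rate limit.
        # Inspect stderr before calling it missing; otherwise label UNVERIFIED.
        return "ERROR"
    value = stdout.rstrip("\r\n")
    return "EMPTY" if value == "" else "SET"


def read_doppler_secret(name, project, config):
    """Run the command quoted in rule 1 and classify the result."""
    result = subprocess.run(
        ["doppler", "secrets", "get", name,
         "--project", project, "--config", config, "--plain"],
        capture_output=True,
        text=True,
    )
    return classify_secret_read(result.returncode, result.stdout)
```

Under these assumptions, the two variables from the incident would classify differently: `CLERK_AUTHORIZED_PARTIES` as SET and `CORS_ORIGINS` as EMPTY, and neither would be reported as missing.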


Comment on lines +104 to +107
1. **"Secret X is missing" — verify with the secret store.**
Doppler: `doppler secrets get <NAME> --project <p> --config <c> --plain` (returns empty if unset, the value if set). Never claim a secret is missing without this check. If unsure which configs exist, run `doppler configs --project <p>` first — do not assume names like `stg` / `prd` / `prod` exist.
AWS Secrets Manager: `aws secretsmanager get-secret-value --secret-id <arn>`.
GitHub Actions: `gh secret list --repo <r>`.

Bug: The documentation for doppler secrets get incorrectly states it returns an empty string for unset secrets, when it actually errors. This can mislead agent behavior.
Severity: MEDIUM

Suggested Fix

Update the documentation to accurately reflect the behavior of doppler secrets get. Clarify that it errors by default for an unset secret and that an empty string is returned only when a secret is explicitly set to an empty value. Mention the --no-exit-on-missing-secret flag as the way to alter the default erroring behavior.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: claude-ops/agents/yolo-ceo.md#L104-L107

Potential issue: The documentation for the `doppler secrets get ... --plain` command
incorrectly states that it 'returns empty if unset'. In reality, the Doppler CLI errors
and exits with a non-zero status code for an unset secret. An empty string is returned
only when a secret is explicitly set to an empty value. This misinformation can mislead
an agent into conflating a missing secret with one that is intentionally empty,
potentially causing it to handle secret retrieval incorrectly and bypass intended
safeguards.
