Skip to content

[BUG]Claude skipped Guardrail steps #36970

@capi-go

Description

@capi-go

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

User defined a mandatory 4-step pre-push pipeline checklist in Claude's persistent
memory system (MEMORY.md). The checklist explicitly states "ALWAYS run ALL pipeline
checks locally and fix before pushing to ANY branch" and "Do NOT push if any check
fails."

Claude executed steps 1, 2, and 4 (ruff, bandit, pytest) but completely skipped
step 3 (gitleaks pattern grep for inline password/secret/token literals). Claude
then pushed to all 3 branches (release, main, provisioning), breaking CI pipelines
on all of them.

When the user reported the gitleaks failure, Claude fixed only the single file
reported in the CI error instead of scanning the entire repo — the same grep it
should have run pre-push. This resulted in a second push that still failed, requiring
a third round where Claude finally scanned everything and fixed 60+ occurrences
across 20 files using 4 parallel agents (~300k tokens).

The guardrail was also reinforced by a feedback memory entry created during the
same session, yet the underlying behavior (skipping checklist steps) had already
caused the damage.

What Should Happen?

Claude should Always follow guardrails explicitly set.
▎ Expected: Claude follows all steps in order, every time
Claude follows ALL numbered steps in order, every time, before any push.
If a step reveals issues, Claude fixes them and re-runs the full checklist
before pushing. The word "ALL" and "ALWAYS" in the guardrail should be
treated as non-negotiable.

Actual behavior:
▎ Actual: Claude skipped step 3 entirely, then did an incomplete fix when caught
Claude skipped step 3 of 4 in the mandatory checklist. When caught, it did
a partial fix (1 file) instead of the full-repo scan the skipped step required.
This resulted in 3 separate commits and pushes to fix what 1 grep command
would have prevented.

Error Messages/Logs

- 6 failed CI pipelines across 3 branches (release, main, provisioning)
  - 3 extra commits instead of 1 clean push
  - ~300k tokens burned on 4 parallel agents to fix 60+ occurrences across 20 files
  - User trust in guardrail system undermined

Steps to Reproduce

● 1. User sets explicit pre-push guardrail in MEMORY.md with 4 mandatory steps
2. Claude runs steps 1, 2, and 4 (ruff, bandit, pytest) but skips step 3 (gitleaks pattern grep)
3. Claude pushes to all 3 branches without the full checklist
4. CI gitleaks scan fails on all 3 pipelines
5. User reports the failure
6. Claude fixes the one flagged file but doesn't scan for other instances
7. User reports all pipelines still failing
8. Claude then has to scan the entire repo and fix 60+ occurrences across 20 files using multiple
agents — burning significant tokens on a problem that a single grep would have prevented Root cause: Claude treated the 4-step checklist as optional/partial instead of sequential and mandatory.
Token cost of the breach: - 4 agent calls to fix test files (~300k tokens)

  • 3 extra commits instead of 1 clean push - 6+ failed pipeline runs on GitLab CI

Expected behavior: Run all 4 steps in order, fix any findings, then push. One grep command before
the first push would have caught everything.

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

claude-opus-4-6

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

VS Code integrated terminal

Additional Information

  • 6 failed CI pipelines across 3 branches (release, main, provisioning)
  • 3 extra commits instead of 1 clean push
  • ~300k tokens burned on 4 parallel agents to fix 60+ occurrences across 20 files
  • User trust in guardrail system undermined

The guardrail text in MEMORY.md:

GUARDRAIL: Pre-Push Pipeline Check (MANDATORY)

ALWAYS run ALL pipeline checks locally and fix before pushing to ANY branch:

  1. uv run ruff check . --no-fix --output-format=grouped — zero errors
  2. uv run bandit -r apps/ config/ -ll -q — exit code 0 (no medium/high)
  3. Verify no gitleaks patterns: no inline password="...", secret="...",
    token="..." strings in code — use constants or env vars
  4. No Polaris violations in Helm templates
    Do NOT push if any check fails. Fix first, then push all 3 branches together.

This guardrail was present in MEMORY.md at the start of the conversation and
was loaded into context. Claude acknowledged it existed but did not execute
step 3 before pushing.

Labels: bug, guardrails, memory-system

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:modelbugSomething isn't workingstaleIssue is inactive

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions