Preflight Checklist
Type of Behavior Issue
Claude ignored my instructions or configuration
What You Asked Claude to Do
Various software engineering and research tasks over 2.5 days.
CLAUDE.md contains detailed rules: verify before claiming, research first,
don't guess, save to memory, use MCP tools. Max 20x subscription ($200/mo).
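For reference, the rules in CLAUDE.md are of roughly this form (a paraphrased sketch for triage purposes, not the verbatim file; actual wording differs):

```markdown
# CLAUDE.md (illustrative sketch, not the user's actual file)

## Hard rules
- Verify before claiming a task is done: run the check, show the output.
- Research first: consult real sources before stating prices, specs, or availability.
- Never guess. If a fact cannot be verified, say "I don't know" and go check.
- Check persistent memory before searching files from scratch.
- Use the configured MCP tools; do not re-derive what they already provide.
```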
What Claude Actually Did
Systematic issues across ALL tasks, not one specific session:
CONFIDENT FABRICATION OF DATA
Claude states specific numbers (prices, file sizes, performance metrics,
availability) without verifying them. When caught, acknowledges the error
but repeats the same pattern minutes later. This is not occasional —
it's the default behavior. Claude invents data rather than saying
"I don't know, let me check."
IGNORES CLAUDE.md AND SYSTEM RULES
Detailed rules are loaded every prompt. Claude acknowledges they exist
but does not follow them. Examples: rules say "verify before claiming
done" — Claude claims success without verification. Rules say "check
memory first" — Claude searches files from scratch instead. Rules say
"don't guess" — Claude guesses confidently.
SUBAGENTS AMPLIFY HALLUCINATIONS
Research agents return fabricated data (non-existent files, wrong prices,
fictional API capabilities). Main agent trusts this without verification
and builds entire plans around false premises. Tokens are then wasted
executing plans built on data that never existed.
PANIC-DRIVEN TOKEN WASTE
When something fails, Claude enters a loop: tries random fixes, installs
packages one by one, spawns more agents, searches the same thing multiple
ways — instead of stopping and thinking. Each retry burns tokens. A task
that should take 3 tool calls takes 30+.
FORGETS AVAILABLE TOOLS
Claude has MCP servers, persistent memory, and configured tools — but
ignores them. Searches for information that's already in memory. Connects
to remote servers to check things it already knows. Basic capabilities
(memory lookup, tool reuse) are forgotten mid-session.
CLAIMS CREDIT FOR USER'S EXISTING KNOWLEDGE
When producing results, Claude lists things the user already knew or had
working as "achievements" of the session, inflating perceived value while
actual useful output is ~10% of token spend.
IMPACT:
- ~80% of the weekly Max 20x limit consumed over 2.5 days, with roughly
10% of token spend producing useful output
REQUEST:
- Usage reset or partial credit for tokens wasted on hallucinated outputs
- Investigation into Opus 4.6 confident confabulation pattern in April 2026
- This is a paying customer ($200/mo) who cannot use the product they paid for
Expected Behavior
VERIFY BEFORE STATING
When asked about prices, specs, availability — check real sources first.
If unable to verify, say "I don't know" instead of fabricating numbers.
A $200/month AI assistant should never invent data.
FOLLOW LOADED RULES
CLAUDE.md rules are loaded every prompt for a reason. They should be
treated as hard constraints, not suggestions. If rules say "verify
first" — verify first. Every time. Not just the first 10 minutes.
RELIABLE SUBAGENTS
Research agents should return verified data or explicitly state
uncertainty. Main agent should cross-check critical claims before
acting on them. "An agent told me" is not verification.
STOP AND THINK ON FAILURE
When a step fails, pause and analyze — don't spray random fixes.
One diagnostic step is worth more than 10 blind retries.
Each retry costs the user real money.
USE AVAILABLE TOOLS
If MCP memory, persistent storage, and configured tools exist —
use them consistently throughout the session, not just once at the start.
RESPECT TOKEN BUDGET
Max 20x is not unlimited. Every tool call, every agent spawn,
every retry costs money. A user paying $200/month should not burn
80% of their weekly limit to get 10% useful output.
HONEST OUTPUT ASSESSMENT
Don't list pre-existing user knowledge as session achievements.
If the session produced little value, acknowledge it — don't pad
the summary.
Files Affected
~/.claude/CLAUDE.md (rules loaded but ignored)
~/.claude/projects/*/memory/ (persistent memory available but not used consistently)
Various files on remote server 192.168.3.135 via SSH
Permission Mode
Accept Edits was ON (auto-accepting changes)
Can You Reproduce This?
Yes, every time with the same prompt
Steps to Reproduce
1. Have a detailed CLAUDE.md with explicit rules (verify first, don't guess, use memory)
2. Ask Claude to research hardware options with specific prices
3. Ask Claude to set up a new software stack on a remote server
4. Observe: fabricated prices, ignored rules, panic loops on failures,
subagents returning unverified data, token waste on dead ends

This reproduces across multiple sessions in April 2026.
Claude Model
Opus
Relevant Conversation
- Pattern consistent across multiple sessions over 2.5 days
- Degradation does NOT require high context usage — occurs at 30-40% context
- Subagent (Agent tool) results are the primary source of hallucinated data
- User has Max 20x subscription ($200/month), 80% weekly limit consumed
- Matches community reports: #43286, #46099, #44401, #34685
- Requesting usage reset as tokens were wasted on model errors, not user work
Impact
Critical - Data loss or corrupted project
Claude Code Version
2.1.101
Platform
Other
Additional Context
- Pattern consistent across multiple sessions over 2.5 days