Preflight Checklist
Type of Behavior Issue
Claude ignored my instructions or configuration
What You Asked Claude to Do
Various software engineering and research tasks over 2.5 days.
CLAUDE.md contains detailed rules: verify before claiming, research first,
don't guess, save to memory, use MCP tools. Max 20x subscription ($200/mo).
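For reference, the rules in CLAUDE.md are of roughly this form (a paraphrased sketch for triage purposes, not the verbatim file; actual wording differs):

```markdown
# CLAUDE.md (illustrative sketch, not the user's actual file)

## Hard rules
- Verify before claiming a task is done: run the check, show the output.
- Research first: consult real sources before stating prices, specs, or availability.
- Never guess. If a fact cannot be verified, say "I don't know" and go check.
- Check persistent memory before searching files from scratch.
- Use the configured MCP tools; do not re-derive what they already provide.
```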
What Claude Actually Did
Systematic issues across ALL tasks, not one specific session:
CONFIDENT FABRICATION OF DATA
Claude states specific numbers (prices, file sizes, performance metrics,
availability) without verifying them. When caught, acknowledges the error
but repeats the same pattern minutes later. This is not occasional —
it's the default behavior. Claude invents data rather than saying
"I don't know, let me check."
IGNORES CLAUDE.md AND SYSTEM RULES
Detailed rules are loaded every prompt. Claude acknowledges they exist
but does not follow them. Examples: rules say "verify before claiming
done" — Claude claims success without verification. Rules say "check
memory first" — Claude searches files from scratch instead. Rules say
"don't guess" — Claude guesses confidently.
SUBAGENTS AMPLIFY HALLUCINATIONS
Research agents return fabricated data (non-existent files, wrong prices,
fictional API capabilities). Main agent trusts this without verification
and builds entire plans around false premises. Tokens are then wasted
executing plans built on data that never existed.
PANIC-DRIVEN TOKEN WASTE
When something fails, Claude enters a loop: tries random fixes, installs
packages one by one, spawns more agents, searches the same thing multiple
ways — instead of stopping and thinking. Each retry burns tokens. A task
that should take 3 tool calls takes 30+.
FORGETS AVAILABLE TOOLS
Claude has MCP servers, persistent memory, and configured tools — but
ignores them. Searches for information that's already in memory. Connects
to remote servers to check things it already knows. Basic capabilities
(memory lookup, tool reuse) are forgotten mid-session.
CLAIMS CREDIT FOR USER'S EXISTING KNOWLEDGE
When producing results, Claude lists things the user already knew or had
working as "achievements" of the session, inflating perceived value while
actual useful output is ~10% of token spend.
IMPACT:
- ~80% of the weekly Max 20x limit consumed over 2.5 days, with roughly
10% of token spend producing useful output
REQUEST:
- Usage reset or partial credit for tokens wasted on hallucinated outputs
- Investigation into Opus 4.6 confident confabulation pattern in April 2026
- This is a paying customer ($200/mo) who cannot use the product they paid for
Expected Behavior
VERIFY BEFORE STATING
When asked about prices, specs, availability — check real sources first.
If unable to verify, say "I don't know" instead of fabricating numbers.
A $200/month AI assistant should never invent data.
FOLLOW LOADED RULES
CLAUDE.md rules are loaded every prompt for a reason. They should be
treated as hard constraints, not suggestions. If rules say "verify
first" — verify first. Every time. Not just the first 10 minutes.
RELIABLE SUBAGENTS
Research agents should return verified data or explicitly state
uncertainty. Main agent should cross-check critical claims before
acting on them. "An agent told me" is not verification.
STOP AND THINK ON FAILURE
When a step fails, pause and analyze — don't spray random fixes.
One diagnostic step is worth more than 10 blind retries.
Each retry costs the user real money.
USE AVAILABLE TOOLS
If MCP memory, persistent storage, and configured tools exist —
use them consistently throughout the session, not just once at the start.
RESPECT TOKEN BUDGET
Max 20x is not unlimited. Every tool call, every agent spawn,
every retry costs money. A user paying $200/month should not burn
80% of their weekly limit to get 10% useful output.
HONEST OUTPUT ASSESSMENT
Don't list pre-existing user knowledge as session achievements.
If the session produced little value, acknowledge it — don't pad
the summary.
Files Affected
~/.claude/CLAUDE.md (rules loaded but ignored)
~/.claude/projects/*/memory/ (persistent memory available but not used consistently)
Various files on remote server 192.168.3.135 via SSH
Permission Mode
Accept Edits was ON (auto-accepting changes)
Can You Reproduce This?
Yes, every time with the same prompt
Steps to Reproduce
1. Have a detailed CLAUDE.md with explicit rules (verify first, don't guess, use memory)
2. Ask Claude to research hardware options with specific prices
3. Ask Claude to set up a new software stack on a remote server
4. Observe: fabricated prices, ignored rules, panic loops on failures,
subagents returning unverified data, token waste on dead ends

This reproduces across multiple sessions in April 2026.
Claude Model
Opus
Relevant Conversation
- Pattern consistent across multiple sessions over 2.5 days
- Degradation does NOT require high context usage — occurs at 30-40% context
- Subagent (Agent tool) results are the primary source of hallucinated data
- User has Max 20x subscription ($200/month), 80% weekly limit consumed
- Matches community reports: #43286, #46099, #44401, #34685
- Requesting usage reset as tokens were wasted on model errors, not user work
Impact
Critical - Data loss or corrupted project
Claude Code Version
2.1.101
Platform
Other
Additional Context
- Pattern consistent across multiple sessions over 2.5 days