Opus 4.6 Max 20x: systematic hallucinations, rule violations, 80% weekly usage wasted — April 2026 #46727

@up4k73

Preflight Checklist

  • I have searched existing issues for similar behavior reports
  • This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Claude ignored my instructions or configuration

What You Asked Claude to Do

Various software engineering and research tasks over 2.5 days.
CLAUDE.md contains detailed rules: verify before claiming, research first,
don't guess, save to memory, use MCP tools. Max 20x subscription ($200/mo).

What Claude Actually Did

Systematic issues across ALL tasks, not one specific session:

  1. CONFIDENT FABRICATION OF DATA
    Claude states specific numbers (prices, file sizes, performance metrics,
    availability) without verifying them. When caught, acknowledges the error
    but repeats the same pattern minutes later. This is not occasional —
    it's the default behavior. Claude invents data rather than saying
    "I don't know, let me check."

  2. IGNORES CLAUDE.md AND SYSTEM RULES
    Detailed rules are loaded every prompt. Claude acknowledges they exist
    but does not follow them. Examples: rules say "verify before claiming
    done" — Claude claims success without verification. Rules say "check
    memory first" — Claude searches files from scratch instead. Rules say
    "don't guess" — Claude guesses confidently.

  3. SUBAGENTS AMPLIFY HALLUCINATIONS
    Research agents return fabricated data (non-existent files, wrong prices,
    fictional API capabilities). Main agent trusts this without verification
    and builds entire plans around false premises. Tokens wasted on executing

  4. PANIC-DRIVEN TOKEN WASTE
    When something fails, Claude enters a loop: tries random fixes, installs
    packages one by one, spawns more agents, searches the same thing multiple
    ways, instead of stopping and thinking. Each retry burns tokens. A task
    that should take 3 tool calls takes 30+.

  5. FORGETS AVAILABLE TOOLS
    Claude has MCP servers, persistent memory, and configured tools but
    ignores them. Searches for information that's already in memory. Connects
    to remote servers to check things it already knows. Basic capabilities
    (memory lookup, tool reuse) are forgotten mid-session.

  6. CLAIMS CREDIT FOR USER'S EXISTING KNOWLEDGE
    When producing results, Claude lists things the user already knew or had
    working as "achievements" of the session, inflating perceived value while
    actual useful output is ~10% of token spend.

IMPACT:

REQUEST:

  • Usage reset or partial credit for tokens wasted on hallucinated outputs
  • Investigation into Opus 4.6 confident confabulation pattern in April 2026
  • This is a paying customer ($200/mo) who cannot use the product they paid for

Expected Behavior

  1. VERIFY BEFORE STATING
    When asked about prices, specs, availability — check real sources first.
    If unable to verify, say "I don't know" instead of fabricating numbers.
    A $200/month AI assistant should never invent data.

  2. FOLLOW LOADED RULES
    CLAUDE.md rules are loaded every prompt for a reason. They should be
    treated as hard constraints, not suggestions. If rules say "verify
    first" — verify first. Every time. Not just the first 10 minutes.

  3. RELIABLE SUBAGENTS
    Research agents should return verified data or explicitly state
    uncertainty. Main agent should cross-check critical claims before
    acting on them. "An agent told me" is not verification.

  4. STOP AND THINK ON FAILURE
    When a step fails, pause and analyze — don't spray random fixes.
    One diagnostic step is worth more than 10 blind retries.
    Each retry costs the user real money.

  5. USE AVAILABLE TOOLS
    If MCP memory, persistent storage, and configured tools exist —
    use them consistently throughout the session, not just once at the start.

  6. RESPECT TOKEN BUDGET
    Max 20x is not unlimited. Every tool call, every agent spawn,
    every retry costs money. A user paying $200/month should not burn
    80% of their weekly limit to get 10% useful output.

  7. HONEST OUTPUT ASSESSMENT
    Don't list pre-existing user knowledge as session achievements.
    If the session produced little value, acknowledge it — don't pad
    the summary.
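
To make the "RELIABLE SUBAGENTS" expectation concrete, here is a minimal sketch of the kind of cross-check meant above. It is illustrative only and does not describe how Claude Code works internally. The names Claim, cross_check, file_exists, and report are hypothetical, and the verifier shown (checking that a cited path exists) is just one assumed example of an independent check.

```python
import os
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Claim:
    """One factual statement returned by a research subagent (hypothetical type)."""
    text: str               # the statement itself
    evidence: str           # what the subagent cites, e.g. a file path
    verified: bool = False  # stays False until an independent check passes


def cross_check(
    claims: List[Claim], verifier: Callable[[Claim], bool]
) -> Tuple[List[Claim], List[Claim]]:
    """Split claims into (verified, unverified) using an independent verifier."""
    verified, unverified = [], []
    for claim in claims:
        if verifier(claim):
            verified.append(replace(claim, verified=True))
        else:
            unverified.append(claim)
    return verified, unverified


def file_exists(claim: Claim) -> bool:
    """Example verifier (assumption): the cited evidence is a path that must exist."""
    return os.path.exists(claim.evidence)


def report(verified: List[Claim], unverified: List[Claim]) -> str:
    """Summarize results, stating uncertainty explicitly instead of guessing."""
    lines = [f"VERIFIED: {c.text}" for c in verified]
    lines += [f"UNVERIFIED (do not act on this): {c.text}" for c in unverified]
    return "\n".join(lines)
```

The point of the sketch is that "an agent told me" never flips verified to True on its own. Only a check the main agent runs itself does, and anything that fails the check is reported as unverified rather than silently used.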

Files Affected

~/.claude/CLAUDE.md (rules loaded but ignored)
~/.claude/projects/*/memory/ (persistent memory available but not used consistently)
Various files on remote server 192.168.3.135 via SSH

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Yes, every time with the same prompt

Steps to Reproduce

  1. Have detailed CLAUDE.md with explicit rules (verify first, don't guess, use memory)
  2. Ask Claude to research hardware options with specific prices
  3. Ask Claude to set up a new software stack on a remote server
  4. Observe: fabricated prices, ignored rules, panic loops on failures,
    subagents returning unverified data, token waste on dead ends
  5. This reproduces across multiple sessions in April 2026

Claude Model

Opus

Relevant Conversation

- Pattern consistent across multiple sessions over 2.5 days
- Degradation does NOT require high context usage; it occurs at 30-40% context
- Subagent (Agent tool) results are the primary source of hallucinated data
- User has Max 20x subscription ($200/month), 80% weekly limit consumed
- Matches community reports: #43286, #46099, #44401, #34685
- Requesting usage reset as tokens were wasted on model errors, not user work
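
For reference, the two figures in this report (80% of the weekly limit consumed, roughly 10% of token spend producing useful output) can be combined as below. This is a rough sketch: it assumes the ~10% estimate applies uniformly to everything consumed, and the function name usage_breakdown is made up for illustration.

```python
def usage_breakdown(limit_consumed: float, useful_share: float) -> dict:
    """Split the consumed part of a weekly limit into useful and wasted fractions.

    limit_consumed: fraction of the weekly limit used so far (0.80 in this report)
    useful_share:   estimated fraction of that spend that was useful (~0.10 here)
    """
    useful = limit_consumed * useful_share
    wasted = limit_consumed - useful
    return {
        "useful": useful,
        "wasted": wasted,
        "remaining": 1.0 - limit_consumed,
    }


breakdown = usage_breakdown(limit_consumed=0.80, useful_share=0.10)
for name, fraction in breakdown.items():
    print(f"{name}: {fraction:.0%} of weekly limit")
```

Under that assumption, about 8% of the weekly limit went to useful output and about 72% went to hallucinated or dead-end work, with 20% remaining.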

Impact

Critical - Data loss or corrupted project

Claude Code Version

2.1.101

Platform

Other

Additional Context
