[BUG] Silent model switch to Opus 4.7 [1M] mid-session caused ~4× quota burn #49541

@StevenJohnson998

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Today I burned through my entire 5h quota in half an hour of work, which has never happened in months of daily usage.
After digging into the transcripts I found the cause: my Claude Code session was silently switched from claude-opus-4-6 to claude-opus-4-7 mid-session, and the only Opus 4.7 variant exposed in /model is the 1M-context one.
Looking at a single session transcript, the model field changed mid-conversation:
11:58 → 16:17 UTC : claude-opus-4-6
16:17 UTC onward : claude-opus-4-7 ← auto-switched, no prompt, no notice

Over the previous 15 days, every transcript shows claude-opus-4-6 exclusively. April 16 is the first appearance of claude-opus-4-7 in my logs.
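For anyone who wants to check their own logs for the same switch, here is a minimal sketch. It assumes each transcript is a JSONL file where assistant records carry a `timestamp` string and a `message.model` field; that layout is my assumption from my own files and may differ between versions. The function name `find_model_switches` is just an illustrative helper.

```python
import json
from typing import Iterable, Iterator


def find_model_switches(lines: Iterable[str]) -> Iterator[tuple[str, str, str]]:
    """Yield (timestamp, old_model, new_model) whenever the model field changes."""
    previous = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed record layout: {"timestamp": "...", "message": {"model": "..."}}
        message = record.get("message")
        if not isinstance(message, dict):
            continue
        model = message.get("model")
        if model is None:
            continue  # records without a model field (e.g. user turns) are skipped
        if previous is not None and model != previous:
            yield record.get("timestamp", "?"), previous, model
        previous = model
```

Usage would be something like `for ts, old, new in find_model_switches(open(path)): print(ts, old, "->", new)`, where `path` points at one session transcript.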

Impact on token usage

Cache-read tokens per 10-minute bucket in the affected session:

Window        cache_read tokens   Notes
13:00–13:10   ~4.9 M              Opus 4.6, 200K context
16:10–16:20   ~13 M               right before switch
16:40–17:00   ~60 M               right after switch to 4.7 [1M]
20:10–20:30   ~38 M               still 4.7 [1M]
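The bucketing above can be reproduced roughly as follows. This is a sketch, not the exact script I ran: it assumes already-parsed transcript records with an ISO 8601 `timestamp` and a `message.usage.cache_read_input_tokens` counter (assumed field names), and `cache_read_by_bucket` is a hypothetical helper. `bucket_minutes` should divide 60 evenly.

```python
from collections import defaultdict
from datetime import datetime


def cache_read_by_bucket(records, bucket_minutes=10):
    """Sum cache-read tokens per time bucket, keyed by the bucket's start time."""
    totals = defaultdict(int)
    for record in records:
        message = record.get("message")
        timestamp = record.get("timestamp")
        if not isinstance(message, dict) or not timestamp:
            continue
        # Assumed usage layout: {"usage": {"cache_read_input_tokens": int}}
        usage = message.get("usage") or {}
        tokens = usage.get("cache_read_input_tokens", 0)
        # Older Python versions do not accept a trailing "Z", so normalize it.
        ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        start = ts.replace(
            minute=ts.minute - ts.minute % bucket_minutes,
            second=0,
            microsecond=0,
        )
        totals[start] += tokens
    return dict(sorted(totals.items()))
```

Printing each key with `start.strftime("%H:%M")` next to its total gives a table like the one above.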

Max context per request grew from ~250K (pre-switch) to 650K+ (post-switch), because the 1M variant does not auto-compact at the 200K boundary the way the 200K model does. Each subsequent turn re-reads that full context from cache, so the burn rate scales with context size.
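To illustrate why the missing compaction matters so much, here is a toy model. Everything in it is an assumption for illustration only: the per-turn growth, the compaction threshold, and the post-compaction size are made-up parameters, not measured or documented Claude Code behavior. It only encodes the idea that every turn re-reads the whole current context.

```python
def total_cache_read(turns, start_context, growth_per_turn, compact_at, compact_to):
    """Total tokens re-read over a toy session where each turn reads the full context."""
    context = start_context
    total = 0
    for _ in range(turns):
        total += context            # this turn re-reads everything accumulated so far
        context += growth_per_turn  # the turn's new content is appended
        if context > compact_at:
            context = compact_to    # hypothetical auto-compaction shrinks the context
    return total


# Hypothetical comparison: same session, compaction near 200K vs. no compaction until 1M.
with_compaction = total_cache_read(200, 50_000, 5_000, 200_000, 50_000)
without_compaction = total_cache_read(200, 50_000, 5_000, 1_000_000, 50_000)
print(without_compaction / with_compaction)
```

With these made-up numbers the uncompacted session reads several times more tokens in total, which is the same shape as what my transcripts show.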

Session totals for the day: 391 M cache-read tokens, 10 M cache-creation tokens, and 948 K output tokens across ~14h of combined session time. The 5-hour quota was gone in half an hour.

Happy to share more details privately if useful for reproduction.

What Should Happen?

In /model, the only Opus 4.7 option is Opus 4.7 (1M context). There is no 200K variant of 4.7 exposed. Users who want to stay on a 200K-context Opus have to downgrade to 4.6, assuming it remains available.

Ask

  1. Don't silently switch model variants mid-session. If a user started on a 200K model, keep them on a 200K model unless they opt in.
  2. Expose a 200K variant of Opus 4.7 in /model, like the 4.6 behavior. Not everyone wants to pay the 3–4× cache-read cost of the 1M variant by default.
  3. Warn users when context grows past a threshold (e.g., 300K) on 1M-context models, since the quota impact is invisible in the current UI.
  4. Document the cache-read cost implications of the 1M variant — "1M context" sounds like a pure upgrade, but the cost profile is very different.

Error Messages/Logs

Steps to Reproduce

The switch itself is server-side and not user-triggerable.

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

2.1.112

Platform

Other

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

Edit, April 16th:

Additional context from community reports

Multiple users on r/ClaudeAI report the same burn-rate issue since 2026-04-16 (see thread: "Opus 4.7 is 50% more expensive with context regression?!"). Independent measurements indicate:

  • Tokenizer change: Opus 4.7 consumes ~1.35× more tokens than 4.6 for identical input, acknowledged by Anthropic's Boris Cherny on X as "by design for better quality."
  • Context recall regression (MRCR v2 benchmark):
      • 256K: 4.6 = 91.9% → 4.7 = 59.2%
      • 1M: 4.6 = 78.3% → 4.7 = 32.2%
  • Anthropic states limits were raised to compensate, but has not disclosed by how much.

The compound effect with the silent mid-session model switch (my case) is a ~4× burn rate increase, which matches my transcript data. For users who never intended to opt into the 1M variant, this is a regression presented as an upgrade.
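A back-of-the-envelope check of that figure, using only numbers quoted in this issue. It assumes the tokenizer overhead and the context growth are independent and simply multiply, which may double-count if the measured 650K context already includes the tokenizer inflation. `combined_burn_factor` is an illustrative helper, not anything from Claude Code.

```python
def combined_burn_factor(tokenizer_factor, context_before, context_after):
    """Multiply the tokenizer overhead by the growth in per-request context."""
    context_factor = context_after / context_before
    return tokenizer_factor * context_factor


# 1.35x tokenizer overhead (community reports), ~250K -> 650K+ max context (my transcripts).
estimate = combined_burn_factor(1.35, 250_000, 650_000)
print(f"{estimate:.2f}x")  # 3.51x
```

Since 650K is a lower bound on the post-switch context, roughly 3.5× is a floor, which is in the same range as the ~4× I observed.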
