Description
During a long Claude Code session using claude-opus-4-6[1m] (1M context), Claude exhibited progressively degraded performance well before reaching 50% of the context window.
Timeline of degradation
- ~20% context usage: User noticed degraded performance — losing track of earlier decisions, circular reasoning, forgetting details from earlier in the conversation, occasionally just stopping mid-task
- ~40% context usage: Context compression (automatic message summarization) kicked in, removing the user's scrollback history. When asked, Claude self-reported the following degradation thresholds, expressed as a fraction of the context window:
  - 0.4-0.5: Performance starts degrading noticeably
  - 0.6: Things get "noticeably worse — more repetition, more forgetting earlier decisions, more thrashing"
  - 0.8+: "Gets rough"
- ~48% context usage: Claude actively told the user "I'm deep enough in this context that I'm not being effective" and recommended starting a fresh session, despite being less than halfway through the advertised 1M context window
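For reference, the percentages in the timeline above can be converted into approximate token counts. The snippet below is only a rough sketch: it assumes the "1M" window means exactly 1,000,000 tokens, and the names `CONTEXT_WINDOW_TOKENS`, `REPORTED_LEVELS`, and `tokens_at` are made up for illustration and are not part of Claude Code or any Anthropic API.

```python
# Assumption: "1M context" is taken as exactly 1,000,000 tokens.
CONTEXT_WINDOW_TOKENS = 1_000_000

# Utilization levels reported in the timeline, as fractions of the window.
REPORTED_LEVELS = {
    "degradation first noticed": 0.20,
    "context compression kicked in": 0.40,
    "Claude recommended a fresh session": 0.48,
}


def tokens_at(utilization: float, window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Approximate token count at a given fraction of the context window."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return round(utilization * window)


for label, level in REPORTED_LEVELS.items():
    print(f"{level:.0%} ({label}): ~{tokens_at(level):,} tokens")
```

Under that assumption, the first symptoms appeared at roughly 200K tokens and the recommendation to restart came at roughly 480K tokens.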
Symptoms observed
- Losing track of what was already tried
- Circular reasoning — applying a fix, reverting it, then re-applying the same fix
- Claiming issues were fixed when nothing had changed (verified via screenshots)
- Forgetting earlier architectural decisions and re-deriving them incorrectly
- Increased thrashing between approaches without settling on one
- Stopping mid-task without finishing the work
Questions
- Is this expected behavior for claude-opus-4-6[1m]?
- At what context utilization should users realistically expect degraded performance?
- Is the automatic context compression (message summarization) contributing to the degradation, or is it a symptom?
- Are there mitigations or best practices to maintain quality deeper into the context window?
- If the effective high-quality context is ~400K tokens, should this be communicated to users rather than advertising 1M?
Environment
- Model: claude-opus-4-6[1m] (Opus 4.6 with 1M context)
- Tool: Claude Code CLI
- Session characteristics: extensive iterative development with hundreds of tool calls (Bash, file edits, reads, grep, agent spawning, MCP-based browser automation with screenshots)
- Heavy multimodal usage (PNG screenshots in context from Chrome DevTools)