Generic 400/disconnect errors misclassified as context_overflow in 1M-context sessions #16351

## Bug description

`agent.error_classifier.classify_api_error()` can misclassify generic HTTP 400 errors and server disconnects as `FailoverReason.context_overflow` in explicitly large-context sessions (for example 1M-token Codex/GPT-5.x sessions), even when the prompt is far below the configured context window.

The problematic path is the absolute size/message-count heuristic. On current `main`, a generic 400 with many messages is classified as context overflow because `num_messages > 80`, even when `approx_tokens` is only ~74K against a 1M context window.

## Minimal reproduction

```python
from agent.error_classifier import classify_api_error

class FakeHTTP400(Exception):
    status_code = 400
    body = {"error": {"message": "Error"}}
    def __str__(self):
        return "Error"

result = classify_api_error(
    FakeHTTP400(),
    provider="openai-codex",
    model="gpt-5.5",
    approx_tokens=74320,
    context_length=1_000_000,
    num_messages=432,
)

print(result.reason, result.retryable, result.should_compress)
```

Current result:

```text
FailoverReason.context_overflow True True
```

Expected result:

```text
FailoverReason.format_error False False
```

Server disconnects with the same shape (low token pressure, high message count) hit a similar problem: the absolute `num_messages > 200` branch classifies them as `context_overflow` instead of a transport/timeout condition.

## Root cause

Current `agent/error_classifier.py` has heuristics equivalent to:

```python
# server disconnect path
is_large = approx_tokens > context_length * 0.6 or approx_tokens > 120000 or num_messages > 200

# generic 400 path
is_large = approx_tokens > context_length * 0.4 or approx_tokens > 80000 or num_messages > 80
```

The absolute fallbacks are reasonable for ~128K/200K context windows, but they are too aggressive for 1M-context sessions. A long session can have hundreds of messages while still being well below the actual context budget.
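To make the failure concrete, here is a minimal standalone sketch that evaluates the two quoted heuristics against the numbers from the reproduction. The function names `is_large_generic_400` and `is_large_disconnect` are hypothetical and exist only for this illustration; the expressions are copied from the snippet above, not from the actual module.

```python
# Standalone copies of the heuristics quoted above (hypothetical helper names).

def is_large_disconnect(approx_tokens: int, context_length: int, num_messages: int) -> bool:
    """Server-disconnect heuristic as quoted in this issue."""
    return (
        approx_tokens > context_length * 0.6
        or approx_tokens > 120000
        or num_messages > 200
    )


def is_large_generic_400(approx_tokens: int, context_length: int, num_messages: int) -> bool:
    """Generic-400 heuristic as quoted in this issue."""
    return (
        approx_tokens > context_length * 0.4
        or approx_tokens > 80000
        or num_messages > 80
    )


# Values from the minimal reproduction.
tokens, window, messages = 74320, 1_000_000, 432

# Relative pressure is about 7.4% of the window, far below both ratios.
print(tokens > window * 0.4)  # False
print(tokens > 80000)         # False

# The message-count fallback alone flips both heuristics to True.
print(is_large_generic_400(tokens, window, messages))  # True
print(is_large_disconnect(tokens, window, messages))   # True
```

Only the `num_messages` term fires in both cases, which is why the message count alone is enough to route the error into the overflow path.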

## User impact

This sends non-context errors into the context-overflow recovery path. In long-context Codex sessions, that can trigger unnecessary compression and a runtime context probe-down from an explicit 1M window to lower probe tiers (currently 256K/128K depending on branch/version). The result can be repeated compaction and stale handoff pollution.

## Suggested fix

Gate the absolute token/message-count heuristics to smaller context windows, and require relative pressure for large-context models. For example:

```python
# server disconnect path
is_large = approx_tokens > context_length * 0.6 or (
    context_length <= 256000 and (approx_tokens > 120000 or num_messages > 200)
)

# generic 400 path
is_large = approx_tokens > context_length * 0.4 or (
    context_length <= 256000 and (approx_tokens > 80000 or num_messages > 80)
)
```

This preserves existing behavior for smaller context windows while preventing 1M sessions from being classified as overflow solely because they have many messages.
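The gated expressions can be checked in isolation with a sketch like the one below. The helper names and the `SMALL_WINDOW_LIMIT` constant are hypothetical, introduced here only for illustration; the 256000 cutoff is the one proposed above, and the real classifier may structure this differently.

```python
# Hypothetical standalone versions of the proposed gated heuristics.

SMALL_WINDOW_LIMIT = 256000  # assumed cutoff, taken from the suggested fix


def is_large_disconnect_gated(approx_tokens: int, context_length: int, num_messages: int) -> bool:
    """Disconnect path: absolute fallbacks apply only to smaller windows."""
    return approx_tokens > context_length * 0.6 or (
        context_length <= SMALL_WINDOW_LIMIT
        and (approx_tokens > 120000 or num_messages > 200)
    )


def is_large_generic_400_gated(approx_tokens: int, context_length: int, num_messages: int) -> bool:
    """Generic-400 path: absolute fallbacks apply only to smaller windows."""
    return approx_tokens > context_length * 0.4 or (
        context_length <= SMALL_WINDOW_LIMIT
        and (approx_tokens > 80000 or num_messages > 80)
    )


# The 1M-context case from the reproduction is no longer treated as large.
print(is_large_generic_400_gated(74320, 1_000_000, 432))  # False
print(is_large_disconnect_gated(74320, 1_000_000, 432))   # False

# A 128K window with many messages still trips the absolute fallback.
print(is_large_generic_400_gated(30000, 128000, 100))  # True

# A 1M window under real relative pressure is still flagged.
print(is_large_generic_400_gated(450000, 1_000_000, 10))  # True
```

These four cases would also make reasonable regression tests: the reproduction shape on both paths, one small-window case to confirm existing behavior is preserved, and one large-window case with genuine relative pressure.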

## Related work

Related but not identical:

- #14499: prevents direct long-context probe collapse by changing probe tiers
- #14858: guards untrusted probe shrink when the guessed tier is below the current prompt estimate
- #14953: preserves explicit context window after generic overflow
- #15844: merged context-length propagation/probe-tier changes
- #6751: fixed one Codex 400-format-error compression loop by parsing flat 400 bodies

This issue is specifically about the classifier entering `context_overflow` too early for large context windows due to absolute message-count/token heuristics.

