Description
When context_budget_tokens is set too tight (e.g. an 18K budget with a 12K+ system prompt), the compaction logic enters a multi-turn loop: every single turn triggers compaction, compaction summaries grow the context further, and eventually the API returns 400 (context too large).
Reproduction
Config: context_budget_tokens = 18000, compaction_threshold = 0.65, auto_budget = false, provider = openai
Run 7+ turns with tool calls.
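For convenience, the same settings as a config-file fragment (assuming a TOML config file; the file name and exact layout are guesses, only the four keys above come from the report):

```toml
# Repro settings from this report (layout is illustrative)
context_budget_tokens = 18000
compaction_threshold = 0.65
auto_budget = false
provider = "openai"
```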
Observed Behavior
cached_tokens=15371, threshold=11700 → should_compact=true — compaction #1 fires: 40 messages → summary_tokens=3104. After: cached_tokens=12329 (still > threshold). should_compact=true again — compaction #2 fires: 2 messages → summary_tokens=2755. After: cached_tokens=12217 (still > threshold).
The root problem: compaction summaries are injected back as system messages. With a very tight budget, the system prompt + injected summaries alone exceed the threshold, so every turn triggers compaction even when there are almost no messages left to compact.
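The trigger condition can be checked numerically. A minimal sketch (the `should_compact` signature here is illustrative, not the real implementation) shows that with budget 18000 and threshold 0.65 the trigger point is 11700 tokens, which the ~12K system prompt alone already exceeds, so no amount of message compaction can get below it:

```rust
// Illustrative threshold check mirroring the logged behavior; the real
// should_compact() in the project may have a different signature.
fn should_compact(cached_tokens: u32, budget: u32, threshold: f64) -> bool {
    (cached_tokens as f64) > (budget as f64) * threshold
}

fn main() {
    let (budget, threshold) = (18_000u32, 0.65); // 18_000 * 0.65 = 11_700 trigger point
    // Observed values from the logs:
    assert!(should_compact(15_371, budget, threshold)); // initial turn: fires
    assert!(should_compact(12_329, budget, threshold)); // after compaction #1: still fires
    assert!(should_compact(12_217, budget, threshold)); // after compaction #2: still fires
    // A ~12K system prompt alone is above the trigger point, so compaction
    // of ordinary messages can never bring usage below threshold:
    assert!(should_compact(12_000, budget, threshold));
}
```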
Secondary bug: compaction #2 produced a 2755-token summary for only 2 messages — the summary is larger than what it replaced, making the context worse.
Debug dump evidence
Request #3 (failed): 3 system messages (18630 + 2059 + 13957 chars), tool output 15799 chars, max_tokens=4096. At a rough ~4 chars/token, these items alone are ≈12.6K tokens, plus the 4096-token response reservation; the total context clearly exceeds the 18K token budget.
Expected Behavior
After compaction, if cached_tokens is still above threshold, emit WARN "context compaction could not reduce usage below threshold (compacted N messages, still at M/B tokens)" and stop attempting further compaction for that session; surface "Stopping: context window is nearly full" to the user.
OR: add a post-compaction cooldown: skip should_compact() for the next N turns after a successful compaction.
Compaction summaries should be bounded: if summary_tokens > freed_tokens, the compaction was counterproductive; log WARN and don't apply it.
Severity
Medium — only affects users with extremely tight context_budget_tokens settings (well below the default). With auto_budget = true (the default), this doesn't occur; only manual tight budgets hit this edge case.
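The guards proposed under Expected Behavior could look roughly like this. All names here (`CompactionState`, `after_compaction`, `apply_summary`, the 3-turn cooldown, the freed-token figures) are hypothetical, not the project's actual API:

```rust
// Hypothetical per-session state for the two loop-breaking guards.
struct CompactionState {
    cooldown_turns: u32, // skip should_compact() while > 0 (cooldown guard)
    gave_up: bool,       // set once compaction failed to get below threshold
}

/// Summary-bound guard: reject a summary that costs more than it frees.
fn apply_summary(summary_tokens: u32, freed_tokens: u32) -> bool {
    if summary_tokens >= freed_tokens {
        eprintln!("WARN: compaction counterproductive (summary {summary_tokens} >= freed {freed_tokens} tokens); not applying");
        return false;
    }
    true
}

/// Warn-and-stop + cooldown guards: after a compaction pass, decide whether
/// to keep compacting this session.
fn after_compaction(state: &mut CompactionState, cached_tokens: u32, budget: u32, threshold: f64) {
    if (cached_tokens as f64) > (budget as f64) * threshold {
        eprintln!("WARN: context compaction could not reduce usage below threshold (still at {cached_tokens}/{budget} tokens)");
        state.gave_up = true; // stop attempting further compaction
    } else {
        state.cooldown_turns = 3; // hypothetical cooldown after a successful pass
    }
}

fn main() {
    // Compaction #2 from the logs: a 2755-token summary for 2 small messages
    // (the freed-token count is a hypothetical stand-in).
    assert!(!apply_summary(2_755, 112)); // rejected as counterproductive

    let mut state = CompactionState { cooldown_turns: 0, gave_up: false };
    after_compaction(&mut state, 12_217, 18_000, 0.65);
    assert!(state.gave_up); // 12_217 is still above the 11_700 trigger point
}
```

Rejecting oversized summaries is the cheapest of the three fixes, but the give-up flag is what actually breaks the infinite loop when the budget is unsatisfiable to begin with.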