[Feature]: Re-budget the context compressor when a router serves a different backend per request

### Problem or Use Case

When the configured model is a router id (`openrouter/auto`, `:free`-suffixed names, or a model fallback chain), the router can pick a different backend for each request, and each backend has a different context window. Hermes budgets compression once, against the configured model id. So when the router serves a smaller-window backend, the agent can exceed the real limit before compaction triggers. When it serves a larger-window backend, Hermes compacts earlier than necessary and wastes usable context. There is currently no way for the compaction budget to follow the backend that actually served a given call.
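To make the failure mode concrete, here is a small arithmetic sketch. The 80% trigger and both window sizes are hypothetical numbers chosen for illustration; they are not Hermes defaults or real backend limits.

```python
def compaction_trigger(context_length: int, trigger_percent: int = 80) -> int:
    """Token count at which compaction would start (percentage is hypothetical)."""
    return context_length * trigger_percent // 100


configured_window = 200_000  # hypothetical window assumed for the configured router id
served_window = 32_000       # hypothetical window of the backend that actually answered

# Budgeted once against the configured id, the trigger never moves.
static_trigger = compaction_trigger(configured_window)

# Budgeted against the backend that served the call, it would sit much lower.
adaptive_trigger = compaction_trigger(served_window)

# The static trigger lies above the served backend's hard limit, so the
# conversation can overflow before compaction ever fires.
overflows_first = static_trigger > served_window
```

The opposite case is symmetric: a trigger computed for a small configured window fires long before a larger served backend is anywhere near full.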

### Proposed Solution

Read the `model` field from each successful API response (`response.model`). When it differs from the backend that served the previous call, look up the new backend's `context_length` and re-budget the compressor thresholds (trigger point, tail budget, summary cap) against the real window. Gate this behind a config key, `compression.adaptive_context_window`, defaulting to `false`, so it is a no-op for anyone who has not opted in. The cost is at most one `model_metadata` lookup per backend transition, and because lookups are cached on disk, repeated transitions to the same backend are free.
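A minimal sketch of the flow described above, assuming a compressor object that exposes its three thresholds. The names `CompressorBudget`, `budget_for`, `AdaptiveCompressor`, and `on_response`, as well as the ratios, are hypothetical and do not reflect the actual Hermes code. The `lookup` callable stands in for the cached `model_metadata` lookup.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompressorBudget:
    context_length: int
    trigger_tokens: int      # compaction starts at this many tokens
    tail_tokens: int         # recent tokens kept verbatim
    summary_cap_tokens: int  # upper bound on the summary size


def budget_for(context_length: int) -> CompressorBudget:
    # Hypothetical ratios, chosen only to make the sketch concrete.
    return CompressorBudget(
        context_length=context_length,
        trigger_tokens=context_length * 80 // 100,
        tail_tokens=context_length * 20 // 100,
        summary_cap_tokens=context_length * 5 // 100,
    )


class AdaptiveCompressor:
    def __init__(
        self,
        configured_context_length: int,
        lookup: Callable[[str], Optional[int]],
        enabled: bool = False,  # mirrors compression.adaptive_context_window
    ) -> None:
        self.enabled = enabled
        self._lookup = lookup
        self._served_model: Optional[str] = None
        self.budget = budget_for(configured_context_length)

    def on_response(self, served_model: Optional[str]) -> bool:
        """Call with response.model after each successful request.

        Returns True when the thresholds were re-budgeted.
        """
        if not self.enabled or not served_model:
            return False
        if served_model == self._served_model:
            return False  # same backend as last call: no lookup needed
        self._served_model = served_model
        context_length = self._lookup(served_model)
        if context_length is None or context_length == self.budget.context_length:
            return False  # unknown or unchanged window: keep current budget
        self.budget = budget_for(context_length)
        return True
```

A usage sketch with made-up backend ids and window sizes:

```python
windows = {"vendor/small": 32_000, "vendor/large": 200_000}  # hypothetical metadata
compressor = AdaptiveCompressor(128_000, windows.get, enabled=True)
compressor.on_response("vendor/small")   # re-budgets to the 32k window
compressor.on_response("vendor/small")   # same backend, nothing to do
```

With `enabled=False`, `on_response` returns immediately and the budget stays pinned to the configured window, which matches the opt-in behavior proposed here.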

### Alternatives Considered


Static per-model config overrides handle the case where you know the backend ahead of time, and manual re-probes cover explicit events. Several open PRs already take those routes: #24495 (per-model `context_length` and `provider_routing` overrides), #37548 (respect `model.context_length` config), #36199 (persist resolved `context_length` after `/model` switch), #31067 (reinitialize the compressor on `/new`), and #31492 (re-probe on session reset). None of them can see a silent per-request router swap, because config does not know which backend the router chose and there is no explicit switch event to hook. The only signal of what actually served a call is `response.model`. This is meant to be complementary to those PRs, not a replacement: config sets your intended defaults, this corrects the budget at runtime when the router ignores them.

### Feature Type

Performance / reliability

### Scope

Medium (few files, < 300 lines)

### Contribution

- [x] I'd like to implement this myself and submit a PR

### Debug Report (optional)

```shell

```

[Feature]: Re-budget the context compressor when a router serves a different backend per request #37719
