Improve context compression retry/fallback for incomplete chunked reads #18458

## Bug Description

During long-running Hermes Agent sessions, automatic context compression can fail when the auxiliary summary request is interrupted by a transient streaming/network error such as:

```text
peer closed connection without sending complete message body (incomplete chunked read)
```

When this happens, `agent/context_compressor.py` inserts a static fallback context marker and removes the middle conversation turns without a real summary. The session survives, but the compaction is lossy and the next assistant may need to recover context from files, logs, or session search.

## Observed Behavior

In a real long-running gateway session, logs showed repeated compression attempts and intermittent failures:

```text
Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read). Further summary attempts paused for 60 seconds.
Summary generation failed — inserting static fallback context marker
Auxiliary compression: using auto (...) at ...
```

This is more likely in very long, tool-heavy tasks because the compression prompt can be large and the auxiliary request stays open for a long time while the summary is generated.

## Expected Behavior

Transient compression-summary failures should be retried and/or routed through fallback providers before Hermes drops the compressed middle turns without a real summary.

Suggested behavior:

1. Classify common premature-response/streaming close errors as connection errors, including strings such as:
   - `incomplete chunked read`
   - `peer closed connection`
   - `unexpected eof`
   - `response ended prematurely`
   - `connection was closed`
2. Add a short retry policy around compression summary generation, e.g. 1-2 retries with small exponential backoff for transient network/timeout/read errors.
3. If `auxiliary.compression.provider` is `auto`, allow the existing auxiliary provider fallback chain to run for those transient errors.
4. Only insert the static fallback marker after retry/fallback attempts are exhausted.
5. Log enough detail to distinguish:
   - summary retry succeeded
   - fallback provider succeeded
   - final static marker fallback was used
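
The suggested behavior above could look roughly like the following. This is a minimal sketch, not the existing Hermes code: `is_transient_error`, `summarize_with_retry`, the `providers` list of zero-argument callables, and the default retry count and delay are all hypothetical names and values chosen for illustration. It also assumes that non-transient errors should propagate unchanged to the caller.

```python
import logging
import time
from typing import Callable, Optional, Sequence

logger = logging.getLogger("compression_sketch")

# Substrings from the list above; matched case-insensitively.
TRANSIENT_MARKERS = (
    "incomplete chunked read",
    "peer closed connection",
    "unexpected eof",
    "response ended prematurely",
    "connection was closed",
)


def is_transient_error(exc: BaseException) -> bool:
    """Hypothetical classifier for premature-close and network errors."""
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    message = str(exc).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def summarize_with_retry(
    providers: Sequence[Callable[[], str]],
    max_retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try each provider in order, retrying transient failures with backoff.

    Returns the summary, or None once every retry and fallback is exhausted;
    the caller would insert the static fallback marker only in that case.
    """
    for index, provider in enumerate(providers):
        for attempt in range(max_retries + 1):
            try:
                summary = provider()
            except Exception as exc:
                if not is_transient_error(exc):
                    raise  # assumption: non-transient errors are not retried
                logger.warning(
                    "summary attempt failed (provider %d, attempt %d): %s",
                    index, attempt + 1, exc,
                )
                if attempt < max_retries:
                    sleep(base_delay * 2 ** attempt)  # 0.5s, 1.0s, ...
                continue
            if index > 0:
                logger.info("fallback provider %d succeeded", index)
            elif attempt > 0:
                logger.info("summary retry succeeded after %d retries", attempt)
            return summary
    logger.warning("all summary attempts failed; using static fallback marker")
    return None
```

The three log messages correspond to the three outcomes in item 5, so a log reader can tell a recovered summary from a lossy compaction. The `sleep` parameter is injectable only so that tests can run without real delays.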

## Why This Matters

The current static marker is better than crashing, but it is still lossy. In long tasks that run for hours, losing the middle handoff summary can make the agent repeat work, miss decisions, or require manual recovery.

## Relevant Code Areas

- `agent/context_compressor.py`
  - `_generate_summary(...)`
  - `compress(...)` static fallback marker path
- `agent/auxiliary_client.py`
  - `_is_connection_error(...)`
  - `call_llm(...)` retry/fallback path

## Possible Tests

- Unit test that `_is_connection_error()` returns `True` for `incomplete chunked read`, `peer closed connection`, and premature-EOF strings.
- Compression test where the first summary call raises a transient incomplete-chunked-read error and the second call succeeds; assert no static fallback marker is inserted.
- Compression test where auto provider A raises a transient connection error and provider B succeeds; assert the real summary is used.
- Compression test where all retry/fallback attempts fail; assert the existing static fallback marker is still inserted.
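
As a rough illustration of the second and fourth tests, the sketch below uses `unittest.mock.Mock` with a `side_effect` list to simulate a summary call that fails once and then succeeds. `compress_middle` and `STATIC_MARKER` are hypothetical stand-ins written only to keep the example self-contained; a real test would exercise `compress(...)` in `agent/context_compressor.py` and its actual marker text instead.

```python
import unittest
from unittest import mock

STATIC_MARKER = "[summary unavailable]"  # hypothetical placeholder text
TRANSIENT = RuntimeError(
    "peer closed connection without sending complete message body "
    "(incomplete chunked read)"
)


def compress_middle(summarize, retries=1):
    """Hypothetical stand-in: return a real summary or the static marker."""
    for _ in range(retries + 1):
        try:
            return summarize()
        except RuntimeError as exc:
            if "incomplete chunked read" not in str(exc).lower():
                raise  # only the transient error is retried in this sketch
    return STATIC_MARKER


class CompressionRetryTests(unittest.TestCase):
    def test_second_attempt_succeeds(self):
        # First call raises the transient error, second returns a summary.
        summarize = mock.Mock(side_effect=[TRANSIENT, "real summary"])
        self.assertEqual(compress_middle(summarize), "real summary")
        self.assertEqual(summarize.call_count, 2)

    def test_marker_after_all_attempts_fail(self):
        # Every attempt fails, so the static marker is still inserted.
        summarize = mock.Mock(side_effect=[TRANSIENT, TRANSIENT])
        self.assertEqual(compress_middle(summarize), STATIC_MARKER)
        self.assertEqual(summarize.call_count, 2)
```

The provider-fallback test would follow the same pattern with two mocks, one raising the transient error and one returning a summary.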

## Environment Notes

This was observed in a gateway/Feishu long-running workflow with `compression.enabled: true` and `auxiliary.compression.provider: auto`. The underlying cause is likely transient upstream/proxy/network interruption, but Hermes can make this path much more robust.
