Compression fallback marker after incomplete chunked read loses useful context in long sessions

## Summary

Context compression can fail when the auxiliary compression API call is cut off by an incomplete chunked read. When that happens, Hermes inserts a fallback context marker instead of a real summary:

```text
⚠️ Compression summary failed: peer closed connection without sending complete message body (incomplete chunked read). Inserted a fallback context marker.
```

This is especially visible in long Telegram sessions, where context compaction runs frequently.

## Observed log evidence

Local logs show repeated failures from auxiliary compression:

```text
agent.auxiliary_client: Auxiliary compression: using auto (gpt-5.5) at https://chatgpt.com/backend-api/codex/
WARNING root: Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read). Further summary attempts paused for 60 seconds.
```

The failure recurred four times within about half an hour in one long-running Telegram workflow:

```text
2026-04-28 01:52:55 WARNING Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read).
2026-04-28 02:14:57 WARNING Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read).
2026-04-28 02:20:45 WARNING Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read).
2026-04-28 02:23:20 WARNING Failed to generate context summary: peer closed connection without sending complete message body (incomplete chunked read).
```

## User impact

This is not usually data-destructive, but it is operationally serious for long sessions:

- context is compacted without a useful generated summary;
- the fallback marker preserves that something happened, but useful prior-turn details can be lost;
- long Telegram sessions become less reliable exactly when compaction is needed most.

## Local mitigation tried

I applied local config mitigations to reduce frequency/severity:

```yaml
auxiliary:
  compression:
    timeout: 360

compression:
  threshold: 0.55
```

The longer `timeout` should give the compression call more time, and the lower `threshold` should trigger compaction earlier, with smaller context chunks. Neither setting addresses the underlying bug.

## Suggested fix direction

Compression should handle `incomplete chunked read`/peer-closed transport failures more robustly:

1. Treat an incomplete chunked read as retryable for auxiliary compression, rather than finalizing the fallback marker immediately.
2. Retry with backoff before inserting the fallback marker.
3. If the primary auxiliary provider fails, try the configured fallback provider/model, if one is available.
4. Consider a smaller emergency compression prompt or a chunked summarization fallback before giving up.
5. Improve the fallback marker to include a minimal deterministic local summary (for example message count, timestamp range, and the last N user/assistant snippets), so continuity loss is less severe.
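
The steps above could be combined roughly as in the sketch below. It is illustrative only and does not reflect the actual Hermes code: the function names, the `providers` list of callables, and the message shape (`role`, `content`, `ts` keys) are all assumptions made for the example. The only detail taken from this issue is the error text used to classify a failure as retryable.

```python
import time
from typing import Callable, Sequence

# Substrings taken from the log lines in this issue. Treating them as
# retryable is the proposal here, not a description of current behavior.
RETRYABLE_MARKERS = ("incomplete chunked read", "peer closed connection")


def is_retryable(exc: Exception) -> bool:
    """Return True for transport failures that look transient."""
    text = str(exc).lower()
    return any(marker in text for marker in RETRYABLE_MARKERS)


def local_digest(messages: Sequence[dict], last_n: int = 4, width: int = 80) -> str:
    """Build a deterministic marker: count, time range, last N snippets.

    The message schema (role/content/ts) is hypothetical.
    """
    if not messages:
        return "[context compacted: no messages]"
    head = (
        f"[context compacted without model summary: {len(messages)} messages, "
        f"{messages[0]['ts']} to {messages[-1]['ts']}]"
    )
    tail = [f"{m['role']}: {m['content'][:width]}" for m in messages[-last_n:]]
    return "\n".join([head, *tail])


def summarize_with_fallbacks(
    messages: Sequence[dict],
    providers: Sequence[Callable[[Sequence[dict]], str]],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each provider with exponential backoff, then fall back locally."""
    for summarize in providers:
        for attempt in range(retries):
            try:
                return summarize(messages)
            except Exception as exc:
                if not is_retryable(exc):
                    break  # non-transient failure: move on to the next provider
                if attempt < retries - 1:
                    sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    # Every provider failed: emit the deterministic local digest instead
    # of a bare marker.
    return local_digest(messages)
```

In this shape, the primary auxiliary model, a configured fallback model, and a smaller emergency prompt (step 4) would each be one entry in `providers`, tried in order. The `sleep` parameter is injectable only so the backoff can be tested without waiting.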

## Environment notes

- Gateway platform: Telegram
- Auxiliary compression provider: `auto`, resolving to the main `openai-codex` provider against `https://chatgpt.com/backend-api/codex/`
- Model observed: `gpt-5.5`

