Gateway can reuse parent-sized history after compression split, causing infinite preflight compression #25921

@bizyumov

Summary

A gateway session can enter an infinite preflight compression loop after a successful compression-induced session split.

Compression writes a small child transcript, and the turn ends with the child session id. However, the next inbound gateway turn can still receive the parent-sized history. That immediately triggers preflight compression again, creating another child, and the loop repeats.

This is related to #20470, but the observed failure is broader than a stale Telegram topic binding. In the failing trace, the next turn is logged as running under the compressed child session id while still receiving the original parent-sized history.

It is also related to #25242, but this issue is specifically about compression route/session/history publication after a split, not auto-continue tool-tail replay.

Environment shape

Configuration shape:

compression:
  enabled: true
  threshold: 0.5

auxiliary:
  compression:
    provider: openai-codex
    model: gpt-5.5
    timeout: 240

The loop still reproduces with an explicit compression provider and model, so neither the auxiliary.compression.provider: auto fallback nor GLM is the direct cause.

Anonymized observed trace

Anonymized log shape from a gateway session:

T0 [S0] conversation turn: history=147
T0 [S0] Preflight compression: ~138k tokens >= threshold
T0 [S0] context compression done: session=S1 messages=148->8 tokens=~32k
T0 [S0] Turn ended ... session=S1
T0 gateway.run: Session split detected: S0 -> S1 (compression)

T1 [S1] conversation turn: history=147
T1 [S1] Preflight compression: ~138k tokens >= threshold
T1 [S1] context compression done: session=S2 messages=148->8 tokens=~33k
T1 [S1] Turn ended ... session=S2
T1 gateway.run: Session split detected: S0 -> S2 (compression)

T2 [S2] conversation turn: history=147
T2 [S2] Preflight compression: ~138k tokens >= threshold

On disk, the compressed child transcripts were small, roughly:

S1.jsonl: ~19 messages
S2.jsonl: ~11 messages

The key anomaly is:

[S1] conversation turn: history=147

while the S1 transcript on disk is compact.

The second anomaly is:

Session split detected: S0 -> S2 (compression)

after the previous split was already S0 -> S1. The old side of the second split should be S1, not the original ancestor S0.
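
The second anomaly can be checked mechanically against a log. The following is a minimal sketch that assumes only the "Session split detected" line shape shown in the anonymized trace; the function name find_broken_splits is hypothetical and not part of the gateway.

```python
import re

# Matches the split line shape from the anonymized trace above.
SPLIT_RE = re.compile(r"Session split detected: (\S+) -> (\S+) \(compression\)")


def find_broken_splits(log_lines):
    """Return (line, expected_old, actual_old) for splits whose old side is not the current tip."""
    tip = None  # session id the route should currently point at
    broken = []
    for line in log_lines:
        match = SPLIT_RE.search(line)
        if not match:
            continue
        old, new = match.groups()
        if tip is not None and old != tip:
            broken.append((line, tip, old))
        tip = new
    return broken


trace = [
    "T0 gateway.run: Session split detected: S0 -> S1 (compression)",
    "T1 gateway.run: Session split detected: S0 -> S2 (compression)",
]
for line, expected, actual in find_broken_splits(trace):
    print(f"expected old={expected}, got old={actual}: {line}")
```

On the two split lines from the trace, this flags the second one, because the route tip after the first split is S1 but the logged old side is S0.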

Reproduction steps

No personal data is required.

  1. Enable gateway-managed sessions with preflight compression.
  2. Use interrupt-capable gateway input mode, but do not rely on interrupts for the core reproduction.
  3. Create a gateway session with transcript size above the compression threshold.
  4. Send an inbound gateway message so preflight compression creates child session S1.
  5. Send another inbound message on the same gateway route.
  6. Observe that the gateway may log the child session id while still passing the parent-sized history into the agent.
  7. Preflight compression fires again and creates S2.
  8. Repeat messages; each turn compresses again instead of using the compact child transcript.

A deterministic regression test can stub compression so S0 always compresses to a small S1, then assert the next gateway turn loads S1 history rather than S0 history.
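
A rough sketch of the shape such a test could take is below. ToyGateway, StubCompressor, THRESHOLD, and handle_inbound are all hypothetical stand-ins written for this example; a real test would drive the actual gateway runner and replace only the compressor with a stub.

```python
import itertools

THRESHOLD = 100  # hypothetical message-count limit standing in for the token threshold


class StubCompressor:
    """Always compresses to a small transcript; no provider or model is involved."""

    def compress(self, history):
        return ["summary"] + history[-7:]


class ToyGateway:
    """Hypothetical stand-in that models only the expected routing behavior."""

    def __init__(self, compressor):
        self.compressor = compressor
        self.transcripts = {"S0": [f"m{i}" for i in range(147)]}
        self.route = {"chat-1": "S0"}  # canonical route -> session id
        self._ids = itertools.count(1)
        self.seen = []  # (session_id, history_len) recorded at the start of each turn

    def handle_inbound(self, route_key, message):
        session_id = self.route[route_key]
        history = self.transcripts[session_id]
        self.seen.append((session_id, len(history)))
        if len(history) >= THRESHOLD:
            child = f"S{next(self._ids)}"
            self.transcripts[child] = self.compressor.compress(history)
            self.route[route_key] = child  # publish the child before the turn ends
            session_id = child
        self.transcripts[session_id].append(message)


def test_next_turn_loads_child_history():
    gw = ToyGateway(StubCompressor())
    gw.handle_inbound("chat-1", "first")
    gw.handle_inbound("chat-1", "second")
    assert gw.seen[0] == ("S0", 147)
    session_id, history_len = gw.seen[1]
    assert session_id == "S1"
    assert history_len < THRESHOLD
    assert set(gw.transcripts) == {"S0", "S1"}  # no second split happened


test_next_turn_loads_child_history()
```

The last assertion is the one that would fail under the reported behavior, since a second preflight compression would create S2.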

Actual behavior

  • Compression succeeds and writes a compact child transcript.
  • The next gateway turn can still receive parent-sized history.
  • The route/session split log can continue to treat the original parent as the old session even after the route should have advanced to the child.
  • The gateway repeatedly preflight-compresses already-compressed turns.

Expected behavior

After S0 -> S1 compression:

  • the canonical gateway route/session entry points to S1;
  • any channel/topic binding that pointed to S0 is advanced to S1 or resolved through the compression tip;
  • cached agent state is either consistent with S1 or evicted/rebuilt;
  • the next inbound turn loads compact S1 history;
  • if S1 later compresses again, the split is logged and applied as S1 -> S2.

Suspected root cause

Compression advances AIAgent.session_id, but the gateway does not publish the compression child as the canonical route/session/history source atomically.

Likely interacting problems:

  • gateway route/session entry can lag behind agent.session_id;
  • channel-specific bindings can point at an old compression ancestor;
  • cached AIAgent reuse can pair one session id with another session's loaded history;
  • split handling mutates only part of the routing state after agent.run_conversation(...);
  • inbound route resolution does not consistently follow compression descendants before loading transcript history.
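
To illustrate how a lagging route entry alone could produce the observed trace, here is a toy model. It is an assumption about the mechanism, not the gateway's actual code: ToyAgent, run_turns, and publish_child are hypothetical names, and only AIAgent.session_id and run_conversation are taken from the description above.

```python
class ToyAgent:
    """Hypothetical stand-in for AIAgent: compression advances session_id."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.splits = 0

    def run_conversation(self, history, threshold=100):
        if len(history) >= threshold:
            self.splits += 1
            self.session_id = f"S{self.splits}"  # child id after compression
            return history[-8:]  # compact child transcript
        return history


def run_turns(publish_child, turns=3):
    """Return (logged_session_id, history_len) per turn."""
    transcripts = {"S0": ["m"] * 147}
    route_session = "S0"    # session the gateway loads history from
    agent = ToyAgent("S0")  # cached and reused across turns
    log = []
    for _ in range(turns):
        history = transcripts[route_session]
        log.append((agent.session_id, len(history)))
        transcripts[agent.session_id] = agent.run_conversation(history)
        if publish_child:
            route_session = agent.session_id  # route follows the child
    return log


print(run_turns(publish_child=False))
print(run_turns(publish_child=True))
```

With publish_child=False the model logs child session ids S1 and S2 while every turn still loads 147 messages, which matches the shape of the trace. With publish_child=True the second and third turns load the 8-message child transcript and no further split occurs.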

Fix direction

Add a single gateway-side compression route publication path.

Suppose a turn starts with canonical route session old_session_id. If, after agent.run_conversation(...) returns, agent.session_id == new_session_id != old_session_id and the old session ended by compression, the gateway should atomically:

  1. update the canonical SessionStore entry to new_session_id;
  2. update channel/topic bindings that still point to old_session_id;
  3. update or invalidate cached AIAgent state so the next turn cannot combine new_session_id with old history;
  4. ensure the next transcript load uses new_session_id;
  5. log the split as old_session_id -> new_session_id, where old_session_id is the actual route session for the completed turn, not the original ancestor.
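
The five steps above could be grouped into one function guarded by a lock, roughly as sketched here. RouteState and publish_compression_split are hypothetical names, and the plain dicts stand in for the real SessionStore, binding table, and agent cache, whose interfaces are not shown in this issue.

```python
import logging
import threading
from dataclasses import dataclass, field

log = logging.getLogger("gateway.run")


@dataclass
class RouteState:
    """Hypothetical container for the routing state listed above."""
    sessions: dict = field(default_factory=dict)     # route key -> canonical session id
    bindings: dict = field(default_factory=dict)     # channel/topic -> session id
    agent_cache: dict = field(default_factory=dict)  # route key -> cached agent
    lock: threading.Lock = field(default_factory=threading.Lock)


def publish_compression_split(state, route_key, old_session_id, new_session_id):
    """Apply every routing update for one compression split under a single lock."""
    if new_session_id == old_session_id:
        return False
    with state.lock:
        # 1. Canonical route/session entry.
        state.sessions[route_key] = new_session_id
        # 2. Bindings that still point at the old session.
        for binding, session_id in list(state.bindings.items()):
            if session_id == old_session_id:
                state.bindings[binding] = new_session_id
        # 3. Evict the cached agent so it cannot carry old history forward.
        state.agent_cache.pop(route_key, None)
        # 4. The next transcript load reads state.sessions, so it now sees new_session_id.
        # 5. Log the split from the route session of the completed turn.
        log.info("Session split detected: %s -> %s (compression)",
                 old_session_id, new_session_id)
    return True
```

Evicting the cached agent is the simpler of the two options in step 3. Updating it in place would avoid a rebuild, but it requires the agent to reload its history for the new session id.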

Also harden inbound route resolution:

  • if a binding points to a session with a compression child/tip, advance to the tip before loading history;
  • do not call explicit session-switch logic merely to follow compression lineage;
  • reuse a cached agent only when cached_agent.session_id == canonical_session_id, otherwise evict/rebuild and warn.
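
A minimal sketch of the first and third points follows. It assumes the gateway can look up a session's compression child; the compression_child mapping, resolve_compression_tip, agent_for_turn, and build_agent are hypothetical names used only for illustration.

```python
import logging
from types import SimpleNamespace

log = logging.getLogger("gateway.route")


def resolve_compression_tip(session_id, compression_child):
    """Follow parent -> child links until reaching a session with no compression child."""
    seen = {session_id}
    while session_id in compression_child:
        session_id = compression_child[session_id]
        if session_id in seen:
            raise ValueError(f"cycle in compression lineage at {session_id}")
        seen.add(session_id)
    return session_id


def agent_for_turn(agent_cache, route_key, canonical_session_id, build_agent):
    """Reuse the cached agent only if its session id matches; otherwise rebuild and warn."""
    cached = agent_cache.get(route_key)
    if cached is not None:
        if cached.session_id == canonical_session_id:
            return cached
        log.warning("evicting cached agent: cached=%s canonical=%s",
                    cached.session_id, canonical_session_id)
    agent = build_agent(canonical_session_id)
    agent_cache[route_key] = agent
    return agent


# Usage with toy data: a binding still points at S0 after two splits.
lineage = {"S0": "S1", "S1": "S2"}
canonical = resolve_compression_tip("S0", lineage)
cache = {"chat-1": SimpleNamespace(session_id="S0")}
agent = agent_for_turn(cache, "chat-1", canonical,
                       lambda sid: SimpleNamespace(session_id=sid))
print(canonical, agent.session_id)
```

Resolving the tip is a pure lookup here, so it stays separate from any explicit session-switch logic, in line with the second point.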

Regression tests requested

Please add tests for:

  1. next turn after S0 -> S1 compression receives compact S1 history;
  2. a later S1 -> S2 split advances from S1, not S0;
  3. channel/topic binding follows the compression tip before transcript load;
  4. cached agent/session mismatch cannot pair one session id with another session's history;
  5. route publication works with a stub compressor and does not depend on a specific auxiliary provider/model.


Labels: P1 (High: major feature broken, no workaround), comp/gateway (Gateway runner, session dispatch, delivery), sweeper:implemented-on-main (Sweeper: behavior already present on current main), type/bug (Something isn't working)
