fix: time travel replay/fork for graphs with interrupts and subgraphs #2325
Closed
open-swe[bot] wants to merge 1 commit into
Ports Python PRs #7038, #7115, #7498 to fix:
- Stale RESUME writes during replay (interrupts returned cached answers)
- Wrong subgraph checkpoint loading during time travel
- Missing fork checkpoint on time travel (resumes loaded wrong state)

Adds ReplayState class, isReplaying property, eager fork checkpoint creation, and direct-to-subgraph time travel detection.

Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
@langchain/langgraph-checkpoint
@langchain/langgraph-checkpoint-mongodb
@langchain/langgraph-checkpoint-postgres
@langchain/langgraph-checkpoint-redis
@langchain/langgraph-checkpoint-sqlite
@langchain/langgraph-checkpoint-validation
create-langgraph
@langchain/langgraph-api
@langchain/langgraph-cli
@langchain/langgraph
@langchain/langgraph-cua
@langchain/langgraph-supervisor
@langchain/langgraph-swarm
@langchain/langgraph-ui
@langchain/langgraph-sdk
@langchain/angular
@langchain/react
@langchain/svelte
@langchain/vue
Closing in favor of #2179
Christian Bromann (christian-bromann) added a commit that referenced this pull request Jun 10, 2026
## Summary

Ports Python time-travel fixes ([#7038](langchain-ai/langgraph#7038), [#7115](langchain-ai/langgraph#7115), [#7498](langchain-ai/langgraph#7498), [#7499](langchain-ai/langgraph#7499)) into `@langchain/langgraph` so replay/fork behave correctly with interrupts and nested subgraphs.

- **Stale `RESUME` on replay**: Replaying from a checkpoint before an interrupt no longer consumes cached resume writes; interrupts re-fire with the correct payload.
- **Subgraph checkpoint loading on time travel**: Introduces `ReplayState` (`CONFIG_KEY_REPLAY_STATE`) so nested subgraphs load the checkpoint that existed at the replay point on first visit, then resume normal head loading within the same run.
- **Parent fork checkpoints on replay**: Time travel runs through `PregelLoop._first()` (not `stream()` delegation on the parent `Pregel`), creating an eager `source: "fork"` checkpoint and propagating `ReplayState` to subgraphs.
- **Direct-to-subgraph time travel**: `getState()` subgraph delegation is guarded with `CONFIG_KEY_READ`; direct subgraph configs strip stale `RESUME` writes and prefer explicit `checkpoint_id` over `checkpoint_map` when both are set.
- **Streaming**: Fixes subgraph interrupt namespace when streaming with `subgraphs: true` (empty `checkpoint_ns` no longer becomes `[""]`; parent emits interrupts under the deepest `checkpoint_map` namespace).

Closes #2325 (supersedes the earlier partial port).
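The empty-namespace case in the Streaming item is easy to reproduce in isolation: splitting an empty string always yields one empty segment. The following is a minimal sketch of the guard, not the code from this commit. The helper name `namespaceToPath` and the `"|"` separator are assumptions made for illustration.

```typescript
// Assumed separator between nested subgraph namespace segments.
const NS_SEPARATOR = "|";

// Hypothetical helper: convert a checkpoint namespace string into a path.
// The root graph has an empty namespace, which must map to an empty path
// rather than to a path containing a single empty segment.
function namespaceToPath(checkpointNs: string): string[] {
  return checkpointNs === "" ? [] : checkpointNs.split(NS_SEPARATOR);
}

// Naive version, shown only to illustrate the bug class.
function naiveNamespaceToPath(checkpointNs: string): string[] {
  return checkpointNs.split(NS_SEPARATOR);
}
```

With the guard, the root namespace produces `[]`, while the naive split produces `[""]`, which a stream consumer would read as a subgraph with an empty name.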
### Implementation notes

| Area | Change |
|------|--------|
| `pregel/replay.ts` | New `ReplayState` class (mirrors Python) |
| `pregel/loop.ts` | Replay/time-travel detection, fork creation, `RESUME` stripping, `ReplayState` wiring, stream namespace helpers |
| `pregel/index.ts` | `getState` subgraph delegation guard only (removed `stream()` bypass that skipped parent fork creation) |
| Tests | `time_travel.test.ts` (14), `time_travel_extended.test.ts` (33), shared `time_travel_helpers.ts`, Vitest matchers `toBeInterrupted` / `toHaveInterruptValue` |

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
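The `RESUME` stripping listed for `pregel/loop.ts` can be pictured as a filter over the pending writes loaded with a checkpoint. The sketch below is an illustration under stated assumptions, not the loop's actual code: the `[taskId, channel, value]` tuple shape, the `"__resume__"` channel name, and the `LoopMode` flags are all hypothetical stand-ins for the loop's internal state.

```typescript
// Assumed shape of a pending write: [taskId, channel, value].
type PendingWrite = [string, string, unknown];

// Assumed name of the channel that carries resume values.
const RESUME_CHANNEL = "__resume__";

// Hypothetical flags describing how the run was started.
interface LoopMode {
  isReplaying: boolean; // started from an explicit checkpoint
  isGenuineResume: boolean; // started with a resume command
  isDirectSubgraphTimeTravel: boolean; // subgraph targeted directly
}

// Drop cached resume values when the run is a replay, so interrupts fire
// again. A genuine resume keeps them, and direct-to-subgraph time travel
// strips them unconditionally.
function filterPendingWrites(
  writes: PendingWrite[],
  mode: LoopMode
): PendingWrite[] {
  const strip =
    mode.isDirectSubgraphTimeTravel ||
    (mode.isReplaying && !mode.isGenuineResume);
  return strip
    ? writes.filter(([, channel]) => channel !== RESUME_CHANNEL)
    : writes;
}
```

Only writes on the resume channel are dropped; writes to other channels pass through unchanged in every mode.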
Description
Ports three Python PRs (#7038, #7115, #7498) that fix time travel (replay and fork) for graphs with interrupts and subgraphs.
Problems fixed
- ... `RESUME` values from prior `interrupt()` calls, so interrupts silently returned stale answers instead of re-firing.
- ... `checkpoint_map` wasn't detected, causing stale `RESUME` writes to be preserved instead of stripped.
- ... `_putCheckpoint()` ran, no new checkpoint was created. The parent's "latest" checkpoint remained the old one, so subsequent `Command(resume=...)` calls loaded the wrong state.

Changes
- `constants.ts`: Added `CONFIG_KEY_REPLAY_STATE` constant and added it to the `RESERVED` set.
- `pregel/replay.ts` (new): `ReplayState` class that tracks which subgraph namespaces have already loaded their pre-replay checkpoint. On first visit to a subgraph namespace, it loads the latest checkpoint created before the replay point (via `checkpointer.list(..., before=..., limit=1)`). On subsequent visits (e.g. the same subgraph in a later loop iteration), it falls back to normal latest-checkpoint loading. The task-id suffix is stripped from namespaces so the same logical subgraph is recognized across loop iterations.
- `pregel/loop.ts`:
  - `isReplaying` getter (equivalent to Python's `is_replaying`): returns `true` when `skipDoneTasks` is `false` (i.e., replaying from a specific checkpoint).
  - ... `interrupt()` re-fires instead of returning old values. Genuine resumes (`Command(resume=...)`) preserve these writes.
  - ... `checkpoint_map`, it's detected as time-travel and `RESUME` writes are force-stripped even when `CONFIG_KEY_RESUMING` is set by the parent.
  - ... `update_state` fork), it eagerly writes a fork checkpoint (`source="fork"`) at the start of the tick. This ensures the parent thread's latest checkpoint points to the replayed state and subsequent resumes find the correct checkpoint. Stale `INTERRUPT` pending writes are also cleared.
  - ... `ReplayState` instance and passes it to subgraphs via config. For forks (`source=update/fork`), the replay state uses the fork's parent checkpoint ID.
  - ... `ReplayState` in its config (and no explicit `checkpoint_id`), it delegates checkpoint loading to `ReplayState.getCheckpoint()` instead of using the default `getTuple`. It also clears `CONFIG_KEY_RESUMING` so the subgraph re-applies input naturally.
- `tests/python_port/checkpoint.test.ts`: Updated existing `test_running_from_checkpoint_id_retains_previous_writes` to account for the new fork checkpoint appearing in history (history length increases by 2 instead of 1).

Tests
New test file `time_travel.test.ts` with 12 tests covering:
PR #7038 tests:
PR #7115 tests:
PR #7498 tests:
Test Plan
Opened collaboratively by Sydney Runkle and open-swe.
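
For readers unfamiliar with the first-visit rule described for `ReplayState` under Changes, here is a minimal, self-contained sketch of that behaviour. It is not the class added in `pregel/replay.ts`: the `CheckpointerLike` interface, the in-memory store, the lexicographically ordered checkpoint IDs, and the `"|"` / `":"` namespace delimiters are all assumptions chosen to keep the example runnable on its own.

```typescript
// Hypothetical minimal shapes; the real types live in the checkpoint package.
interface CheckpointLike {
  checkpointId: string;
  namespace: string;
}

interface CheckpointerLike {
  // Latest checkpoint in a namespace, optionally only those before `before`.
  latest(namespace: string, before?: string): CheckpointLike | undefined;
}

// Drop the ":<task-id>" suffix from each segment (assumed delimiters) so the
// same logical subgraph is recognized across loop iterations.
function stripTaskIds(namespace: string): string {
  return namespace
    .split("|")
    .map((segment) => segment.split(":")[0])
    .join("|");
}

class ReplayStateSketch {
  private readonly replayCheckpointId: string;
  private readonly visited = new Set<string>();

  constructor(replayCheckpointId: string) {
    this.replayCheckpointId = replayCheckpointId;
  }

  getCheckpoint(
    namespace: string,
    checkpointer: CheckpointerLike
  ): CheckpointLike | undefined {
    const key = stripTaskIds(namespace);
    if (!this.visited.has(key)) {
      // First visit: load the state as it was before the replay point.
      this.visited.add(key);
      return checkpointer.latest(namespace, this.replayCheckpointId);
    }
    // Later visits in the same run: normal latest-checkpoint loading.
    return checkpointer.latest(namespace);
  }
}

// In-memory stand-in for a checkpointer; IDs are assumed to sort by age.
class InMemoryCheckpointer implements CheckpointerLike {
  private byNamespace = new Map<string, string[]>();

  put(namespace: string, checkpointId: string): void {
    const ids = this.byNamespace.get(namespace) ?? [];
    ids.push(checkpointId);
    ids.sort();
    this.byNamespace.set(namespace, ids);
  }

  latest(namespace: string, before?: string): CheckpointLike | undefined {
    const ids = this.byNamespace.get(namespace) ?? [];
    const candidates =
      before === undefined ? ids : ids.filter((id) => id < before);
    const checkpointId = candidates[candidates.length - 1];
    return checkpointId === undefined
      ? undefined
      : { checkpointId, namespace };
  }
}
```

As a usage example, take a subgraph namespace holding checkpoints `"001"` and `"003"` and a replay point of `"002"`. The first `getCheckpoint` call returns `"001"`, and a second call for the same logical subgraph returns `"003"`.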