chore: node-level timeouts #7599
Merged
William FH (hinthornw) added a commit that referenced this pull request on Apr 27, 2026:
Reverts #7599. I am going to implement this as an `idle_timeout` instead. I think that's a better default behavior.
Christian Bromann (christian-bromann) added a commit to langchain-ai/langgraphjs that referenced this pull request on May 28, 2026:
Add a timeout option to addNode, the functional API (task/entrypoint), and the Send constructor. Accept a number of milliseconds (hard wall-clock cap) or a TimeoutPolicy { runTimeout, idleTimeout, refreshOn }.
Firing raises NodeTimeoutError(kind: run|idle) carrying node/elapsed/timeouts/kind. idleTimeout resets on observable progress (writes, stream-writer calls, child-task scheduling, callbacks) or runtime.heartbeat(). On fire, buffered task writes are dropped and the node's AbortSignal is aborted. The timer resets per retry attempt, and NodeTimeoutError is retryable under the default retry policy.
Ports langchain-ai/langgraph#7599, #7646, #7659.
Christian Bromann (christian-bromann) pushed a commit to langchain-ai/langgraphjs that referenced this pull request on Jun 10, 2026:
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @langchain/langgraph-checkpoint@1.1.0 ### Minor Changes - [#2452](#2452) [`a8e7659`](a8e7659) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add `DeltaChannel` and the writes-history saver API (beta). `DeltaChannel` is a reducer channel that stores only a sentinel in checkpoint blobs instead of the full accumulated value, reconstructing state on read by replaying ancestor writes through a batch reducer. This avoids re-serializing the entire accumulated value at every step (e.g. long message histories). - `DeltaChannel(reducer, { snapshotFrequency })` in `@langchain/langgraph` — count-based snapshot cadence (default `snapshotFrequency=1000`) plus a system bound `DELTA_MAX_SUPERSTEPS_SINCE_SNAPSHOT` (default 5000, env `LANGGRAPH_DELTA_MAX_SUPERSTEPS_SINCE_SNAPSHOT`). - `messagesDeltaReducer` — a batching-invariant messages reducer that coerces raw object/string writes, for use with `DeltaChannel`. - `BaseCheckpointSaver.getDeltaChannelHistory({ config, channels })` (beta) — walks the parent chain returning per-channel `{ writes, seed? }`, with a direct-storage override in `MemorySaver`. - `counters_since_delta_snapshot` added to `CheckpointMetadata`; `DeltaSnapshot` serialization support in the JSON+ serializer. Reconstruction is wired through the Pregel read/execution paths (initialization, `getState`, `updateState`, local reads) and `exit` durability accumulates and anchors delta writes so threads remain reconstructible without forcing snapshots. ### Patch Changes - [#2450](#2450) [`2f6d873`](2f6d873) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add node-level timeouts. 
A `timeout` option is now supported on `StateGraph.addNode`, the functional API (`task`/`entrypoint`), and the `Send` constructor. Pass a number of milliseconds for a hard wall-clock cap, or a `TimeoutPolicy` for finer control: ```ts import { TimeoutPolicy } from "@langchain/langgraph"; // hard wall-clock cap on each attempt builder.addNode("agent", agentFn, { timeout: 60_000 }); // full control builder.addNode("agent", agentFn, { timeout: { runTimeout: 60_000, // hard wall-clock cap, never refreshed idleTimeout: 10_000, // cap on time without observable progress refreshOn: "auto", // "auto" | "heartbeat" }, }); // per-task override new Send("agent", state, { timeout: { idleTimeout: 5_000 } }); ``` When a timeout fires, a `NodeTimeoutError` (carrying `node`, `kind` (`"run"`/`"idle"`), `timeout`, `elapsed`, `runTimeout`, `idleTimeout`) is raised, the attempt's buffered writes are dropped, and the node's `AbortSignal` is aborted. `idleTimeout` is refreshed by observable progress (writes, custom stream-writer calls, child-task scheduling, callback events) or an explicit `runtime.heartbeat()` call. The timer resets per retry attempt, and `NodeTimeoutError` is retryable under the default retry policy. Ports langchain-ai/langgraph#7599, [#7646](https://github.com/langchain-ai/langgraphjs/issues/7646), and [#7659](https://github.com/langchain-ai/langgraphjs/issues/7659). ## @langchain/langgraph@1.4.0 ### Minor Changes - [#2449](#2449) [`d12d269`](d12d269) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add cooperative, between-superstep graph draining via `RunControl`. A new `RunControl` (exported from `@langchain/langgraph`) exposes `requestDrain(reason)` plus read-only `drainRequested` / `drainReason`. Pass it through the new `control` option on `invoke` / `stream` / `streamEvents` (and the functional API). It is surfaced on `runtime.control`, so nodes can read it or call `requestDrain()` themselves, and it is propagated into subgraphs. 
When a drain is requested, the Pregel loop checks the flag at the top of each superstep (after the previous step's writes are applied and checkpointed): if more tasks remain it saves the checkpoint and throws the new `GraphDrained` error (also under `durability: "exit"`), so the run can be resumed later from the same config. If the graph naturally finishes on that tick it returns normally and the caller can inspect `control.drainRequested`. A drain requested inside a subgraph bubbles up and stops the parent at its next boundary. Draining never cancels work that is already running — pair it with an `AbortSignal` if you need a hard upper bound. - [#2452](#2452) [`a8e7659`](a8e7659) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add `DeltaChannel` and the writes-history saver API (beta). `DeltaChannel` is a reducer channel that stores only a sentinel in checkpoint blobs instead of the full accumulated value, reconstructing state on read by replaying ancestor writes through a batch reducer. This avoids re-serializing the entire accumulated value at every step (e.g. long message histories). - `DeltaChannel(reducer, { snapshotFrequency })` in `@langchain/langgraph` — count-based snapshot cadence (default `snapshotFrequency=1000`) plus a system bound `DELTA_MAX_SUPERSTEPS_SINCE_SNAPSHOT` (default 5000, env `LANGGRAPH_DELTA_MAX_SUPERSTEPS_SINCE_SNAPSHOT`). - `messagesDeltaReducer` — a batching-invariant messages reducer that coerces raw object/string writes, for use with `DeltaChannel`. - `BaseCheckpointSaver.getDeltaChannelHistory({ config, channels })` (beta) — walks the parent chain returning per-channel `{ writes, seed? }`, with a direct-storage override in `MemorySaver`. - `counters_since_delta_snapshot` added to `CheckpointMetadata`; `DeltaSnapshot` serialization support in the JSON+ serializer. 
Reconstruction is wired through the Pregel read/execution paths (initialization, `getState`, `updateState`, local reads) and `exit` durability accumulates and anchors delta writes so threads remain reconstructible without forcing snapshots. - [#2451](#2451) [`d65a920`](d65a920) Thanks [@christian-bromann](https://github.com/christian-bromann)! - feat(langgraph): add node-level error handlers `StateGraph.addNode(name, fn, { errorHandler })` now accepts a first-class node-level error handler. The handler runs ONLY after the failing node's `retryPolicy` is exhausted, so retry and handling stay decoupled. It receives a typed `NodeError { node, error }` and the typed node input state, can return a state update, and can route to a recovery branch via `new Command({ goto })` (saga / compensation flows). Failure provenance is checkpointed (via a reserved `ERROR_SOURCE_NODE` write) so handlers observe the same context after a checkpoint resume. Uncaught node errors without a handler still abort the run as before, and `GraphBubbleUp` errors (such as `interrupt()`) are never swallowed by a handler. `StateGraph.setNodeDefaults({ errorHandler })` now also accepts a graph-wide default handler. It is materialized at `compile()` as a single shared handler and invoked for every regular node that does not set its own `errorHandler`. A per-node handler always takes precedence, the default never catches a failure raised by an error-handler node itself (handler failures fail the run), and the default is not inherited by subgraphs. Ports the Python feature from langchain-ai/langgraph#7233. - [#2450](#2450) [`2f6d873`](2f6d873) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add node-level timeouts. A `timeout` option is now supported on `StateGraph.addNode`, the functional API (`task`/`entrypoint`), and the `Send` constructor. 
Pass a number of milliseconds for a hard wall-clock cap, or a `TimeoutPolicy` for finer control: ```ts import { TimeoutPolicy } from "@langchain/langgraph"; // hard wall-clock cap on each attempt builder.addNode("agent", agentFn, { timeout: 60_000 }); // full control builder.addNode("agent", agentFn, { timeout: { runTimeout: 60_000, // hard wall-clock cap, never refreshed idleTimeout: 10_000, // cap on time without observable progress refreshOn: "auto", // "auto" | "heartbeat" }, }); // per-task override new Send("agent", state, { timeout: { idleTimeout: 5_000 } }); ``` When a timeout fires, a `NodeTimeoutError` (carrying `node`, `kind` (`"run"`/`"idle"`), `timeout`, `elapsed`, `runTimeout`, `idleTimeout`) is raised, the attempt's buffered writes are dropped, and the node's `AbortSignal` is aborted. `idleTimeout` is refreshed by observable progress (writes, custom stream-writer calls, child-task scheduling, callback events) or an explicit `runtime.heartbeat()` call. The timer resets per retry attempt, and `NodeTimeoutError` is retryable under the default retry policy. Ports langchain-ai/langgraph#7599, [#7646](https://github.com/langchain-ai/langgraphjs/issues/7646), and [#7659](https://github.com/langchain-ai/langgraphjs/issues/7659). - [#2461](#2461) [`801d955`](801d955) Thanks [@christian-bromann](https://github.com/christian-bromann)! - Add `StateGraph.setNodeDefaults()` for setting graph-wide node policy defaults (`retryPolicy`, `cachePolicy`). Per-node values passed to `addNode` always take precedence, and defaults are resolved at `compile()` time so call order does not matter. Defaults are not inherited by subgraphs. Ports Python's `set_node_defaults()` (langchain-ai/langgraph#7747). ### Patch Changes - [#2179](#2179) [`01c67df`](01c67df) Thanks [@christian-bromann](https://github.com/christian-bromann)! 
- fix(core): time travel replay/fork for graphs with interrupts and subgraphs Ports Python fixes for stale RESUME writes during replay, wrong subgraph checkpoint loading during time travel, missing fork checkpoints on replay, and direct-to-subgraph time travel. - [#2514](#2514) [`9e0201d`](9e0201d) Thanks [@christian-bromann](https://github.com/christian-bromann)! - fix(schema): expose StateSchema JSON schemas for Studio introspection Route StateSchema runtime definitions through getJsonSchema() and getInputJsonSchema() so LangGraph Studio receives state, input, and context schemas when graphs use the StateSchema primitive. Fixes [#2466](#2466) - [#2471](#2471) [`9b96f60`](9b96f60) Thanks [@christian-bromann](https://github.com/christian-bromann)! - perf(core): skip debug checkpoint snapshots when not streaming them Avoid building full-state `mapDebugCheckpoint` payloads on every tick when no consumer subscribed to `checkpoints` or `debug` stream modes. v3 companion checkpoint envelopes are unchanged (they come from values metadata). - [#2472](#2472) [`8e06ace`](8e06ace) Thanks [@christian-bromann](https://github.com/christian-bromann)! - perf(core): index pending writes for O(1) task-prep lookups Build a PendingWritesIndex once per \_prepareNextTasks call so resume and skip-done-task checks avoid repeated linear scans over checkpointPendingWrites. - [#2473](#2473) [`a8b0036`](a8b0036) Thanks [@christian-bromann](https://github.com/christian-bromann)! - perf(core): optimize applyWrites, interrupt seen, and channel errors Reduce allocations in \_applyWrites, fix O(N²) interrupt versions_seen updates, skip stack traces on EmptyChannelError control flow, and cache task lists in the pregel loop and runner. - [#2444](#2444) [`4096933`](4096933) Thanks [@christian-bromann](https://github.com/christian-bromann)! 
- feat(remote): add RemoteGraph v3 streaming support Expose the v3 `streamEvents` surface for `RemoteGraph` by adapting remote SDK thread streams to the local `GraphRunStream` shape. - Updated dependencies \[[`a8e7659`](a8e7659), [`2f6d873`](2f6d873)]: - @langchain/langgraph-checkpoint@1.1.0 ## @langchain/langgraph-checkpoint-mongodb@1.3.4 ### Patch Changes - [#2517](#2517) [`67a4f8d`](67a4f8d) Thanks [@jackjin1997](https://github.com/jackjin1997)! - fix: `MongoDBSaver.putWrites` now honors `WRITES_IDX_MAP`, pinning special channels (`__error__`, `__scheduled__`, `__interrupt__`, `__resume__`) to fixed negative indices instead of the call-local ordinal. Previously a mixed `putWrites([[...regular...], [INTERRUPT, …]], taskId)` placed the INTERRUPT at a positive idx that could collide with a regular write at the same `(task_id, idx)`, and the unconditional `$set` upsert silently overwrote whichever row landed there first. The conflict-resolution clause now matches the Postgres / SQLite (TS and Python) checkpointers: `$set` only when every channel is a special one, `$setOnInsert` otherwise. ## @langchain/langgraph-checkpoint-postgres@1.0.3 ### Patch Changes - [#2512](#2512) [`375c73f`](375c73f) Thanks [@jackjin1997](https://github.com/jackjin1997)! - fix: reject SQL `LIKE` wildcards (`%`, `_`) and the backslash escape character in `PostgresStore` namespace labels. `BaseStore.search()` matches namespaces via `namespace_path LIKE ${prefix}%`, and these characters in caller-supplied namespace labels are interpreted as wildcards by Postgres even through a bound parameter — letting a namespace prefix of `["%"]` match every namespace in the store across tenants. `validateNamespace` now throws for these characters at all `search` / `get` / `put` entrypoints, keeping store-wide consistency. CWE-1336. ## @langchain/langgraph-checkpoint-redis@1.0.8 ### Patch Changes - [#2518](#2518) [`9182ea3`](9182ea3) Thanks [@jackjin1997](https://github.com/jackjin1997)! 
- fix: `RedisSaver.putWrites` now honors `WRITES_IDX_MAP`, pinning special channels (`__error__`, `__scheduled__`, `__interrupt__`, `__resume__`) to fixed negative indices in their Redis key (`checkpoint_write:…:<idx>`) instead of the call-local ordinal. Previously a mixed `putWrites([[…regular…], [INTERRUPT, …]], taskId)` placed the INTERRUPT key at the positive idx of its position in the batch, where a peer task's regular write at the same idx would overwrite it via the unconditional `JSON.SET`. The conflict-resolution clause now matches Postgres / SQLite / MongoDB: unguarded `JSON.SET` when every write is a special channel, `JSON.SET … NX` (insert-or-ignore) otherwise. ## @langchain/langgraph-checkpoint-sqlite@1.0.3 ### Patch Changes - [#2516](#2516) [`f6a6d26`](f6a6d26) Thanks [@jackjin1997](https://github.com/jackjin1997)! - fix: `SqliteSaver.putWrites` now honors `WRITES_IDX_MAP`, pinning special channels (`__error__`, `__scheduled__`, `__interrupt__`, `__resume__`) to fixed negative indices instead of the call-local ordinal. Previously a follow-up `putWrites([[INTERRUPT, …]], taskId)` for the same checkpoint silently `REPLACE`d the regular write previously stored at `idx=0` for that task, losing data. The conflict-resolution clause also now matches the Python checkpointer contract: `OR REPLACE` only when every channel is a special one (so e.g. INTERRUPT→RESUME state transitions overwrite), `OR IGNORE` otherwise. ## @langchain/angular@1.0.21 ### Patch Changes - [#2515](#2515) [`49b8c1a`](49b8c1a) Thanks [@christian-bromann](https://github.com/christian-bromann)! - fix: make AnyStream a true supertype so selector hooks need no cast A concrete `useStream<typeof agent>()` handle was not assignable to `AnyStream` because generic-computed covariant members (`toolCalls`, `values`) don't widen under `any` — `InferToolCalls<any>[]` resolves to `AssembledToolCall<…, never>[]`, narrower than a concrete handle. 
Override those members with their widest forms (preserving each framework's reactivity wrapper — plain arrays for React/Svelte, `ShallowRef` for Vue, `Signal` for Angular) so the message/tool/value selector hooks accept a fully-typed stream without an `as AnyStream` cast. ## @langchain/react@1.0.21 ### Patch Changes - [#2515](#2515) [`49b8c1a`](49b8c1a) Thanks [@christian-bromann](https://github.com/christian-bromann)! - fix: make AnyStream a true supertype so selector hooks need no cast A concrete `useStream<typeof agent>()` handle was not assignable to `AnyStream` because generic-computed covariant members (`toolCalls`, `values`) don't widen under `any` — `InferToolCalls<any>[]` resolves to `AssembledToolCall<…, never>[]`, narrower than a concrete handle. Override those members with their widest forms (preserving each framework's reactivity wrapper — plain arrays for React/Svelte, `ShallowRef` for Vue, `Signal` for Angular) so the message/tool/value selector hooks accept a fully-typed stream without an `as AnyStream` cast. ## @langchain/svelte@1.0.21 ### Patch Changes - [#2515](#2515) [`49b8c1a`](49b8c1a) Thanks [@christian-bromann](https://github.com/christian-bromann)! - fix: make AnyStream a true supertype so selector hooks need no cast A concrete `useStream<typeof agent>()` handle was not assignable to `AnyStream` because generic-computed covariant members (`toolCalls`, `values`) don't widen under `any` — `InferToolCalls<any>[]` resolves to `AssembledToolCall<…, never>[]`, narrower than a concrete handle. Override those members with their widest forms (preserving each framework's reactivity wrapper — plain arrays for React/Svelte, `ShallowRef` for Vue, `Signal` for Angular) so the message/tool/value selector hooks accept a fully-typed stream without an `as AnyStream` cast. ## @langchain/vue@1.0.21 ### Patch Changes - [#2515](#2515) [`49b8c1a`](49b8c1a) Thanks [@christian-bromann](https://github.com/christian-bromann)! 
- fix: make AnyStream a true supertype so selector hooks need no cast A concrete `useStream<typeof agent>()` handle was not assignable to `AnyStream` because generic-computed covariant members (`toolCalls`, `values`) don't widen under `any` — `InferToolCalls<any>[]` resolves to `AssembledToolCall<…, never>[]`, narrower than a concrete handle. Override those members with their widest forms (preserving each framework's reactivity wrapper — plain arrays for React/Svelte, `ShallowRef` for Vue, `Signal` for Angular) so the message/tool/value selector hooks accept a fully-typed stream without an `as AnyStream` cast. ## @example/ai-elements@0.1.36 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`49b8c1a`](49b8c1a), [`9e0201d`](9e0201d), [`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 - @langchain/react@1.0.21 ## @examples/assistant-ui-claude@0.1.36 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`49b8c1a`](49b8c1a), [`9e0201d`](9e0201d), [`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 - @langchain/react@1.0.21 ## @examples/ui-angular@0.0.46 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`49b8c1a`](49b8c1a), [`9e0201d`](9e0201d), [`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 - @langchain/angular@1.0.21 ## @examples/ui-multimodal@0.0.22 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`49b8c1a`](49b8c1a), [`9e0201d`](9e0201d), 
[`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 - @langchain/react@1.0.21 ## @examples/ui-react@0.0.22 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`49b8c1a`](49b8c1a), [`9e0201d`](9e0201d), [`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 - @langchain/react@1.0.21 ## langgraph@1.0.40 ### Patch Changes - Updated dependencies \[[`01c67df`](01c67df), [`d12d269`](d12d269), [`a8e7659`](a8e7659), [`9e0201d`](9e0201d), [`9b96f60`](9b96f60), [`8e06ace`](8e06ace), [`d65a920`](d65a920), [`2f6d873`](2f6d873), [`a8b0036`](a8b0036), [`4096933`](4096933), [`801d955`](801d955)]: - @langchain/langgraph@1.4.0 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR adds task/node-level timeouts.
Why async-only
Python sync code can't be safely cancelled in-process, so timeouts only apply to async nodes/tasks. Configuring a timeout on a sync node raises at compile time (and again as a runtime safety net in `run_with_retry`).

Public API
A single `timeout=` kwarg on `add_node`, `@task`, `@entrypoint`, and `NodeBuilder.set_timeout`. Pass a number/`timedelta` for the simple case (treated as a hard wall-clock cap), or a `TimeoutPolicy` (in `langgraph.types`) for finer control:

- `run_timeout`: hard wall-clock cap on a single attempt.
- `idle_timeout`: progress-resetting cap. Refreshed by writes, stream events, child-task scheduling, runtime stream-writer calls, and any LangChain callback event from descendants of this node's run; `runtime.heartbeat()` is a manual signal for work that doesn't naturally emit any of these.
- `refresh_on="heartbeat"` narrows the refresh source to explicit `runtime.heartbeat()` only, which is useful when you want a strict idle definition that isn't reset by chatty subordinates.

Either timeout firing raises `NodeTimeoutError(kind="run"|"idle")` (subclass of `TimeoutError`), which carries `node`, `timeout`, `run_timeout`, `idle_timeout`, `elapsed`, and `kind`. If the node's `retry_policy` permits `TimeoutError` it'll retry; the timer resets per attempt.

What gets cancelled and what doesn't
When a watchdog fires:

- `_IdleTimedAttemptScope` is closed under a lock so any in-flight `CONFIG_KEY_SEND` / stream / child-task scheduling that races with the timeout is dropped atomically.
- `task.writes` are cleared (so pre-timeout writes from the failed attempt don't leak into the checkpoint).
- `asyncio.Task` is cancelled; its eventual exception is drained via a done-callback so asyncio doesn't log it.

Child tasks that were already scheduled before the timeout fired still complete; they aren't part of the cancelled task's structured cancellation surface. This is intentional and tested.
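The sequence above (close the scope under a lock, clear buffered writes, cancel the task, drain its outcome) can be illustrated with plain `asyncio`. This is a minimal sketch, not the code in this PR: `AttemptScope`, `NodeTimeout`, `run_with_timeouts`, and the polling loop are hypothetical names and choices made for illustration, and child-task scheduling is not modelled.

```python
import asyncio
import threading
import time


class NodeTimeout(TimeoutError):
    """Hypothetical stand-in for NodeTimeoutError (kind is "run" or "idle")."""

    def __init__(self, kind: str, elapsed: float) -> None:
        super().__init__(f"{kind} timeout after {elapsed:.3f}s")
        self.kind = kind
        self.elapsed = elapsed


class AttemptScope:
    """Hypothetical per-attempt scope: buffers writes and tracks progress."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._closed = False
        self.writes = []
        self.last_progress = time.monotonic()

    def write(self, value) -> bool:
        """Buffer a write and reset the idle clock; refuse once closed."""
        with self._lock:
            if self._closed:
                return False  # raced with the timeout, so the write is dropped
            self.writes.append(value)
            self.last_progress = time.monotonic()
            return True

    def close_and_clear(self) -> None:
        """Close the scope and drop buffered writes in one locked step."""
        with self._lock:
            self._closed = True
            self.writes.clear()


def _drain(task: asyncio.Task) -> None:
    # Retrieve the outcome so asyncio does not log an unretrieved exception.
    if not task.cancelled():
        task.exception()


async def run_with_timeouts(node, run_timeout: float, idle_timeout: float):
    """Run node(scope) under a polling watchdog; return its buffered writes."""
    scope = AttemptScope()
    started = time.monotonic()
    task = asyncio.create_task(node(scope))
    poll = min(run_timeout, idle_timeout) / 4
    while True:
        done, _ = await asyncio.wait({task}, timeout=poll)
        if done:
            task.result()  # re-raises the node's own exception, if any
            return list(scope.writes)
        now = time.monotonic()
        if now - started >= run_timeout:
            kind = "run"  # hard cap, never refreshed by progress
        elif now - scope.last_progress >= idle_timeout:
            kind = "idle"  # no progress within the idle window
        else:
            continue
        scope.close_and_clear()
        task.cancel()
        task.add_done_callback(_drain)
        raise NodeTimeout(kind, now - started)


async def chatty(scope):
    # Keeps making progress, so the idle clock never expires.
    for i in range(3):
        await asyncio.sleep(0.01)
        scope.write(i)


async def stalled(scope):
    scope.write("partial")
    await asyncio.sleep(10)  # no further progress


if __name__ == "__main__":
    print(asyncio.run(run_with_timeouts(chatty, run_timeout=5.0, idle_timeout=0.5)))
    try:
        asyncio.run(run_with_timeouts(stalled, run_timeout=5.0, idle_timeout=0.1))
    except NodeTimeout as exc:
        print(exc.kind, "timeout; buffered writes were dropped")
```

Because the sketch polls, a timeout can fire up to one poll interval after the nominal deadline. Closing the scope before cancelling is what keeps a write that races with the timeout out of the result.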
External-watchdog hook
`CONFIG_KEY_TIMED_ATTEMPT_OBSERVER` is a per-config callback that receives lifecycle events for each timed attempt:

- `start`: fired before the proc runs, with `task_id`, `task_name`, `attempt`, `run_id`, `thread_id`, `checkpoint_ns`, `started_at`, and the configured `run_timeout_secs` / `idle_timeout_secs` / `refresh_on`.
- `progress`: fired on `scope.touch()` (i.e. each progress signal that resets the idle clock), rate-limited to ~4 events per `idle_timeout` window so token-rate callbacks don't flood the observer. Carries the same context plus `progress_at`.
- `finish`: fired with `finished_at`, `status` (`"success"` / `"error"`), `error_type`, `error_message`. `ParentCommand` and `GraphBubbleUp` are treated as control flow, not errors.

This lets an orchestrating process listen to `start` + rolling `progress` to compute its own kill deadline (`progress_at + idle_timeout_secs`), so it can hard-kill a worker process if the in-process cancellation deadlocks. Observer callbacks run in whatever thread fires them and any exception they raise is logged and swallowed.
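The kill-deadline computation described above could look roughly like the following on the orchestrator side. This is a sketch under stated assumptions: the event payload is modelled as a dict with a hypothetical `"type"` key and float timestamps in seconds (the real payload shape is not shown here), and the `ExternalWatchdog` class, its `grace_secs` margin, and the extra `started_at + run_timeout_secs` bound are illustrative additions rather than part of the PR.

```python
from typing import Dict, List, Optional


class ExternalWatchdog:
    """Tracks a kill deadline per task_id from observer events (assumed dicts)."""

    def __init__(self, grace_secs: float = 0.0) -> None:
        # Extra slack so the in-process timeout gets the first chance to fire.
        self.grace_secs = grace_secs
        self._attempts: Dict[str, dict] = {}

    def observe(self, event: dict) -> None:
        kind = event["type"]  # assumed key; the real payload may differ
        task_id = event["task_id"]
        if kind == "start":
            # A new attempt (including a retry) resets the tracked state.
            self._attempts[task_id] = {
                "started_at": event["started_at"],
                "last_progress": event["started_at"],
                "run": event.get("run_timeout_secs"),
                "idle": event.get("idle_timeout_secs"),
            }
        elif kind == "progress" and task_id in self._attempts:
            self._attempts[task_id]["last_progress"] = event["progress_at"]
        elif kind == "finish":
            self._attempts.pop(task_id, None)

    def deadline(self, task_id: str) -> Optional[float]:
        """Earliest applicable deadline plus grace, or None if untracked."""
        attempt = self._attempts.get(task_id)
        if attempt is None:
            return None
        candidates = []
        if attempt["idle"] is not None:
            candidates.append(attempt["last_progress"] + attempt["idle"])
        if attempt["run"] is not None:
            candidates.append(attempt["started_at"] + attempt["run"])
        if not candidates:
            return None
        return min(candidates) + self.grace_secs

    def overdue(self, now: float) -> List[str]:
        """Task ids whose deadline has passed and that should be hard-killed."""
        result = []
        for task_id in self._attempts:
            limit = self.deadline(task_id)
            if limit is not None and now > limit:
                result.append(task_id)
        return result


if __name__ == "__main__":
    wd = ExternalWatchdog(grace_secs=1.0)
    wd.observe({"type": "start", "task_id": "t1", "started_at": 100.0,
                "run_timeout_secs": 60.0, "idle_timeout_secs": 10.0})
    wd.observe({"type": "progress", "task_id": "t1", "progress_at": 105.0})
    print(wd.deadline("t1"), wd.overdue(now=120.0))
```

Since `progress` events are rate-limited, the last reported `progress_at` may lag the real idle clock, so a grace margin of at least a quarter of the idle window is probably a sensible starting point before hard-killing a worker.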