Add cycle monitoring #3
Closed
Nuno Campos (nfcampos) wants to merge 2 commits into main from
Force-pushed from eeee025 to 1e50542
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
Splits Phase 1 §6 Step 1.2 into 1.2a (Pregel core, differential parity on synthetic fixtures) and 1.2b (LANGGRAPH_BACKEND=rust + async PyO3 bridge). 1.2a closes Phase 1 follow-up langchain-ai#2 (erased dyn ChannelKind); 1.2b takes over the new follow-up langchain-ai#3 (async bridge + LANGGRAPH_BACKEND).

What lands:
- langgraph-core::pregel — algorithmic core only (no streaming, no interrupts, no subgraphs, no Send, no managed values, no callbacks). Modules:
  * channel_kind: erased dyn ChannelKind trait + impls for all 9 Step 1.1 channels, going through serde_json at the boundary.
  * apply_writes: parity port of Python _algo.apply_writes (versions_seen, consume, group writes, bump-step idle pass, finish-pass on the tentatively-last superstep).
  * prepare_tasks: PULL path of prepare_next_tasks. Task ids match Python's xxh3-128 byte-for-byte (xxhash-rust, UUID-formatted).
  * loop_: PregelLoop::new + with_versioning + with_parent_ns + with_stop builders, tick(), run(), TraceEntry/TraceTask shape for the parity comparison.
  * checkpoint_helpers: empty_checkpoint() + checkpoint_id_to_bytes() with a proper InvalidCheckpointId error variant.
  All submodules are private; the pub use surface is listed in mod.rs.
- langgraph-py bridge: run_pregel_fixture(name, init_json, max_steps=None) entry point with 5 hand-rolled fixtures (linear_chain, conditional_fork, fan_out, recursion, multi_channel_reducer_mix).
- parity/scripts/test_pregel_fixtures_via_bridge.py: 24 parametrised tests vs Python StateGraph, user-channel parity.
- parity/scripts/test_pregel_differential.py: 5 hypothesis tests, ~1300 random iterations, 0 divergences.

Step + recursion-limit are i64 (matches Python's signed int wire type; 1.2b checkpoint interop needs the negative one-step "before any input" state). PregelLoop fields are pub(crate); external mutation goes through put_input/tick/run + the with_* builders.

The pregel layer's runtime channel value type is named ChannelValue (distinct from the codec's wire-format Decoded — Phase 0 follow-up langchain-ai#2 plans to migrate Checkpoint::channel_values to codec Decoded later).

Verification: 161 cargo tests, clippy clean, 24+5 fixture/differential parity tests, Phase 0 round-trip + conformance still all-green.
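The superstep shape described above (prepare tasks from unseen channel versions, run them, apply their writes, stop on an idle pass) can be sketched in miniature. This is a hypothetical Python illustration of the BSP loop's mechanics, not the langgraph-core API; the function names and the LastValue-only write semantics are simplifications.

```python
# Hypothetical sketch of the prepare/run/apply superstep cycle -- not the
# real langgraph-core::pregel surface. Channels are a plain dict; every
# write behaves like LastValue and bumps the channel's version.

def prepare_next_tasks(channels, versions, versions_seen, nodes):
    """Select nodes with an unseen update on a subscribed channel (PULL path)."""
    tasks = []
    for name, node in nodes.items():
        seen = versions_seen.setdefault(name, {})
        triggers = [c for c in node["subscribes"]
                    if versions.get(c, 0) > seen.get(c, 0)]
        if triggers:
            tasks.append((name, triggers))
    return tasks

def apply_writes(channels, versions, versions_seen, task_results):
    """Record consumed versions, then apply each task's writes."""
    for name, triggers, writes in task_results:
        for c in triggers:
            versions_seen[name][c] = versions[c]   # node has now "seen" c
        for channel, value in writes:
            channels[channel] = value              # LastValue-style overwrite
            versions[channel] = versions.get(channel, 0) + 1

def run(channels, nodes, max_steps):
    """Tick until an idle superstep (no tasks) or the step budget runs out."""
    versions = {c: 1 for c in channels}            # seeded input is version 1
    versions_seen = {}
    for step in range(max_steps):
        tasks = prepare_next_tasks(channels, versions, versions_seen, nodes)
        if not tasks:                              # idle pass: loop is done
            return step
        results = [(n, t, nodes[n]["run"](channels)) for n, t in tasks]
        apply_writes(channels, versions, versions_seen, results)
    return max_steps                               # budget exhausted (cycle)
```

A node that writes to its own trigger channel keeps producing unseen versions and runs until `max_steps`, which is exactly why the real loop carries a recursion limit.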
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
Closes Phase 0 follow-up #1's tracked-by check ("at least one #[pyfunction(async)] exposed in the bridge, plus a Python async test that calls it"). First piece of the combined Step 1.2b + 1.3 milestone — see the plan amendment in this commit and the locked architectural decisions in rust/docs/phase-1-followups.md entry langchain-ai#3.

What lands:
- rust/Cargo.toml: workspace deps for pyo3 0.28, pyo3-async-runtimes 0.28, tokio 1 (rt-multi-thread, macros, sync, time). The bridge Cargo.toml switches to workspace-versioned pyo3.
- rust/ffi/langgraph-py/src/async_runtime.rs: new module with `async_echo(s) -> awaitable[str]`. Sleeps 1ms on tokio then returns s.upper() — a wiring proof, not a feature. It will be replaced by real entry points (run_pregel_async, …) as Step 1.2b's surface grows; it is kept around because its asyncio test is the cheapest smoke check that the wiring still works after future bridge changes.
- parity/scripts/test_async_bridge.py: 3 tests (await, gather, attribute presence). Uses anyio (matching upstream libs/langgraph/tests/test_pregel.py — required for the eventual 87-test parity gate to run on the same async framework).
- .omc/plans/langgraph-rust-port-2026-04-30.md: amends §6 to mark Step 1.2b + 1.3 as one combined milestone with the locked architectural decisions (research-backed via the two research/*.md files in this commit).
- rust/docs/phase-1-followups.md: entry #1 (DeltaChannel) and entry langchain-ai#3 (async bridge + LANGGRAPH_BACKEND wiring) re-pointed to the combined milestone, with an explicit decision log.
- research/pyo3-async-pregel-2026-05-05.md + python-rust-backend-swap-2026-05-05.md: pplx research outputs that ground the locked decisions (pyo3-async-runtimes 0.28 is the maintained answer; the Pattern B monkeypatch from a separate package is canonical for accelerating frozen-baseline libs).

GIL discipline (the production pattern from polars / pydantic-core / datafusion-python): release between Python callbacks via short Python::with_gil scopes. The Rust future itself runs without the GIL, wired through pyo3-async-runtimes::tokio::future_into_py.

Parity gate: 3 anyio tests via the bridge venv's Python directly (SESSION-RESUME's bridge install gotcha applies — uv run python silently reverts the .so).

What the gate caught: nothing — this is foundation. The first test_async_echo_awaits_via_tokio failure on commit day was "pytest-asyncio not installed"; fixed by switching to anyio (matching upstream langgraph), which uncovered nothing about the Rust side.

Verification:
- 161/161 cargo tests pass (unchanged from the 1.2a baseline)
- cargo clippy --workspace --all-targets -- -D warnings clean
- 3/3 new asyncio tests pass
- 73/73 Phase 0 corpus + 49 allowlist + strict-reject pass
- 58/58 conformance via Rust shadow pass
- 19/19 Step 1.1 channels parity (10 fixtures + 9 differential)
- 29/29 Step 1.2a Pregel parity (24 fixtures + 5 hypothesis)
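The test shape the commit describes (await a bridge coroutine, then gather several) can be shown without the compiled `.so`. Below, a plain Python coroutine stands in for the bridge's `async_echo` — the real one is a `#[pyfunction]` whose awaitable comes from pyo3-async-runtimes' `future_into_py`; only the calling pattern is the point here.

```python
# Stand-in for the bridge's async_echo wiring proof. A pure-Python coroutine
# plays the role of the PyO3-produced awaitable so the await + gather test
# shape is visible without building the bridge wheel.
import asyncio

async def async_echo(s: str) -> str:
    await asyncio.sleep(0.001)   # mirrors the 1ms tokio sleep described above
    return s.upper()

async def exercise():
    one = await async_echo("hi")                                  # plain await
    many = await asyncio.gather(async_echo("a"), async_echo("b"))  # concurrency
    return one, many
```

The real parity test runs the same two patterns under anyio so it shares an async framework with upstream `test_pregel.py`.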
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
Plumbs `Result<Vec<Write>, NodeError>` through `NodeCallable`,
`PregelLoop::tick()`, and `PregelLoop::run()`. Adds
`PregelError::NodeFailed { node, message }` so Pregel can surface
node-execution failures with the failing node's name attached.
This is the foundation for Task #4b (Python-callable Node wrapper):
when a Python node raises, the bridge will catch the `PyErr`, stash
it in a side-channel registry keyed by node name, and return
`Err(NodeError)`; the Pregel loop propagates as `NodeFailed`; the
bridge driver re-raises the original Python exception. Side-channel
storage lives in the bridge crate so `langgraph-core` stays
PyO3-free.
What lands:
- langgraph-core::pregel::errors:
* `NodeError { message: String }` with std::error::Error +
Display + From<&str>/From<String>. Public.
* `PregelError::NodeFailed { node, message }` new variant.
* Doc-comments updated to reflect the 1.2b ownership.
- langgraph-core::pregel::node: `NodeCallable` type alias bumped
to `Fn(&NodeInput) -> Result<Vec<Write>, NodeError>`. Doc
explains the bridge-side PyErr capture path.
- langgraph-core::pregel::loop_::tick(): node call site now does
`callable(&input).map_err(|e| PregelError::NodeFailed {
node: name, message: e.message })?`. Hot path unchanged for
happy returns.
- All 12 existing closure use-sites wrap their happy-path returns
in `Ok(...)` — 4 in `loop_.rs` tests, 1 in `prepare_tasks.rs`
(`noop_callable`), 5 in `ffi/langgraph-py/src/pregel.rs` fixture
builders, 2 in shared `single_i64_node` / `router_node` helpers.
`panic!` arms still type-check via `!` coercion.
- `pregel/mod.rs` re-exports `NodeError`.
- phase-1-followups.md: documents 2026-05-05 scope decision that
Task langchain-ai#3 (channel translation harness) is subsumed by the
existing differential gate + DeltaChannel parity test (the
"translate to Rust → translate back → equal at every step"
invariant is what differential per-step trace equality already
proves). The Python-`BaseChannel` → Rust-`dyn ChannelKind` glue
moves into the async runner where it has a runtime user.
New test: `node_callable_error_surfaces_as_node_failed` proves a
node returning `Err(NodeError::new("simulated python KeyError ..."))`
surfaces as `PregelError::NodeFailed { node: "boom", message: ... }`
through `PregelLoop::run()`.
What the gate caught: nothing — pure plumbing refactor, all 12
existing closures preserved their semantics. No behavior change for
happy paths; cargo + clippy clean across the workspace.
Verification:
- 181/181 cargo tests pass (180 prior + 1 new error-path test)
- cargo clippy --workspace --all-targets -- -D warnings clean
- 29/29 Step 1.2a Pregel parity (24 fixtures + 5 hypothesis) — proves
refactor preserves behavior on the 5 fixture graphs
- 10/10 Step 1.2b DeltaChannel parity (unchanged)
- 3/3 Step 1.2b async-bridge wiring (unchanged)
- All Phase 0 + Step 1.1 gates remain green (not re-run inline,
last verified at the prior 1.2b foundation commit 5325311)
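The side-channel re-raise path described above can be sketched from the Python side. The registry and wrapper names below (`_PENDING_ERRORS`, `NodeFailed`, `wrap_node`) are illustrative stand-ins, not the bridge's actual API; they show the catch → stash by node name → propagate → re-raise-original flow.

```python
# Hedged sketch of the PyErr side-channel pattern: the bridge-side wrapper
# catches the Python exception, stashes it keyed by node name, and signals a
# plain node failure; the driver later re-raises the original exception.

_PENDING_ERRORS: dict = {}   # node name -> original Python exception

class NodeFailed(Exception):
    """Stand-in for PregelError::NodeFailed { node, message }."""
    def __init__(self, node: str, message: str):
        super().__init__(f"node {node!r} failed: {message}")
        self.node = node

def wrap_node(name, fn):
    """What a bridge-side NodeCallable wrapper would do around a Python node."""
    def callable_(state):
        try:
            return fn(state)
        except BaseException as exc:
            _PENDING_ERRORS[name] = exc            # stash the real exception
            raise NodeFailed(name, str(exc)) from None
    return callable_

def reraise_original(err: NodeFailed):
    """Driver side: surface the stashed Python exception, not the wrapper."""
    original = _PENDING_ERRORS.pop(err.node, None)
    if original is not None:
        raise original
    raise err
```

Keeping the registry on the Python/bridge side matches the constraint above: `langgraph-core` only ever sees a `NodeError` message and stays PyO3-free.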
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
…k bridge) Round-trips Python `BaseChannel.checkpoint()` state through Rust `from_checkpoint` for every stdlib channel class. Closes the channel-translation deliverable for the combined Step 1.2b + 1.3 milestone (`rust/docs/STEP-1.2B-PARTIAL-HANDOFF.md`). The runner monkeypatch in sub-step langchain-ai#6 will use `extract_state` / `apply_state` to wire Python channel instances through the Rust loop tick.

What landed
-----------
- New `rust/ffi/langgraph-py/src/channel_translate.rs` — the Rust translation gate. Per-class `round_trip_*` functions parse the msgpack-encoded state, build a Rust channel via `from_checkpoint`, and re-encode the result. 10 stdlib classes covered (LastValue, LastValueAfterFinish, Topic, BinaryOperatorAggregate, EphemeralValue, AnyValue, UntrackedValue, NamedBarrierValue, NamedBarrierValueAfterFinish, DeltaChannel). Custom user-defined channels return `ValueError` (Python: `RustBackendUnsupported`).
- New PyO3 entry points `translate_channel_round_trip` (msgpack bytes in, msgpack bytes out) and `supported_channel_classes`.
- New Python helper `parity/scripts/_channel_translate.py`: `class_name`, `extract_state`, `apply_state`, `pack_state`, `unpack_state`, `RustBackendUnsupported`. Per-class dispatch via `_EXTRACTORS` / `_BUILDERS` / `_APPLIERS` dicts; symmetry checked at import time. Caller-misuse cases (missing operator/reducer/names) raise `ValueError`, distinct from `RustBackendUnsupported`.
- New parity test `parity/scripts/test_channel_translate.py` — 35 tests covering the bridge contract, custom-class rejection, caller-misuse errors, and per-class round-trip semantics for all 10 channels.
- Per-class encoding table documented inline (Rust module docstring) and traced back to the handoff doc.
- Workspace dep: `rmp-serde = "1"` for serde-style msgpack on the Rust side. Python uses `ormsgpack` (already in the bridge venv).

Wire format
-----------
msgpack bytes — matches the locked architectural decision in `phase-1-followups.md` entry langchain-ai#3 §6. Python packs with `ormsgpack`; Rust decodes via `rmp_serde` into `serde_json::Value`. This is the same encoding family the rest of the project uses for checkpoint blobs; future expansion to ext-coded values (LangChain messages, etc.) layers on without changing the bridge surface.

Parity gate
-----------
For each channel class, drive the round-trip

    Python.checkpoint() -> msgpack -> Rust.from_checkpoint()
    -> Rust.checkpoint() -> msgpack -> Python.from_checkpoint() -> .get()

and assert the seeded Python channel observes the same state as the original. For `DeltaChannel` snapshot blobs, the Rust side collapses to a sentinel (matching Python's invariant); we verify the replay target instead.

What the gate caught
--------------------
- Python's `DeltaChannel.from_checkpoint(MISSING)` is asymmetric: it sets `value = typ()` rather than leaving the channel MISSING. Test `test_missing_round_trips_missing` documents the asymmetry with a `fresh.get() == {}` assertion.
- Subclasses of stdlib channels need a distinct error path from fully-custom channels: the runner ought to know "we know the parent shape but you customised it" vs "we have no idea what this is". `class_name` walks the supported-class MRO to give a precise message.
- Caller misuse (missing `operator` / `reducer` / `names` in `init_args`) is `ValueError`, not `RustBackendUnsupported`. Two failure modes wearing one exception turns 5-minute debugs into 30-minute ones.

Test counts
-----------
- Cargo workspace: 211 passed (was 210; +1 invalid_msgpack test). Clippy clean.
- Phase 0: 73/73 round-trip, 49 allowlist, strict reject; 58 conformance.
- Phase 1 + 1.2b foundation + #4c: 104 passed (was 69; +35 channel-translate tests).
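The per-class dispatch shape of the Python helper can be sketched in a few lines. This is an illustrative stand-in, not the real `_channel_translate.py`: it uses `json` in place of msgpack so it runs without `ormsgpack`, and the two toy channel classes and their state shapes are invented for the example.

```python
# Illustrative round-trip harness in the shape of the described
# _EXTRACTORS / _BUILDERS dispatch, with json standing in for msgpack.
import json

class RustBackendUnsupported(Exception):
    """Raised for channel classes outside the supported set."""

# per-class extract/apply pairs; symmetry is checked at import time
_EXTRACTORS = {
    "LastValue": lambda ch: {"value": ch.value},
    "Topic":     lambda ch: {"values": list(ch.values)},
}
_BUILDERS = {
    "LastValue": lambda state, ch: setattr(ch, "value", state["value"]),
    "Topic":     lambda state, ch: setattr(ch, "values", list(state["values"])),
}
assert _EXTRACTORS.keys() == _BUILDERS.keys()   # import-time symmetry check

def pack_state(ch) -> bytes:
    name = type(ch).__name__
    if name not in _EXTRACTORS:
        raise RustBackendUnsupported(name)      # unknown class: reject loudly
    return json.dumps(_EXTRACTORS[name](ch)).encode()

def apply_state(ch, payload: bytes) -> None:
    name = type(ch).__name__
    if name not in _BUILDERS:
        raise RustBackendUnsupported(name)
    _BUILDERS[name](json.loads(payload), ch)
```

The key property mirrored here is that an unsupported class fails with a dedicated exception rather than a generic error, keeping "we don't support this" distinct from caller misuse.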
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
…doff for langchain-ai#6/langchain-ai#7 Milestone update for the combined Step 1.2b + 1.3 final stretch.

Locked architectural decision (2026-05-06)
------------------------------------------
The original plan §6 and `phase-1-followups.md` entry langchain-ai#3 §5 left the door open to "a tighter cut decided in implementation" for the `langgraph_rs.backend` monkeypatch — i.e., replacing only `tick()` (approach B) or only `_algo.apply_writes` + `prepare_next_tasks` (approach C) instead of the full `SyncPregelLoop` (approach A). The user has explicitly chosen approach A: full `SyncPregelLoop` replacement. Reasoning captured in the new handoff doc:
* A is the only approach where `LANGGRAPH_BACKEND=rust` actually means "Rust drives the loop" — B and C still leave Python orchestrating most per-tick work.
* B's per-tick re-sync of channel state is wasteful and adds an extra parity surface — correctness risk we don't need.
* C is essentially a third copy of `test_pregel_differential.py`'s coverage — it buys us nothing new.
* "Done right the first time" — the full replacement is bigger but architecturally honest; a tighter cut is technical debt that would need to be redone before Step 1.4 streaming or Phase 2.

The "re-build from checkpoint each tick" guidance from the original phase-1-followups langchain-ai#3 §6 is also superseded: under approach A, Rust state is constructed once at `__enter__` (Python → Rust via `_channel_translate.extract_state`) and applied once at `__exit__` (Rust → Python via `apply_state`). No per-tick re-sync.

What this commit changes
------------------------
- New `rust/docs/STEP-1.2B-FINAL-HANDOFF.md`: the comprehensive handoff brief for the next session picking up langchain-ai#6 and langchain-ai#7. Covers:
  * Where we are (status table through `c03c7ac6`).
  * The locked architectural decision (approach A).
  * langchain-ai#6 sub-step breakdown (#6a Maturin layout switch → #6b backend.py monkeypatch → #6c Pregel runtime bridge entry point).
  * langchain-ai#7 iteration loop (87-test gate).
  * `__init__.py` re-export shim contents (drop-in for the layout switch).
  * Replaced-symbols list pattern for `backend.py`.
  * `RustBackendUnsupported` rejection sites for the 4 deliberately out-of-scope feature families (custom channels, subgraphs, Send, interrupts, stream modes outside values/updates).
  * Verification block, hard rules, the bridge install gotcha, and lessons-learned forwarding from prior handoffs.
- `rust/docs/STEP-1.2B-PARTIAL-HANDOFF.md`: prepended a SUPERSEDED notice pointing at the new final handoff for langchain-ai#6/langchain-ai#7. The partial-handoff content is preserved as historical context for what shipped in #4c and langchain-ai#5.
- `rust/docs/phase-1-followups.md` entry langchain-ai#3 §5 + §6: amended to record the approach A decision and the supersession of the per-tick-resync line.
- `.omc/plans/langgraph-rust-port-2026-04-30.md` §6 Step 1.2b+1.3 Locked decisions §4 + §5: same amendments, with a pointer to the final handoff doc.

What's not changing
-------------------
The 5 hard architectural decisions in §6 ("combined milestone", "async runtime: pyo3-async-runtimes", "GIL discipline", "errors via PregelExecutionError::NodeFailed", "channel translation by class name") remain locked. Approach A is the runtime-shape decision that sits *above* those.

Test counts
-----------
Unchanged — pure docs commit. Latest baseline (HEAD = `c03c7ac6`):
* Cargo: 220 passed, clippy clean.
* Phase 0: 73/73 + 49 + strict reject; 58 conformance.
* Phase 1 + 1.2b foundation + #4c + langchain-ai#5: 129 passed.
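Approach A's sync discipline (state crosses the boundary exactly twice: once in at `__enter__`, once out at `__exit__`) can be shown with a toy context manager. The `RustLoop` class below is a hypothetical stand-in for the bridge-held state; the real loop lives behind PyO3 and the real translation goes through `extract_state` / `apply_state`.

```python
# Minimal sketch of the once-in / once-out sync shape: no per-tick re-sync.

class RustLoop:
    """Stand-in for the Rust-side loop state held across the whole run."""
    def __init__(self, state):
        self.state = dict(state)
    def tick(self):
        # all per-superstep work stays on this side of the boundary
        self.state["step"] = self.state.get("step", 0) + 1
    def dump(self):
        return dict(self.state)

class RustBackedRun:
    def __init__(self, channels):
        self.channels = channels

    def __enter__(self):
        # extract_state: Python channels -> Rust, exactly once
        self._rust = RustLoop(self.channels)
        return self

    def tick(self):
        self._rust.tick()          # no Python-side channel traffic per tick

    def __exit__(self, *exc):
        # apply_state: Rust -> Python channels, exactly once
        self.channels.update(self._rust.dump())
        return False
```

Compared with approach B, the Python dict is untouched between `__enter__` and `__exit__`, which is the "Rust state lives across the whole graph execution" property the decision log calls out.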
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
…hon scaffolding) Land the Python-side activation surface for the Rust backend. Sub-task #6b of the combined Step 1.2b + 1.3 milestone (`rust/docs/STEP-1.2B-FINAL-HANDOFF.md`); the Rust runtime bridge entry point follows in #6c, and the full 87-test gate is langchain-ai#7.

Approach choice (locked in this commit body)
--------------------------------------------
The handoff's "approach A — full SyncPregelLoop replacement" left an implementation question: stand up a parallel duck-typed shadow class, or subclass `SyncPregelLoop` and override only the algorithmic core? We chose **subclass + override** after surfacing the trade-off:
* The subclass inherits `BackgroundExecutor` / `ExitStack` / the lifecycle event queue / status machinery / checkpoint persistence / `accept_push` / `output_writes` / `match_cached_writes` from upstream. None of those is a parity surface we want to grow in Python, and reimplementing 30+ methods just to hit `Pregel.invoke`'s read sites is the wrong cost shape for V0.1.
* Approach A's distinguishing property over B — "Rust state lives across the whole graph execution; no per-tick *bidirectional* re-sync" — is preserved by syncing once at `__enter__` (Python channels → Rust) and once at `__exit__` (Rust → Python channels). A subclass that overrides the algorithmic core (`__enter__` validation + Rust seed, `tick`/`after_tick` per-superstep BSP work, `__exit__` flush) gets approach A's behaviour with significantly less Python-side parity surface than a duck-typed shadow.

What landed
-----------
- `rust/ffi/langgraph-py/python/langgraph_rs/backend.py` — the activation module:
  * `REPLACED_SYMBOLS` tuple (top-of-file, auditable diff against upstream): both `langgraph.pregel._loop.{Sync,Async}PregelLoop` AND `langgraph.pregel.main.{Sync,Async}PregelLoop`. The second pair is load-bearing — `pregel/main.py` imports the loop classes directly (`from langgraph.pregel._loop import SyncPregelLoop, AsyncPregelLoop`), so the runtime call sites at `main.py:2847` (sync) and `main.py:3299` (async) read the local module attribute. A single-namespace patch leaves `Pregel.stream` / `astream` instantiating the upstream class. Caught by smoke-test langchain-ai#3 below.
  * `_RustSyncPregelLoop(SyncPregelLoop)` — subclass with overrides:
    - `__enter__` calls `super().__enter__()`, then iterates `self.channels`, validating each via `_channel_translate.class_name` (raises `RustBackendUnsupported` for custom or subclassed channels); then rejects unsupported `interrupt_before` / `interrupt_after` / non-`{values,updates}` stream modes.
    - `tick()` raises `NotImplementedError` pointing at sub-step #6c until the Rust runtime bridge entry point lands. Auditably non-functional rather than a silent fallback.
  * `_RustAsyncPregelLoop(AsyncPregelLoop)` — subclass that allows construction (so `Pregel.astream`'s `async with` setup doesn't crash before the rejection) but raises `RustBackendUnsupported` from `__aenter__`. Async parity is a deferred follow-up after the 87-test sync gate is green.
  * `_install_monkeypatches()` is idempotent and gated by `LANGGRAPH_BACKEND=rust` at import time. `is_active()` exposes the install state to tests.
- `rust/ffi/langgraph-py/python/langgraph_rs/_channel_translate.py` — moved (via `git mv`) from `parity/scripts/_channel_translate.py`. Production backend code shouldn't depend on parity-test infrastructure; the helper now lives in the package and the parity test imports it from there. (Functional contents unchanged.)
- `parity/scripts/test_channel_translate.py` — import switched from the `sys.path.insert(...) + from _channel_translate import ...` shim to `from langgraph_rs._channel_translate import ...`. 35 tests pass unchanged.
- `parity/scripts/test_backend_activation.py` — new, 10 smoke tests pinning #6b's surface (replacement list, both-namespace patch, subclass MRO, idempotency, every rejection site, the `#6c` stub pointer, async-`__aenter__` rejection).
- `conftest.py` (project root) — top-level pytest hook that imports `langgraph_rs.backend` when `LANGGRAPH_BACKEND=rust` is set. Lives at the repo root so it covers both `parity/scripts/` and `libs/langgraph/tests/` (the 87-test gate's home for langchain-ai#7); does nothing without the env var.

Out of scope / explicitly rejected (`RustBackendUnsupported`)
-------------------------------------------------------------
- Custom user-defined channel classes (anything not in the 10-class stdlib set surfaced by `langgraph_rs._channel_translate`).
- `interrupt_before` / `interrupt_after` (V0.1 deliberately excludes interrupts; the 87-test filter excludes them via `-k "not interrupt"`, but a filter slip surfaces here).
- Stream modes outside `values` and `updates`.
- Async (`AsyncPregelLoop`) — symbol replaced symmetrically but rejects at `__aenter__`.

Out of scope / deferred to #6c
------------------------------
- Subgraphs / `Send` / nested-`Pregel` rejection. Those need per-node introspection at `__enter__` time; deferred to #6c alongside the actual Rust call so we don't grow validation that isn't yet exercised.
- The actual Rust call. `tick()` raises `NotImplementedError` pointing at #6c. Activating `LANGGRAPH_BACKEND=rust` and invoking any graph that passes the rejection sites will fail predictably.

Parity gate
-----------
- Without `LANGGRAPH_BACKEND=rust`: every existing parity gate is unchanged in count and result. Importing the module is a no-op; the upstream `SyncPregelLoop`/`AsyncPregelLoop` symbols are untouched.
- With `LANGGRAPH_BACKEND=rust`: 10 new smoke tests in `test_backend_activation.py` pin both the replacement surface and the rejection paths. No graph actually runs end-to-end — that's #6c.

What the gate caught
--------------------
1. A single-namespace monkeypatch is insufficient. The first wiring patched only `langgraph.pregel._loop.{Sync,Async}PregelLoop`; activating the env var and calling `graph.invoke(...)` did NOT raise the #6c `NotImplementedError`, because `pregel/main.py` had imported the class directly into its module namespace at langgraph load time, and the `with SyncPregelLoop(...)` call site read the local reference. Fixed by extending `_install_monkeypatches` to patch `langgraph.pregel.main` as well, and pinning that in `REPLACED_SYMBOLS`.
2. `_channel_translate.py` was reachable from the parity tests via a `sys.path.insert(...)` shim, but the backend module needs it as a real package import. Moved into `langgraph_rs/` so production code doesn't depend on the `parity/scripts/` layout.
3. The async test originally tried to introspect docstrings on the override; that was over-engineered and brittle. Replaced with a direct `asyncio.run(instance.__aenter__())` pytest.raises check.

Test counts
-----------
- Cargo workspace: 220 passed; clippy clean. No Rust changes.
- Phase 0: 73/73 corpus + 49 allowlist + strict reject; 58 conformance pass / 0 fail.
- Phase 1 + 1.2b foundation + #4c + langchain-ai#5 + #6a: 129 passed (unchanged; no LANGGRAPH_BACKEND set).
- Phase 1 + 1.2b foundation + #4c + langchain-ai#5 + #6a + #6b: 139 passed (+10 backend activation tests).
Alaina Hardie (trianglegrrl) added a commit to trianglegrrl/langgraph that referenced this pull request on May 6, 2026
…nly gate green Closes the combined Step 1.2b + 1.3 milestone. The ``LANGGRAPH_BACKEND=rust`` filter on ``libs/langgraph/tests/test_pregel.py`` matches **81 tests** (the handoff's "87" estimate was written before the test set drifted; the ``-k "memory and not streaming and not interrupt and not subgraph and not send"`` filter is verbatim). All 81 pass on the first run after sub-step #6c landed — no triage iteration was needed.

What landed
-----------
- ``parity/scripts/run_87_test_gate.sh`` — runnable wrapper that sets ``NO_DOCKER=true`` (skips the redis/postgres fixtures the bridge venv doesn't carry) and ``LANGGRAPH_BACKEND=rust``, points pytest at the filter, and forwards extra args. A single command for re-running the gate locally.
- ``rust/ffi/langgraph-py/pyproject.toml`` — added a ``[gate-87]`` dependency group capturing the four collection-time deps the upstream conftest pulls in (``redis``, ``pytest-mock``, ``syrupy``, ``pycryptodome``). The bridge venv was missing these because that set is what ``libs/langgraph/.venv`` carries for its own test suite, not what the bridge needs for codec parity. Documenting it in the dependency group keeps the install command self-describing (``uv pip install --group gate-87``).
- ``rust/docs/phase-1-followups.md`` — entry langchain-ai#3 (async PyO3 bridge + ``LANGGRAPH_BACKEND=rust`` wiring) marked **closed**. Added the amendment note that the implementation chose subclass + override for ``_RustSyncPregelLoop`` (rather than the literal stand-alone duck-typed shadow class the prose example sketched), with the rationale matching the design discussion at the start of #6b. The async surface stays deferred — ``_RustAsyncPregelLoop`` raises at ``__aenter__`` until a phase that needs streaming / ``astream`` parity owns it.

Bridge-venv setup deltas (one-time, since this commit)
------------------------------------------------------
- ``redis``, ``pytest-mock``, ``syrupy``, ``pycryptodome`` installed via the new ``gate-87`` dependency group.
- ``libs/checkpoint-sqlite`` and ``libs/checkpoint-postgres`` installed in editable mode so the conftest can import ``langgraph.cache.sqlite``. (The other libs were already editable-installed by Phase 0.)

Parity gate (the milestone gate)
--------------------------------
::

    NO_DOCKER=true LANGGRAPH_BACKEND=rust \
      rust/ffi/langgraph-py/.venv/bin/python -m pytest \
      libs/langgraph/tests/test_pregel.py \
      -k "memory and not streaming and not interrupt and not subgraph and not send"

Result: ``81 passed, 376 deselected in 9.98s``.

A sanity check confirmed the Rust runtime is genuinely driving the loop (not a silent fallback to upstream Python): instrumenting ``run_pregel_loop_topology`` with a call counter shows it is invoked on every ``graph.invoke`` under ``LANGGRAPH_BACKEND=rust``, both the ``langgraph.pregel._loop`` and ``langgraph.pregel.main`` namespaces resolve ``SyncPregelLoop`` to ``_RustSyncPregelLoop``, and ``backend.is_active()`` returns ``True``.

What the gate caught
--------------------
Nothing. The 81 tests passed on the first run after the bridge wheel was rebuilt with #6c and the bridge venv had its collection-time deps installed. The handoff explicitly warned to "expect failures to send you back to #4c / langchain-ai#5 / langchain-ai#6 for incremental fixes"; that budget went unused. Plausible reasons:
1. The four channel-translation rejection sites (``CONFIG_KEY_READ`` shadow, panic-stub binop, custom-channel gate, async ``__aenter__``) cleanly cover the corners that would have been the most likely failure surfaces. The 87-test filter's ``-k`` excludes the patterns those rejections would trip on (``streaming``, ``interrupt``, ``subgraph``, ``send``).
2. The local-state shadow ``CONFIG_KEY_READ`` reader the wrapper provides is enough for every conditional-edge test in the filter — none of them reads channels the routing node didn't write.
3. The translation surface from sub-step #4c (1,200+ hypothesis iterations across the 10 stdlib channel classes) was already verified, so the per-class state encoding round-trips cleanly under load.

Test counts
-----------
- Cargo workspace: 220 passed; clippy clean.
- Phase 0: 73/73 corpus + 49 allowlist + strict reject; 58 conformance pass / 0 fail.
- Phase 1 + 1.2b foundation + #4c + langchain-ai#5 + #6a + #6b + #6c (LANGGRAPH_BACKEND unset): 141 passed.
- **Combined Step 1.2b + 1.3 milestone gate (LANGGRAPH_BACKEND=rust): 81 passed / 0 failed.**
Monitor for cycles (i.e., infinite loops) in the computation graph. Note that this cannot be done as a preprocessing step, since one of the design goals is to allow cyclical computation; we only want to bound these cycles so they cannot loop forever.
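Because cycles are legal by design, the monitoring has to happen at runtime: count supersteps and fail once a recursion limit is exceeded, rather than statically detecting cycles up front. A minimal sketch of that idea, with an illustrative exception name and driver function (not this PR's actual implementation):

```python
# Runtime cycle limiting: let cycles run, but cap the number of supersteps.

class GraphRecursionError(RecursionError):
    """Illustrative error raised when the step budget is exhausted."""

def run_with_limit(step_fn, state, recursion_limit=25):
    """Run step_fn until it signals completion (returns None) or the limit trips.

    step_fn(state) returns the next state, or None when the graph is done.
    """
    for _ in range(recursion_limit):
        nxt = step_fn(state)
        if nxt is None:            # graph reached a fixed point / END
            return state
        state = nxt
    raise GraphRecursionError(
        f"recursion limit of {recursion_limit} reached without completing"
    )
```

A terminating cycle finishes normally inside the budget; a genuinely infinite one surfaces as a clear error instead of hanging.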