fix(sdk-py): make `tools_agent` fake model stateless by nick-hollon-lc · Pull Request #7930 · langchain-ai/langgraph

Nick Hollon (nick-hollon-lc) · 2026-05-28T17:48:38Z

Problem

`test_tools.py::test_tools_async` (sdk-py integration suite) flakes with `AssertionError: expected at least one tool call handle` (`assert []`), while `test_tools_sync` passes. The failure was observed on #7927's CI, but the test predates that PR (added in #7833) and is unchanged on its branch, so this is a pre-existing flake.

Root cause

`integration/graph/tools_agent.py` drives `create_agent` with a module-global `FakeMessagesListChatModel`. That model keeps an instance counter `i` that persists and cycles `0 -> 1 -> 0` across runs. The graph only works if every run starts at an even index, so that the first model call issues the `search` tool call. On the licensed server (multiple queue workers / graph warmup), a run can start mid-cycle at an odd index. The model then replies "done." first, `create_agent` ends after one model call, and zero `tools`-channel events reach the wire. `thread.tool_calls` correctly yields nothing and the assertion fails.

This is why only the async test flaked: it runs before the sync test, absorbs the odd-parity run (one model call), and flips the shared cursor back to even, so the sync test then passes.
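The parity dependence can be illustrated with a minimal, dependency-free sketch. `CyclingFakeModel` and `run_agent` are hypothetical stand-ins written for this illustration (they are not the real `FakeMessagesListChatModel` or `create_agent`), and the string replies are placeholders for the real messages. The sketch only assumes what is described above: a shared cursor that wraps around a two-item response list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CyclingFakeModel:
    """Hypothetical stand-in for a list-backed fake model with a shared cursor."""

    responses: List[str]
    i: int = 0  # persists across runs because the instance is shared

    def invoke(self) -> str:
        reply = self.responses[self.i]
        # Advance and wrap, so two responses cycle 0 -> 1 -> 0.
        self.i = (self.i + 1) % len(self.responses)
        return reply


def run_agent(model: CyclingFakeModel) -> int:
    """Simplified agent loop: return the number of tool calls made in one run."""
    tool_calls = 0
    while True:
        reply = model.invoke()
        if reply == "tool_call:search":
            tool_calls += 1  # the tool runs, then the model is called again
            continue
        return tool_calls  # any other reply ends the run


shared_model = CyclingFakeModel(responses=["tool_call:search", "done."])

even_start = run_agent(shared_model)  # cursor 0: tool call, then "done."
shared_model.i = 1                    # simulate a run that starts mid-cycle
odd_start = run_agent(shared_model)   # cursor 1: "done." immediately
next_run = run_agent(shared_model)    # cursor wrapped back to 0
```

In this sketch the odd-start run makes a single model call and no tool call, and it leaves the cursor at an even index, which matches the observation that the run after the failing one succeeds.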

Evidence

Reproduced the exact graph outside Docker: at even i the run emits 1 tool call + 1 ToolMessage; at odd i it emits 0 of each. Verified the state-based replacement emits exactly one tool call across repeated runs with i flipped each time.
The failing CI run's server logs show the `tools_agent` run succeeded (`run_exec_ms=15`) with no `tools`-channel events: the server emitted none because the graph made no tool call. This is not an SDK delivery bug.

Fix

Derive the reply from conversation state rather than from a cycling response list: issue the `search` tool call until a `ToolMessage` is present, then return a terminating `AIMessage`. The logic is order-independent, so every run emits exactly one tool call regardless of prior invocation count. The model still subclasses `FakeMessagesListChatModel` (not `GenericFakeChatModel`) to preserve the `_stream` behavior that keeps `tool_calls` intact.
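A rough sketch of the state-derived decision, kept dependency-free so it runs anywhere. The `Message` type, `stateless_reply`, and `run_agent` are hypothetical names for this illustration and are not the code merged in this PR. In the real graph the check would presumably be for a `ToolMessage` instance in the message list, and the logic would live in the fake model subclass.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    """Hypothetical minimal message; `kind` is "human", "ai", or "tool"."""

    kind: str
    content: str = ""
    tool_call: str = ""  # name of the requested tool, empty if none


def stateless_reply(messages: List[Message]) -> Message:
    """Choose the reply from the conversation alone; no counter is kept."""
    if any(m.kind == "tool" for m in messages):
        # A tool result is already present, so terminate the run.
        return Message(kind="ai", content="done.")
    # No tool result yet, so request the search tool.
    return Message(kind="ai", tool_call="search")


def run_agent(user_text: str) -> int:
    """Simplified agent loop: return the number of tool calls made in one run."""
    messages = [Message(kind="human", content=user_text)]
    tool_calls = 0
    while True:
        reply = stateless_reply(messages)
        messages.append(reply)
        if not reply.tool_call:
            return tool_calls
        tool_calls += 1
        # Stand-in for executing the tool and recording its result.
        messages.append(Message(kind="tool", content="search result"))


# Repeated runs behave identically because nothing persists between them.
counts = [run_agent("hi") for _ in range(5)]
```

Because the decision depends only on the messages passed in, the number of earlier invocations and the worker that handled them cannot change the outcome of a run.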

Test plan

sdk-py integration test green (the flaky check)
Local: ruff format / ruff check clean; graph compiles.

… flake

`tools_agent.py` drove `create_agent` with a module-global `FakeMessagesListChatModel` whose response cursor `i` persists and cycles `0 -> 1 -> 0` across runs. The test relies on every run starting at an even index so the first model call issues the `search` tool call. On the licensed server (multiple queue workers / graph warmup) a run can start mid-cycle at an odd index, so the model replies "done." first and emits no tool call. The `tools` channel then produces zero events and `test_tools_async` fails with "expected at least one tool call handle".

This is order-dependent, which is why only `test_tools_async` flaked: it runs before `test_tools_sync`, absorbs the odd-parity run (one model call), and resets the shared cursor back to even so the sync test passes.

Derive the reply from conversation state instead: issue the `search` tool call until a `ToolMessage` is present, then a terminating `AIMessage`. This is order-independent, so every run emits exactly one tool call regardless of how many times the model was previously invoked.

github-actions Bot added the internal label May 28, 2026

Mason Daugherty (mdrxy) changed the title ~~fix(sdk-py): make tools_agent fake model stateless to fix integration flake~~ fix(sdk-py): make tools_agent fake model stateless May 28, 2026

Mason Daugherty (mdrxy) approved these changes May 28, 2026

Nick Hollon (nick-hollon-lc) merged commit ea4aa79 into main May 28, 2026
132 of 134 checks passed

Nick Hollon (nick-hollon-lc) deleted the nh/fix-tools-agent-flake branch May 28, 2026 20:42

Nick Hollon (nick-hollon-lc) merged 1 commit into main from nh/fix-tools-agent-flake
