feat(langgraph): add before_builtins opt-in for stream transformers #7882

Merged
Nick Hollon (nick-hollon-lc) merged 2 commits into main from nh/stream-transformer-before-builtins
May 21, 2026
@nick-hollon-lc Nick Hollon (nick-hollon-lc) commented May 21, 2026

Contributor

Adds a before_builtins: ClassVar[bool] = False flag on StreamTransformer. When True, the mux registers the transformer ahead of the rest, preserving relative order within each lane.

Motivation

Content-mutating transformers — PII redaction, content filters, profanity scrubbers — need to run before built-in transformers like MessagesTransformer, which eagerly snapshots text fields (delta.text, delta.reasoning) into its projection. Today there's no way to register a transformer ahead of the built-ins, so any mutation a user transformer makes to delta.text lands too late: MessagesTransformer has already pushed the original string into the ChatModelStream text accumulator.

Motivating use case: PII redaction middleware in langchain (see langchain-ai/langchain#37591). Stream-level redaction wants to mutate delta.text before any consumer projection captures it. Without before_builtins, the only workable Python-side approach is wrap_model_call, which is heavier and only catches model output (not tool deltas, custom events, or subgraph outputs).

API

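from typing import ClassVar

# StreamTransformer is the base class this PR modifies; its import is omitted here.
class _PIIRedactor(StreamTransformer):
    # Opt in to registration ahead of the built-in transformers.
    before_builtins: ClassVar[bool] = True

    def process(self, event):
        # mutate delta.text in place; MessagesTransformer sees the redacted string
        ...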

The flag is a class attribute so transformers carry the placement decision with them — callers wiring middleware don't need to know whether the transformer needs pre-lane placement; the transformer itself declares it.
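
To make the effect of the flag concrete, here is a minimal, self-contained sketch using toy stand-ins rather than the real langgraph classes. `ToyTransformer`, `ToyAccumulator`, `ToyRedactor`, `run`, and the `{"delta": {"text": ...}}` event shape are all hypothetical; the only thing taken from this PR is the idea of a `before_builtins` class attribute that the wiring code reads.

```python
from typing import ClassVar, List


class ToyTransformer:
    """Stand-in for StreamTransformer; NOT the langgraph class."""

    before_builtins: ClassVar[bool] = False

    def process(self, event: dict) -> None:
        raise NotImplementedError


class ToyAccumulator(ToyTransformer):
    """Plays the role of a built-in that eagerly snapshots text."""

    def __init__(self) -> None:
        self.text = ""

    def process(self, event: dict) -> None:
        # The string is copied right away, so later mutations are never seen.
        self.text += event["delta"]["text"]


class ToyRedactor(ToyTransformer):
    """Declares its own placement; the caller does not need to know."""

    before_builtins: ClassVar[bool] = True

    def process(self, event: dict) -> None:
        delta = event["delta"]
        delta["text"] = delta["text"].replace("555-0100", "[REDACTED]")


def run(transformers: List[ToyTransformer], chunks: List[str]) -> None:
    # sorted() is stable: opted-in transformers move to the front and
    # everything else keeps its supplied relative order.
    ordered = sorted(transformers, key=lambda t: not t.before_builtins)
    for chunk in chunks:
        event = {"delta": {"text": chunk}}  # hypothetical event shape
        for transformer in ordered:
            transformer.process(event)


acc = ToyAccumulator()
run([acc, ToyRedactor()], ["call ", "555-0100"])  # redactor supplied second
print(acc.text)  # call [REDACTED]
```

Even though the redactor is supplied after the accumulator, the accumulator only ever sees the redacted string. With the flag left at `False`, the same call would accumulate the original text.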

Ordering contract

Within each lane, the supplied order is preserved. The full registration order ends up as:

  1. All before_builtins=True factories, in supplied order
  2. All other factories, in supplied order
  3. Pre-built transformers= instances, partitioned the same way

Built-ins keep the default before_builtins=False, so registration order is identical to before for anyone who hasn't opted in. No backwards-compatibility break.
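
The ordering contract above can be read as a stable partition applied twice, once to the factories and once to the pre-built instances. The sketch below is an illustration under that reading, not the code in `_mux.py`: `lane_partition`, `registration_order`, and the toy classes are hypothetical names, and it assumes factories are classes that carry the flag and that built-in factories sit in the same list as user-supplied ones.

```python
from typing import ClassVar, List, Sequence


class ToyBase:
    before_builtins: ClassVar[bool] = False


class Messages(ToyBase):  # stands in for a built-in (default False)
    pass


class Audit(ToyBase):
    pass


class Redactor(ToyBase):
    before_builtins: ClassVar[bool] = True


class Filter(ToyBase):
    before_builtins: ClassVar[bool] = True


class Logger(ToyBase):
    pass


class Scrubber(ToyBase):
    before_builtins: ClassVar[bool] = True


def lane_partition(items: Sequence) -> List:
    """Opted-in items first, then the rest; supplied order kept in each lane."""
    pre = [x for x in items if getattr(x, "before_builtins", False)]
    rest = [x for x in items if not getattr(x, "before_builtins", False)]
    return pre + rest


def registration_order(factories: Sequence, instances: Sequence) -> List:
    # Factories first (steps 1 and 2), then instances partitioned the same way (step 3).
    return lane_partition(factories) + lane_partition(instances)


def label(x) -> str:
    return x.__name__ if isinstance(x, type) else type(x).__name__


order = registration_order(
    [Messages, Audit, Redactor, Filter],
    [Logger(), Scrubber()],
)
print([label(x) for x in order])
# ['Redactor', 'Filter', 'Messages', 'Audit', 'Scrubber', 'Logger']
```

When nothing opts in, both partitions return their input unchanged, which matches the claim that registration order is identical for existing users.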

Foot-gun (documented on the class)

Pre-lane transformers see tasks events before LifecycleTransformer and SubgraphTransformer consume them. Mutating event["params"]["namespace"] or the data dict's id / result / error / interrupts fields will desync their bookkeeping. Observe freely; mutate only fields no built-in reads.

The canonical safe mutation is delta.text (and delta.reasoning) on messages events — exactly the case content-filter transformers need.
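
One way to stay on the safe side of this foot-gun is to gate every mutation on the event kind and touch only the two text fields. The sketch below assumes a hypothetical event shape: a dict whose `"method"` key names the channel and whose `params["data"]` is a delta object with `text` and `reasoning` attributes. Only `event["params"]["namespace"]` and the field names come from the description above; `redact_messages_only` and the email pattern are illustrative.

```python
import copy
import re
from types import SimpleNamespace

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_messages_only(event: dict) -> None:
    """Mutate delta.text / delta.reasoning on messages events; observe everything else."""
    # ASSUMPTION: "method" identifies the channel; the real key may differ.
    if event.get("method") != "messages":
        return  # tasks events (id, result, error, interrupts, namespace) stay untouched
    delta = event["params"]["data"]
    for field in ("text", "reasoning"):
        value = getattr(delta, field, None)
        if isinstance(value, str):
            setattr(delta, field, EMAIL.sub("[EMAIL]", value))


messages_event = {
    "method": "messages",
    "params": {
        "namespace": ["agent"],
        "data": SimpleNamespace(text="mail bob@example.com now", reasoning=None),
    },
}
tasks_event = {
    "method": "tasks",
    "params": {"namespace": ["agent"], "data": {"id": "t1", "result": "bob@example.com"}},
}
tasks_before = copy.deepcopy(tasks_event)

redact_messages_only(messages_event)
redact_messages_only(tasks_event)

print(messages_event["params"]["data"].text)  # mail [EMAIL] now
print(tasks_event == tasks_before)  # True
```

The tasks event passes through byte-for-byte, so bookkeeping that depends on its fields is unaffected in this toy setup.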

Tests

tests/test_stream_before_builtins.py covers:

  • Pre-lane factories register before others, and run first on dispatch
  • Within-lane order is preserved
  • A pre-lane redactor is registered before MessagesTransformer even when supplied second
  • A pre-lane observer doesn't break LifecycleTransformer's bookkeeping (read-only observation is safe)
  • Default is False on the base class and all built-ins
  • End-to-end: a pre-lane redactor mutating delta.text results in the redacted string landing in MessagesTransformer's text accumulator — the actual proof that pre-lane placement does what it's supposed to do

Existing stream-transformer tests (test_stream_data_transformers, test_stream_messages_transformer, test_stream_lifecycle_transformer, test_stream_subgraph_transformer — 133 tests) all still pass.

Adds a `before_builtins: ClassVar[bool] = False` flag on
`StreamTransformer`. When `True`, the mux registers the transformer
ahead of the rest, preserving relative order within each lane. This
lets content-mutating transformers (PII redaction, content filters,
etc.) run before built-ins like `MessagesTransformer` that eagerly
snapshot text fields into their projections.

Behavior is unchanged for existing transformers — the default is
`False` and built-ins keep it `False`, so registration order is
identical to before for anyone who hasn't opted in.

The class docstring documents the foot-gun: pre-lane transformers
see `tasks` events before `LifecycleTransformer` /
`SubgraphTransformer` consume them, so mutating
`event["params"]["namespace"]` or the data dict's
`id` / `result` / `error` / `interrupts` fields will desync their
bookkeeping. Observe freely; mutate only fields no built-in reads
(`delta.text` on `messages` events is the canonical case).
@nick-hollon-lc Nick Hollon (nick-hollon-lc) marked this pull request as ready for review May 21, 2026 15:27
Comment thread libs/langgraph/langgraph/stream/_mux.py Outdated
@nick-hollon-lc Nick Hollon (nick-hollon-lc) merged commit 8215a9d into main May 21, 2026
67 checks passed
@nick-hollon-lc Nick Hollon (nick-hollon-lc) deleted the nh/stream-transformer-before-builtins branch May 21, 2026 15:54
Nick Hollon (nick-hollon-lc) added a commit that referenced this pull request May 21, 2026
releasing 1.2.1

Notable changes since 1.2.0:

- feat(langgraph): add `before_builtins` opt-in for stream transformers
(#7882)
- fix(langgraph): keep tool results out of v3 messages (#7838)
- chore(deps): bump langsmith from 0.7.31 to 0.8.0 (#7788)
- chore(deps): bump idna from 3.11 to 3.15 (#7866)
Nick Hollon (nick-hollon-lc) added a commit that referenced this pull request May 27, 2026
releasing 1.2.1

Notable changes since 1.2.0:

- feat(langgraph): add `before_builtins` opt-in for stream transformers
(#7882)
- fix(langgraph): keep tool results out of v3 messages (#7838)
- chore(deps): bump langsmith from 0.7.31 to 0.8.0 (#7788)
- chore(deps): bump idna from 3.11 to 3.15 (#7866)