You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This PR exposes on_batch_complete as an optional parameter on DataDesigner.create, allowing callers to register a per-batch callback without accessing engine internals directly. The change is a thin pass-through: the new parameter is collected at the create boundary and forwarded unchanged to builder.build().
data_designer.py: Adds on_batch_complete: Callable[[Path], None] | None = None to create, updates the docstring, and inserts the argument into the builder.build() call.
test_data_designer.py: Adds a focused unit test that mocks the builder and asserts the callback is forwarded by identity to builder.build().
Confidence Score: 5/5
The change is a straightforward parameter pass-through with no new logic introduced; safe to merge.
The new parameter is optional, defaults to None, and is forwarded verbatim to an existing builder.build() argument that already handles the callback. No existing call sites are affected, and the unit test confirms end-to-end forwarding by identity.
Adds on_batch_complete as an optional keyword argument to DataDesigner.create, forwarding it directly to builder.build(); import of Callable added, docstring updated.
Adds test_create_forwards_on_batch_complete_callback which mocks internal builder methods and verifies the callback is passed through to builder.build() via keyword argument.
PR #663 Review — Expose on_batch_complete via DataDesigner.create
Summary
Threads an existing engine-level on_batch_complete: Callable[[Path], None] | None parameter
out through the public DataDesigner.create method on the interface package, so users can
register a per-batch callback without reaching into engine internals. Closes #662.
The change is minimal (47/-1) and surgical:
packages/data-designer/src/data_designer/interface/data_designer.py: add the kwarg and
forward it to builder.build(...).
packages/data-designer/tests/interface/test_data_designer.py: a unit test asserting the
callback is forwarded with identity to the underlying builder.
Findings
Correctness
Signature matches the engine: dataset_builder.py:223 declares on_batch_complete: Callable[[Path], None] | None = None, identical to the new
interface kwarg. The forwarding call uses keyword args, so positional drift is
not a risk.
Default None preserves existing behavior — a pure additive change with no
caller impact.
Callable is imported from collections.abc (the project convention) rather
than typing, consistent with from __future__ import annotations already in
use in this file.
Project conventions
Follows the layering rules in AGENTS.md: interface forwards into engine; no
reverse imports. ✅
Type annotations and modern syntax (Callable[[Path], None] | None) align with
STYLEGUIDE expectations. ✅
Docstring entry added in the Args: section, matching the existing style of
surrounding parameters.
Test coverage
test_create_forwards_on_batch_complete_callback asserts identity (is) on the
forwarded callback and the num_records kwarg — a tight, focused check. Good.
The test only exercises the wiring; behavior of the callback (invocation per
batch, path correctness) remains covered by engine-level tests. That separation
of concerns is appropriate for an interface-layer change.
Minor nit: the dummy callback uses del path to silence "unused argument"
lint, which is fine but def on_batch_complete(_path: Path) -> None: pass
would be more idiomatic. Not blocking.
Risk / blast radius
Backward compatible: new kwarg with a default of None.
No public-API break; no changes to engine, config, or CLI.
No performance implications — the callback is only invoked on batch completion
in the engine; this PR doesn't add any per-record overhead.
Documentation
Public-facing docstring is updated. Consider whether the Fern docs site
(fern/) has an example or API reference that should mention on_batch_complete on create(). If create() is auto-generated from the
signature, no action needed; otherwise a short note + tiny example would help
users discover the feature. Not blocking.
Verdict
LGTM. Small, well-scoped, test-covered, backward compatible, and consistent with
the layering invariants in AGENTS.md. The only optional follow-up is verifying
the docs site surfaces the new parameter; everything else is good as-is.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
📋 Summary
Exposes
on_batch_completevia theDataDesigner.createmethod so that users can configure the callback without having to reach intoengineinternals.🔗 Related Issue
Closes #662 #662
🔄 Changes
DataDesigner.create🧪 Testing
make testpasses✅ Checklist