Comparing changes

base repository: NVIDIA-NeMo/DataDesigner
base: v0.5.2
head repository: NVIDIA-NeMo/DataDesigner
compare: v0.5.3
  • 18 commits
  • 77 files changed
  • 5 contributors

Commits on Mar 5, 2026

  1. fix: cache notebook builds to avoid flaky upstream model failures (#370)

    * fix: cache notebook builds to avoid failures from flaky upstream models
    
    The build-notebooks CI executes all tutorial notebooks on every run.
    When an upstream model (e.g. black-forest-labs/flux.2-pro) is down, the
    entire docs build fails even if no notebooks changed.
    
    Add per-notebook caching based on source file SHA-256 hashes. Unchanged
    notebooks are served from cache, and only modified ones are re-executed.
    On the first CI run (empty cache), the workflow seeds the cache from the
    last successful build artifact.
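
The per-notebook hash check described above could be sketched as follows. This is an illustration in Python only; the real workflow step may well be shell, and the `.sha256` sidecar naming and the helpers `needs_rebuild` and `record_hash` are assumptions, not names taken from the repository.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_rebuild(notebook: Path, cache_dir: Path) -> bool:
    """True when no cached hash exists or the source hash has changed.

    Assumes a hypothetical layout: <cache_dir>/<notebook name>.sha256
    """
    hash_file = cache_dir / f"{notebook.name}.sha256"
    if not hash_file.exists():
        return True
    return hash_file.read_text().strip() != file_sha256(notebook)


def record_hash(notebook: Path, cache_dir: Path) -> None:
    """Store the current source hash after a successful execution."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{notebook.name}.sha256").write_text(file_sha256(notebook))
```

Keying the cache on the source hash rather than on timestamps means a fresh checkout in CI does not invalidate every entry.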
    
    Also add a minimal test script (test_flux_image_gen.py) to reproduce the
    flux.2-pro health check failure locally.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: address review comments on notebook caching
    
    - Don't write .sha256 during seeding so changed notebooks are detected
    - Rename TMPDIR to SEED_TMPDIR to avoid shadowing the POSIX env var
    - Use portable sha256 helper (sha256sum with shasum fallback)
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: only seed cache when truly empty, restore hash writing
    
    Skip artifact seeding when a partial cache was restored (it already has
    correct per-file hashes). Only seed + write current hashes when the
    cache dir is completely empty (true bootstrapping).
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: restrict artifact seed lookup to main branch
    
    Prevents seeding from feature branch runs that may have different
    notebook sources.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: add actions:read permission for artifact seeding
    
    The seed step uses gh run list and gh run download which require
    actions:read. Without it, these calls silently fail and the cold-start
    cache bootstrapping never executes.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: only use notebook cache when called from build-docs
    
    Scheduled Monday runs and manual workflow_dispatch should execute all
    notebooks to catch regressions (e.g. library changes that break a
    notebook). Caching is only used via workflow_call (from build-docs)
    where the goal is fast, resilient doc deployment.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * fix: use jq // empty to avoid "null" string on empty run list
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * feat: add use_cache input flag to notebook and docs workflows
    
    Replace event_name-based cache logic with an explicit use_cache boolean
    input. Defaults:
    - build-notebooks: workflow_call=true, dispatch=false, schedule=false
    - build-docs: dispatch=true (toggleable), release=false
    
    This gives full control over caching from the GitHub Actions UI.
    
    ---------
    
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    andreatgretel and claude authored Mar 5, 2026
    Commit 2564834
  2. feat: canonical model client types, protocols, and LiteLLM bridge adapter (#359)
    
    * plans for model facade overhaul
    
    * update plan
    
    * add review
    
    * address feedback + add more details after several self reviews
    
    * update plan doc
    
    * address nits
    
    * Add canonical objects
    
    * self-review feedback + address
    
    * add LiteLLMRouter protocol to strongly type bridge router param
    
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    * simplify some things
    
    * add a protocol for HTTP-response-like objects
    
    * move HttpResponse
    
    * update PR-1 architecture notes for lifecycle and router protocol
    
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    * Address PR #359 feedback: exception wrapping, shared parsing, test improvements
    
    - Wrap all LiteLLM router calls in try/except to normalize raw exceptions
      into canonical ProviderError at the bridge boundary (blocking review item)
    - Extract reusable response-parsing helpers into clients/parsing.py for
      shared use across future native adapters
    - Add async image parsing path using httpx.AsyncClient to avoid blocking
      the event loop in agenerate_image
    - Add retry_after field to ProviderError for future retry engine support
    - Fix _to_int_or_none to parse numeric strings from providers
    - Create test conftest.py with shared mock_router/bridge_client fixtures
    - Parametrize duplicate image generation and error mapping tests
    - Add tests for exception wrapping across all bridge methods
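
A minimal sketch of normalizing exceptions at a boundary, along the lines of the first bullet above. The `ProviderError` class here is a bare stand-in, and `normalize_provider_errors` and `call_router` are hypothetical names chosen for illustration; the bridge's real helpers may differ.

```python
from contextlib import contextmanager
from typing import Iterator


class ProviderError(Exception):
    """Illustrative stand-in; the real class carries more fields."""


@contextmanager
def normalize_provider_errors(operation: str) -> Iterator[None]:
    """Re-raise any non-canonical exception as ProviderError."""
    try:
        yield
    except ProviderError:
        raise  # already canonical, pass through unchanged
    except Exception as exc:
        # Chain the original so the traceback keeps the root cause.
        raise ProviderError(f"{operation} failed: {exc}") from exc


def call_router(fn, *args, **kwargs):
    """Hypothetical bridge method wrapping a single router call."""
    with normalize_provider_errors(getattr(fn, "__name__", "call")):
        return fn(*args, **kwargs)
```

Putting the try/except in a context manager keeps each bridge method to one `with` line, which is one way to avoid repeating the same wrapper in every method.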
    
    * Use contextlib to dry out some code
    
    * Address Greptile feedback: HTTP-date retry-after parsing, docstring clarity
    
    - Parse RFC 7231 HTTP-date strings in Retry-After header (used by
      Azure and Anthropic during rate-limiting) in addition to numeric
      delay-seconds
    - Clarify collect_non_none_optional_fields docstring explaining why
      f.default is None is the correct check for optional field forwarding
    - Add tests for HTTP-date and garbage Retry-After values
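
The two Retry-After forms mentioned above (delay-seconds and HTTP-date) can be handled with the standard library alone. The sketch below is an assumption about how such parsing might look; `parse_retry_after` is a hypothetical name, not the function in the commit.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds, or None if the header is unusable."""
    if value is None:
        return None
    text = value.strip()
    # Form 1: delay-seconds, a non-negative integer.
    if text.isascii() and text.isdigit():
        return float(text)
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return None  # garbage value: let the caller pick a default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", not a negative delay.
    return max(0.0, (when - now).total_seconds())
```

Accepting `now` as a parameter keeps the HTTP-date branch deterministic in tests.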
    
    * Address Greptile feedback: FastAPI detail parsing, comment fixes
    
    - Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
    - Handle list-format detail arrays in _extract_structured_message for
      FastAPI/Pydantic validation errors
    - Document scope boundary for vision content in collect_raw_image_candidates
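
FastAPI reports validation failures as a `detail` list of objects with `loc`, `msg`, and `type` keys, while other errors use a plain string. A hedged sketch of flattening both shapes follows; `extract_detail_message` is a hypothetical helper and the output format is an assumption, not the behavior of `_extract_structured_message`.

```python
from typing import Any, Optional


def extract_detail_message(body: dict[str, Any]) -> Optional[str]:
    """Flatten a 'detail' payload (string or list of error dicts)."""
    detail = body.get("detail")
    if isinstance(detail, str):
        return detail
    if isinstance(detail, list):
        parts = []
        for item in detail:
            if isinstance(item, dict) and "msg" in item:
                # Join the location path, e.g. ["body", "n"] -> "body.n".
                loc = ".".join(str(p) for p in item.get("loc", []))
                parts.append(f"{loc}: {item['msg']}" if loc else str(item["msg"]))
            else:
                parts.append(str(item))
        return "; ".join(parts) or None
    return None
```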
    
    * address feedback
    
    ---------
    
    Co-authored-by: Johnny Greco <jogreco@nvidia.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    3 people authored Mar 5, 2026
    Commit 68a7f71

Commits on Mar 6, 2026

  1. fix: processor artifacts type, discovery, and loading (#366)

    * fix: processor artifacts type annotation, discovery, and loading
    
    - Fix PreviewResults.processor_artifacts type from dict[str, list[str] | str]
      to dict[str, list[dict]], matching the actual data produced by
      df.to_dict(orient="records")
    - Add ArtifactStorage.load_processor_dataset() to centralize loading logic,
      handling both directory-based (batched) and single-file (preview) layouts
    - Add ArtifactStorage.list_processor_names() to discover processor outputs
      from disk rather than iterating config, fixing a bug where processors
      that don't write artifacts (e.g. DropColumnsProcessor) would crash the
      library's preview
    - Simplify DatasetCreationResults.load_processor_dataset() to delegate to
      ArtifactStorage
    
    * fix: deduplicate list_processor_names when both dir and file exist
    
    * refactor: extract standalone functions for processor discovery and loading
    
    Extract list_processor_names() and load_processor_dataset() as module-level
    functions that accept a Path, so consumers can use them without constructing
    an ArtifactStorage. The ArtifactStorage methods now delegate to these.
    
    * refactor: move standalone functions to io_helpers in config package
    
    Move list_processor_names() and load_processor_dataset() to
    data_designer.config.utils.io_helpers so they're usable by any consumer
    that depends on the lightweight config package, without needing the
    engine's ArtifactStorage. The standalone functions raise FileNotFoundError;
    the ArtifactStorage methods wrap this as ArtifactStorageError.
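
A rough sketch of disk-based discovery that covers both layouts and deduplicates names. The on-disk layout assumed here (`<root>/<name>/` for batched output and `<root>/<name>.parquet` for previews) is a guess for illustration; only the function name and the `FileNotFoundError` behavior come from the commit text.

```python
from pathlib import Path


def list_processor_names(root: Path) -> list[str]:
    """Discover processor outputs on disk, one name per processor.

    Assumed (hypothetical) layout: <root>/<name>/ for batched runs and
    <root>/<name>.parquet for previews.
    """
    if not root.is_dir():
        raise FileNotFoundError(f"No processor outputs under {root}")
    names: set[str] = set()
    for entry in root.iterdir():
        if entry.is_dir():
            names.add(entry.name)
        elif entry.suffix == ".parquet":
            names.add(entry.stem)
    # The set deduplicates the case where both layouts exist for one name.
    return sorted(names)
```

Because discovery reads the directory rather than the config, a processor that writes nothing simply does not appear in the result.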
    
    * chore: trim docstrings and deduplicate tests
    
    * fix: align get_processor_file_paths with list_processor_names
    
    Reuse list_processor_names so both methods discover directories and
    single parquet files consistently.
    andreatgretel authored Mar 6, 2026
    Commit c64b0a5
  2. feat: add ExecutionGraph, CompletionTracker, and Task model for async scheduler (#356)
    
    * feat: add ExecutionGraph, CompletionTracker, and Task model for async scheduler
    
    Add the foundational data structures for the async task-queue dataset
    builder (plan #346, PR 1/4):
    
    - ExecutionGraph: column-level static DAG with topological ordering,
      critical path, task counts, cell-dependency resolution, Mermaid output,
      and side-effect column mapping (__trace, __reasoning_content).
    - CompletionTracker: lightweight (column, row_group, row_index) completion
      state with row dropping and ready-task enumeration.
    - Task/TaskResult/TaskTrace: frozen hashable task dataclass, result
      container, and opt-in tracing record.
    
    All three are pure data structures with no side effects on the existing
    codebase. They live in new modules under engine/dataset_builders/utils/
    and are only imported by code introduced in later PRs.
    
    56 unit tests covering graph construction, validation, dependency
    resolution, completion tracking, row drops, and task model semantics.
    
    Refs #346
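
To make the column-level DAG ideas concrete, here is a small sketch of topological ordering and a longest dependency chain using the standard library's `graphlib`. The column names and both function names are made up for illustration; this is not the `ExecutionGraph` implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical column DAG: each column maps to the columns it depends on.
upstream = {
    "topic": set(),
    "question": {"topic"},
    "answer": {"question"},
    "score": {"question", "answer"},
}


def topological_order(deps: dict[str, set[str]]) -> list[str]:
    """Order columns so that every column follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def longest_dependency_chain(deps: dict[str, set[str]]) -> list[str]:
    """Return one longest path through the DAG (empty for an empty graph)."""
    best: dict[str, list[str]] = {}
    for col in topological_order(deps):
        # Parents are always processed first, so best[p] is already set.
        parent_chains = [best[p] for p in deps.get(col, ())]
        best[col] = max(parent_chains, key=len, default=[]) + [col]
    return max(best.values(), key=len, default=[])
```

The `default=[]` arguments are what keep the empty-graph case from raising, which matters for any caller that builds the graph incrementally.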
    
    * refactor: extract readiness helpers and cache topological order
    
    Add `is_ready` and `is_batch_ready` methods to CompletionTracker to
    simplify `ready_tasks`. Cache topological order in ExecutionGraph since
    the graph is immutable after construction. Move DatasetBuilderColumnConfigT
    type alias to multi_column_configs. Fix license header years.
    
    * refactor: address PR review feedback
    
    - Rename all_complete → is_all_complete for boolean method convention
    - Add ColumnName, RowGroup, RowIndex type aliases for readability
    - Add public mutation API to ExecutionGraph (add_column, add_edge,
      set_side_effect, resolve_side_effect) and rewrite build_execution_graph
      to use it instead of private attributes
    - Change TaskTrace.from_task from @staticmethod to @classmethod
    
    * refactor: address remaining PR review feedback
    
    - Rename RowGroup type alias to RowGroupIndex for consistency
    - Convert ExecutionGraph from dataclass to plain class
    - Move build_execution_graph logic to ExecutionGraph.create() classmethod
    
    * refactor: event-driven frontier for CompletionTracker
    
    Replace the poll-based get_ready_tasks (O(C × R × G) per tick) with an
    event-driven frontier maintained on mark_complete/mark_batch_complete/
    drop_row. get_ready_tasks now returns O(frontier) instead of scanning
    all columns × rows × row groups.
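
The difference between polling and an event-driven frontier can be shown with a toy tracker. This sketch works on plain task names rather than (column, row group, row index) triples, and `FrontierTracker` is a hypothetical class, so treat it as an illustration of the idea only.

```python
from collections import defaultdict


class FrontierTracker:
    """Toy tracker: the ready set is updated on completion, never rescanned."""

    def __init__(self, upstream: dict[str, set[str]]) -> None:
        self._pending = {task: set(deps) for task, deps in upstream.items()}
        self._downstream: dict[str, set[str]] = defaultdict(set)
        for task, deps in upstream.items():
            for dep in deps:
                self._downstream[dep].add(task)
        self._done: set[str] = set()
        # Tasks with no dependencies are ready from the start.
        self._frontier = {t for t, deps in self._pending.items() if not deps}

    def mark_complete(self, task: str) -> None:
        self._done.add(task)
        self._frontier.discard(task)
        # Only direct dependents of `task` can become ready here.
        for child in self._downstream[task]:
            self._pending[child].discard(task)
            if not self._pending[child] and child not in self._done:
                self._frontier.add(child)

    def get_ready_tasks(self) -> set[str]:
        """Return a copy of the frontier; cost scales with its size."""
        return set(self._frontier)
```

Each completion touches only the dependents of the finished task, so the per-tick cost no longer depends on the total number of columns, rows, and row groups.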
    
    * refactor: extract ready_ctx fixture in completion tracker tests
    
    - Add ReadyTasksFixture dataclass and ready_ctx pytest fixture to
      deduplicate graph/tracker/dispatched setup across get_ready_tasks tests
    - Align test with ExecutionGraph.create API rename
    - Remove redundant inline comments
    
    * fix: validate tracker args and resolve side-effect name collisions
    
    - CompletionTracker now raises ValueError when graph/row_groups
      are provided without each other
    - resolve_side_effect prefers real columns over aliases when a
      name collision exists
    
    * refactor: address PR review feedback — naming, CellRef, batch semantics
    
    - Fix critical_path() crash on empty graph (early return)
    - Fix is_all_complete batch semantics via _batch_complete tracking set
    - Add row-group size mismatch validation in mark_row_range_complete
    - Add unknown row_group validation in mark_cell_complete
    - Rename methods for verb-prefix convention:
      upstream → get_upstream_columns, downstream → get_downstream_columns,
      critical_path → get_longest_dependency_chain,
      mark_complete → mark_cell_complete,
      mark_batch_complete → mark_row_range_complete
    - Introduce CellRef NamedTuple, remove ColumnName/RowGroupIndex/RowIndex aliases
    - Delete deprecated build_execution_graph() wrapper
    - Return defensive copy from topological_order()
    - Add regression tests for fixed bugs
    
    * fix: prevent completed tasks from re-entering the frontier
    
    Skip adding downstream tasks to the frontier when they are already
    marked complete, avoiding redundant work in CompletionTracker.
    
    * harden completion tracker and execution graph APIs
    
    - Enforce strategy-safe completion: mark_cell_complete rejects
      non-CELL_BY_CELL columns, mark_row_range_complete rejects
      CELL_BY_CELL columns (ValueError in graph mode)
    - Return defensive copies from ExecutionGraph public API
      (columns, get_upstream/downstream_columns)
    - Add re-enqueue regression tests for cell and batch paths
    - Add immutability tests for ExecutionGraph collections
    
    * address second round of review feedback
    
    - Reject duplicate column names in add_column with ValueError
    - Validate buffer_size > 0 in task_count
    - Use _batch_complete for batch upstream readiness checks
    - Remove duplicate section header in test file
    
    * fix AGENTS.md compliance violations
    
    - Add `from __future__ import annotations` to 5 files missing it
    - Rename ExecutionGraph methods to start with action verbs
      (strategy → get_strategy, topological_order → get_topological_order,
      upstream_by_strategy → split_upstream_by_strategy,
      task_count → compute_task_count, cell_dependencies → compute_cell_dependencies)
    - Reorder methods in CompletionTracker and ExecutionGraph:
      __init__ → properties → classmethods → public → private
    
    * address review feedback: CellRef dataclass, batch done-guard, is_complete API
    
    - Convert CellRef from NamedTuple to frozen dataclass
    - Change is_complete to accept CellRef instead of 3 positional args
    - Unify batch done-guards in _enqueue_downstream and _reevaluate_batch_tasks
      to use rg_batch_complete instead of rg_completed
    
    * fix test to use cell_by_cell column for row-group validation test
    
    * address review feedback: constructor, assertions, root columns, test fix
    
    - Split CompletionTracker into __init__() + with_graph() classmethod
    - Replace assert with RuntimeError in private methods
    - Add get_root_columns() to ExecutionGraph
    - Remove "no locks needed" from docstring
    - Fix re-enqueue regression test to exercise the actual scenario
    - Remove unused ready_ctx fixture parameter
    
    * rename CellRef to SliceRef
    
    A slice naturally represents both a single cell and a full row group,
    removing the semantic mismatch of CellRef representing batches.
    
    * add missing tests for drop_row unblock, buffer_size, and duplicate column
    andreatgretel authored Mar 6, 2026
    Commit 9889dc1
  3. fix: handle discriminated unions in oneOf pruning validator (#376)

    * fix: handle discriminated unions in oneOf pruning validator
    
    The pruning validator modifies instances in-place during oneOf
    validation. When trying a wrong variant, it strips properties needed
    by the correct variant, causing all variants to fail.
    
    Add a discriminator-aware oneOf validator that reads the discriminator
    mapping to select the correct variant directly, skipping the
    try-all-variants loop that causes the corruption.
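
The variant selection described above might look like the sketch below. It assumes an OpenAPI-style `discriminator` object with `propertyName` and a `mapping` from tag values to `$ref` strings; `select_variant` is a hypothetical helper, not the validator added in the commit.

```python
from typing import Any, Optional


def select_variant(schema: dict[str, Any], instance: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Pick the oneOf branch named by the discriminator, or None.

    Returning None lets the caller fall back to ordinary oneOf handling
    for schemas without a usable discriminator.
    """
    discriminator = schema.get("discriminator")
    if not discriminator:
        return None
    tag = instance.get(discriminator.get("propertyName"))
    if not isinstance(tag, str):
        return None
    ref = discriminator.get("mapping", {}).get(tag)
    if ref is None:
        return None
    for variant in schema.get("oneOf", []):
        if variant.get("$ref") == ref:
            return variant
    return None
```

Since only the selected branch is validated, a pruning validator never gets the chance to strip properties while trying a branch that was never going to match.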
    
    Fixes #375
    
    * test: add regression test for non-discriminated oneOf fallback
    andreatgretel authored Mar 6, 2026
    Commit 3f8d735
  4. Commit e78be09
  5. chore: add Claude Code skill for code review (#372)

    * Added skill for code review
    
    * Address CR feedback
    
    * more feedback from greptile
    nabinchha authored Mar 6, 2026
    Commit 5f8dba4

Commits on Mar 9, 2026

  1. fix: replace removed DuckDB record_batch() with to_arrow_reader() (#380)

    DuckDB 1.5.0 (released 2026-03-09) removed the record_batch() method
    from the Python Relation API, breaking SeedDatasetColumnGenerator.
    
    Migrate to the stable to_arrow_reader() API and bump the minimum
    DuckDB version to >=1.5.0.
    
    Fixes #379
    andreatgretel authored Mar 9, 2026
    Commit 88989d1
  2. fix: patch litellm ImageURLListItem to make index field optional (#384) (#385)
    
    * monkey patch litellm to make index optional
    
    * address greptile feedback
    
    * fix tests
    
    * fix: add model_rebuild calls for proper test isolation
    
    Rebuild the Pydantic Message model after restoring the TypedDict to
    unpatched state and in the finally block so cached schemas don't leak
    across tests. Also assert that Message construction fails without the
    patch to prove it is necessary.
    
    Made-with: Cursor
    
    * fix greptile's weak suggestion
    nabinchha authored Mar 9, 2026
    Commit 351d701
  3. chore: improve test guidelines in AGENTS.md (#387)

    Add explicit rules for testing public APIs only, requiring type
    annotations on tests/fixtures, and keeping imports at module level.
    Fix fixture reference to reflect the actual pytest_plugins layout.
    
    Made-with: Cursor
    nabinchha authored Mar 9, 2026
    Commit 7384da2

Commits on Mar 10, 2026

  1. fix: raise clear error when all records are dropped during generation (#383)
    
    * fix: raise clear error when all records are dropped during generation (#382)
    
    When every record fails during generation (e.g. LLM or image model errors),
    the empty dataset previously reached the profiler, which raised a misleading
    DatasetProfilerConfigurationError about missing columns. Now both preview()
    and create() check for an empty dataset before profiling and raise a
    DataDesignerGenerationError with an actionable message instead.
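
A minimal sketch of the fail-fast check, using plain lists of dicts in place of a real dataset. The exception class below is a local stand-in that only reuses the name from the commit message, and `ensure_records_survived` and its message wording are hypothetical.

```python
class DataDesignerGenerationError(Exception):
    """Stand-in for the library's exception of the same name."""


def ensure_records_survived(records: list[dict], num_requested: int) -> None:
    """Fail fast with an actionable message before profiling runs."""
    if not records:
        raise DataDesignerGenerationError(
            f"All {num_requested} requested records were dropped during "
            "generation; check model availability and the generation logs."
        )
```

Running this guard before the profiler means the user sees the generation failure itself rather than a downstream complaint about missing columns.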
    
    Made-with: Cursor
    
    * fix: keep load_dataset_with_dropped_columns inside profiling try/except
    
    Moves the call back inside the try/except block so ArtifactStorageError
    is caught and re-raised as DataDesignerProfilingError, preserving the
    documented API contract. Also reduces test num_records to 1.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * Address Andre's feedback
    
    * more small feedback
    
    ---------
    
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    nabinchha and claude authored Mar 10, 2026
    Commit 340087f

Commits on Mar 11, 2026

  1. feat: add async generator migration with symmetric bridging and statefulness (#378)
    
    * feat: add async generator migration with symmetric bridging and statefulness
    
    - Symmetric generate/agenerate bridging in base ColumnGenerator
    - is_stateful property; SeedDatasetColumnGenerator declares True
    - Async wrappers for FromScratchColumnGenerator and ColumnGeneratorFullColumn
    - Native async paths for ImageCellGenerator and EmbeddingCellGenerator
    - CustomColumnGenerator.agenerate with full validation parity
    - Extract _postprocess_result for shared sync/async output validation
    
    * fix: avoid blocking caller on sync bridge timeout
    
    Use explicit pool lifecycle instead of context manager so that
    a TimeoutError releases the caller immediately via
    shutdown(wait=False) rather than blocking on pool.__exit__.
    
    * fix: widen agenerate type signature to match generate
    
    Add @overload declarations so the base agenerate accepts both
    dict and pd.DataFrame, mirroring the existing generate pattern.
    
    * fix: ensure pool shutdown on sync bridge success path
    
    The else clause after return was unreachable, leaking the
    ThreadPoolExecutor on every successful call. Capture the result
    first, shut down the pool, then return.
    
    * fix: use try/finally for pool shutdown in sync bridge
    
    Ensures ThreadPoolExecutor is shut down on all exit paths,
    including non-TimeoutError exceptions from the coroutine.
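
The pool-lifecycle fixes in the three commits above converge on a try/finally shape. The sketch below shows that shape under the assumption that the bridge runs the coroutine with `asyncio.run` on a single worker thread; `run_coroutine_sync` is a hypothetical name.

```python
import asyncio
import concurrent.futures
from typing import Any, Coroutine


def run_coroutine_sync(coro: Coroutine[Any, Any, Any], timeout: float) -> Any:
    """Run `coro` on a worker thread and wait at most `timeout` seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(asyncio.run, coro)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError as exc:
            # Before Python 3.11 this is not the builtin TimeoutError.
            raise TimeoutError(f"timed out after {timeout}s") from exc
    finally:
        # wait=False releases the caller immediately; a `with` block
        # would call shutdown(wait=True) and block on the worker.
        pool.shutdown(wait=False)
```

The `finally` clause runs on the success path, on timeout, and on any exception raised by the coroutine, so the executor is shut down on every exit path. Note that the worker thread itself keeps running until the coroutine finishes.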
    
    * refactor: extract shared validation in ImageCellGenerator
    
    Move duplicated input validation and prompt rendering into
    _prepare_image_inputs, shared by generate and agenerate.
    
    * refactor: extract shared input prep in EmbeddingCellGenerator
    
    * address PR review feedback
    
    - add _is_overridden helper for symmetric generate/agenerate guards
    - move defensive .copy() into base agenerate, remove subclass overrides
    - re-raise as builtin TimeoutError for Python 3.10 compat
    - rename is_stateful to is_order_dependent with improved docstring
    - replace brittle .fget test with object.__new__
    - add async tests for ImageCellGenerator and EmbeddingCellGenerator
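
Symmetric bridging needs a guard so that a subclass overriding neither method does not recurse forever. A toy version of such a guard is sketched below; `ColumnGeneratorBase` is a simplified stand-in, and the real `_is_overridden` helper may have a different signature.

```python
import asyncio


class ColumnGeneratorBase:
    """Toy base class: subclasses override generate OR agenerate."""

    def generate(self, data: dict) -> dict:
        if not _is_overridden(self, "agenerate"):
            raise NotImplementedError("override generate or agenerate")
        # Simplification: assumes no event loop is already running.
        return asyncio.run(self.agenerate(data))

    async def agenerate(self, data: dict) -> dict:
        if not _is_overridden(self, "generate"):
            raise NotImplementedError("override generate or agenerate")
        # Defensive copy so the worker thread cannot mutate caller data.
        return await asyncio.to_thread(self.generate, data.copy())


def _is_overridden(obj: ColumnGeneratorBase, name: str) -> bool:
    """True when obj's class supplies its own implementation of `name`."""
    return getattr(type(obj), name) is not getattr(ColumnGeneratorBase, name)
```

Each base method only delegates when the opposite method has a subclass implementation, so the two defaults can never call each other.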
    andreatgretel authored Mar 11, 2026
    Commit 8fff7c0
  2. docs: add Enterprise Text-to-SQL and Search Agent recipes (#395)

    feat: add Nemotron Super Text-to-SQL and Search Agent recipes

    Add two new recipes derived from the Nemotron Super post-training pipelines:

    Nemotron Super Text-to-SQL:
    - Five-stage pipeline: seeding, prompt generation, schema with distractors,
      dialect-specific SQL, validation + quality scoring
    - 14 conditional samplers (10 industries, 50 topics, complexity-gated task
      types, data quality concepts, knowledge dependencies, 100 style combos)
    - Dialect-specific prompts for SQLite, MySQL, and PostgreSQL
    - 5 LLM judges (prompt, SQL, context, data quality, knowledge) with 15
      scoring dimensions and flat score extraction columns
    - Per-dialect syntax validation via CodeValidatorParams

    Nemotron Super Search Agent:
    - Four-stage pipeline: Wikidata KG seed paths, two-stage riddle generation
      (draft + BrowseComp-style obfuscation), Tavily web search trajectories
      via MCP, structured JSON formatting
    - Tavily hosted MCP endpoint (streamable_http) -- no local server or extra
      dependencies beyond data-designer
    - Full tool-call trace capture (with_trace=ALL_MESSAGES) for SFT data
    - Built-in demo seeds (3 Wikidata paths) for quick testing

    Both recipes include ASCII pipeline diagrams, Nemotron Super context in
    docstrings, dev note links in the markdown pages, and follow existing
    recipe conventions (PEP 723 metadata, --model-alias/--num-records/
    --artifact-path CLI args).
    dhruvnathawani authored Mar 11, 2026
    Commit 7de879a
  3. refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter (#373)
    
    * plans for model facade overhaul
    
    * update plan
    
    * add review
    
    * address feedback + add more details after several self reviews
    
    * update plan doc
    
    * address nits
    
    * Add canonical objects
    
    * self-review feedback + address
    
    * add LiteLLMRouter protocol to strongly type bridge router param
    
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    * simplify some things
    
    * add a protocol for HTTP-response-like objects
    
    * move HttpResponse
    
    * update PR-1 architecture notes for lifecycle and router protocol
    
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    * Address PR #359 feedback: exception wrapping, shared parsing, test improvements
    
    - Wrap all LiteLLM router calls in try/except to normalize raw exceptions
      into canonical ProviderError at the bridge boundary (blocking review item)
    - Extract reusable response-parsing helpers into clients/parsing.py for
      shared use across future native adapters
    - Add async image parsing path using httpx.AsyncClient to avoid blocking
      the event loop in agenerate_image
    - Add retry_after field to ProviderError for future retry engine support
    - Fix _to_int_or_none to parse numeric strings from providers
    - Create test conftest.py with shared mock_router/bridge_client fixtures
    - Parametrize duplicate image generation and error mapping tests
    - Add tests for exception wrapping across all bridge methods
    
    * Use contextlib to dry out some code
    
    * Address Greptile feedback: HTTP-date retry-after parsing, docstring clarity
    
    - Parse RFC 7231 HTTP-date strings in Retry-After header (used by
      Azure and Anthropic during rate-limiting) in addition to numeric
      delay-seconds
    - Clarify collect_non_none_optional_fields docstring explaining why
      f.default is None is the correct check for optional field forwarding
    - Add tests for HTTP-date and garbage Retry-After values
    
    * Address Greptile feedback: FastAPI detail parsing, comment fixes
    
    - Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
    - Handle list-format detail arrays in _extract_structured_message for
      FastAPI/Pydantic validation errors
    - Document scope boundary for vision content in collect_raw_image_candidates
    
    * add PR-2 architecture notes for model facade overhaul
    
    * save progress on pr2
    
    * small refactor
    
    * address feedback
    
    * Address greptile comment in pr1
    
    * refactor ProviderError from dataclass to regular Exception
    
    - Replace @dataclass + __post_init__ with explicit __init__ that calls
      super().__init__ properly, avoiding brittle field-ordering dependency
    - Store cause via __cause__ only, removing the redundant .cause attr
    - Update match pattern in handle_llm_exceptions for non-dataclass type
    - Rename shadowed local `fields` to `optional_fields` in TransportKwargs
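
The refactor described in the bullets above can be sketched as a regular exception with an explicit `__init__`. Only `retry_after` and the use of `__cause__` are taken from the commit text; `message`, `status_code`, and the keyword-only `cause` parameter are assumptions made for the example.

```python
from typing import Optional


class ProviderError(Exception):
    """Illustrative shape only; most field names here are assumed."""

    def __init__(
        self,
        message: str,
        *,
        status_code: Optional[int] = None,
        retry_after: Optional[float] = None,
        cause: Optional[BaseException] = None,
    ) -> None:
        super().__init__(message)  # keeps str(exc) and exc.args well-formed
        self.message = message
        self.status_code = status_code
        self.retry_after = retry_after
        if cause is not None:
            # Same slot that `raise ... from cause` fills; no extra .cause attr.
            self.__cause__ = cause
```

Keyword-only fields avoid the field-ordering fragility that a dataclass-based exception has once defaults and inheritance are involved.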
    
    * Address greptile feedback
    
    * PR feedback
    
    * track usage in finally block for images
    
    * pr feedback
    
    * wrap facade close in try/except
    
    * clean up stray params
    
    * fix stray inclusion of metadata
    
    * small regression fix
    
    * address more feedback
    
    ---------
    
    Co-authored-by: Johnny Greco <jogreco@nvidia.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    3 people authored Mar 11, 2026
    Commit ebd3322

Commits on Mar 12, 2026

  1. Commit eac63a1
  2. feat(cli): bootstrap default configs on CLI startup (#401)

    * feat(cli): bootstrap default configs on command run
    
    * fix(cli): use active interpreter in bootstrap warning
    
    * refactor(cli): simplify bootstrap warning flow
    
    * refactor(cli): bootstrap defaults in main entrypoint
    
    * refactor(cli): keep bootstrap ownership in main
    
    * test(cli): cover lazy dispatch and runtime failure flag
    
    * refactor(cli): remove redundant bootstrap state
    
    * test(cli): assert bootstrap warning includes error
    
    * test: address cli bootstrap review feedback
    johnnygreco authored Mar 12, 2026
    Commit b94b88b
  3. fix: pin chardet<6 to suppress RequestsDependencyWarning (#405)

    requests 2.32.5 asserts chardet<6 at import time, but sqlfluff and
    diff_cover pull in chardet without an upper bound, resolving to 7.1.0
    on fresh installs. Add a workspace constraint to cap chardet until a
    new requests release ships the fix from psf/requests#7220.
    
    Closes #404
    andreatgretel authored Mar 12, 2026
    Commit bca79d8
  4. fix: add chardet<6 constraint to published engine package (#406)

    * fix: add chardet<6 constraint to published engine package
    
    The workspace-level constraint-dependencies in [tool.uv] is not included
    in published wheel metadata, so PyPI consumers still get chardet>=6 via
    sqlfluff, triggering RequestsDependencyWarning from requests<2.33.
    
    Move the pin to an explicit dependency in data-designer-engine so it
    ships with the package.
    
    * chore: remove redundant workspace-level chardet constraint
    
    Now that chardet<6 is an explicit dependency of data-designer-engine,
    the workspace constraint-dependencies entry is no longer needed.
    johnnygreco authored Mar 12, 2026
    Commit 447ed59