Comparing changes

base repository: NVIDIA-NeMo/DataDesigner
base: v0.5.5
head repository: NVIDIA-NeMo/DataDesigner
compare: v0.5.6
  • 15 commits
  • 70 files changed
  • 6 contributors

Commits on Apr 2, 2026

  1. fix: use --bare and --tools in health probe CLI check (#489)

    The "Verify Claude CLI" step fails on the CI runner because Claude
    Code tries to initialize keychain, LSP, plugins, and CLAUDE.md
    discovery before making the API call. On a bare runner these
    resources don't exist, causing exit code 1.
    
    - Add --bare to skip all initialization and force ANTHROPIC_API_KEY auth
    - Add --tools "" to disable tool definitions (health check doesn't need
      them, and this avoids sending a large payload to the gateway)
    andreatgretel authored Apr 2, 2026
    0d80858

Commits on Apr 6, 2026

  1. 58870bb
  2. f78c4e0
  3. docs: add skip column config option for conditional column generation (#479) (#480)
    
    * plan: add skip_when for conditional column generation (#479)
    
    Adds implementation plan for a `skip_when` field on `SingleColumnConfig`
    that enables conditional column generation. When the Jinja2 expression
    evaluates truthy, the cell is set to None and the generator is skipped.
    Skips auto-propagate through the DAG to downstream columns.
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * plan: remove HopChain example from skip_when plan
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * plan: replace HopChain example with generic product review example
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * plan: add open questions on skip sentinel value and row filtering
    
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    
    * plan: major revision — SkipConfig model, sync engine support, decouple propagation
    
    - Introduce SkipConfig(when, value) as nested model on SingleColumnConfig
    - Move propagate_skip to SingleColumnConfig as independent field, fixing
      bug where columns with no SkipConfig couldn't participate in propagation
    - Add full sync engine implementation (Steps 4a-4d) covering both
      _fan_out_with_threads and _run_full_column_generator dispatch paths
    - Add serialization boundary stripping for both DatasetBatchManager (sync)
      and RowGroupBufferManager (async)
    - Simplify architecture diagrams for readability
    - Update all references, design decisions, verification plan
    
    Made-with: Cursor
    
    * updates
    
    * plan: document get_required_columns for skip propagation
    
    - Explain why propagation must not use get_upstream_columns() once
      skip.when adds DAG edges; add _required_columns and
      get_required_columns() to the execution graph plan
    - Point async _run_cell at get_required_columns for parity with sync
    - Clarify DropSkippedRowsProcessorConfig vs stripping __skipped__ for
      DataFrames; tighten resolved-questions wording
    - Extend DAG/graph verification with gating_col regression case
    
    Refs #479
    
    Made-with: Cursor
    
    * plan: centralize __skipped__ handling in skip_provenance
    
    - Document new skip_provenance.py (key constant, read/write/strip API)
    - Point sync builder, async scheduler, and batch buffers at shared helpers
    - Strip metadata before every DataFrame from buffer dicts, including
      FULL_COLUMN active subsets
    - Split §3 into skip_evaluator vs skip_provenance; extend verification
    
    Refs #479
    
    Made-with: Cursor
    
    * plan: align doc title with SkipConfig / skip.when
    
    Drop legacy skip_when naming in headings and #362 cross-reference.
    
    Refs #479
    
    Made-with: Cursor
    
    * plan: address review — delimiter validation, centralized error handling, caller-owns-deserialization
    
    - SkipConfig._validate_when_syntax now checks find_undeclared_variables
      is non-empty, rejecting expressions without {{ }} delimiters that
      would silently skip every row
    - evaluate_skip_when centralizes try/except so both sync and async
      engines get identical fail-safe behavior on eval errors
    - evaluate_skip_when takes a single pre-deserialized record; caller
      runs deserialize_json_values once and passes to both skip eval and
      generator (no double deserialization, no redundant parameter)
    - Update _should_skip_cell, async _run_cell, Files Modified table,
      and verification section accordingly
    
    Refs #479
    
    Made-with: Cursor
    
    * plan: add get_side_effect_columns accessor to execution graph spec
    
    Document _side_effects_by_producer inverse map and
    get_side_effect_columns() accessor on ExecutionGraph, needed by
    _write_skip_to_record / apply_skip_to_record to clear __trace,
    __reasoning_content, etc. on skip. Added to both Step 2b metadata
    section and Files Modified table.
    
    The __skipped__ leak into active_df (greptile's other P1) was already
    fixed in 7046378 via strip_skip_metadata_from_records.
    
    Refs #479
    
    Made-with: Cursor
    
    ---------
    
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    nabinchha and claude authored Apr 6, 2026
    d4443d7
  4. chore: plan 427, PR 2 of agent-first development plan (#478)

    * save progress
    
    * undo review-code skill change
    
    * delete status file
    
    * small tweaks
    
    * Fix 429 info
    
    * update workind on skill info
    
    * updates
    
    * Update architecture/overview.md
    
    Co-authored-by: Johnny Greco <jogreco@nvidia.com>
    
    * fix: correct symbol names and CLI commands in architecture docs
    
    Address review comments:
    - models.md: describe clients as native httpx adapters, not SDK wrappers
    - agent-introspection.md: use actual family keys (columns, samplers, etc.) not column-types
    - cli.md: use correct command `data-designer config models`
    - plugins.md: SEED_READER not SEED_SOURCE, inject_into_processor_config_type_union
    
    Made-with: Cursor
    
    ---------
    
    Co-authored-by: Johnny Greco <jogreco@nvidia.com>
    nabinchha and johnnygreco authored Apr 6, 2026
    4768a36

Commits on Apr 7, 2026

  1. 7891dd5
  2. fix: prevent skill load failure when data-designer CLI is not installed (#501)
    
    * fix: prevent skill load failure when data-designer CLI is not installed
    
    Append `|| true` to the shell command that resolves the data-designer
    path so it always exits 0. Without this, the skill fails to load
    entirely when the CLI is missing, and the "If blank, see
    Troubleshooting" fallback is never reached.
    
    * fix: use explicit NOT_FOUND sentinel when data-designer CLI is missing
    
    Replace `|| true` (blank output) with `|| echo NOT_FOUND` so the agent
    sees a clear signal. Update the instruction to bold/imperative so it
    actually gets followed.
    
    * fix: move CLI resolution into workflow steps instead of skill preamble
    
    Remove the !`command` substitution from SKILL.md and add a "Resolve CLI
    command" step to both workflows. The agent now runs the lookup itself
    and uses the result as the data-designer executable for all subsequent
    commands. If the command fails, the agent stops and follows
    Troubleshooting.
    
    * fix: use CLI_NOT_FOUND sentinel to avoid triggering agent error-fixing
    
    The resolve command now always exits 0 and outputs CLI_NOT_FOUND when
    the executable is missing, so the agent evaluates a value rather than
    reacting to a shell error.
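
The sentinel approach described above can be sketched in Python. This is a minimal illustration, not the skill's actual shell command: `resolve_cli` is a hypothetical helper, and only the `data-designer` executable name and the `CLI_NOT_FOUND` sentinel come from the commit message.

```python
import shutil

# Sentinel value taken from the commit message.
CLI_NOT_FOUND = "CLI_NOT_FOUND"


def resolve_cli(name: str = "data-designer") -> str:
    """Return the executable path, or the sentinel when it is missing.

    The lookup never raises, so the caller inspects a value instead of
    reacting to an error.
    """
    path = shutil.which(name)
    return path if path is not None else CLI_NOT_FOUND


if __name__ == "__main__":
    result = resolve_cli()
    if result == CLI_NOT_FOUND:
        print("data-designer is not installed; ask the user before installing.")
    else:
        print(f"Using executable at {result}")
```

Returning a value rather than failing mirrors the commit's intent: the missing-CLI case becomes an ordinary branch instead of an error to be "fixed".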
    
    * fix: require user permission before installing data-designer
    
    Update Troubleshooting to ask the user before creating a venv or
    installing packages, instead of attempting it automatically.
    johnnygreco authored Apr 7, 2026
    d3209e8

Commits on Apr 8, 2026

  1. ci: add PR review workflow and recipe for agentic CI (#498)

    * ci: add PR review workflow and recipe
    
    Add the remaining Phase 1 deliverables for the agentic CI plan:
    
    - PR review recipe that composes the existing review-code skill
    - PR review workflow with collaborator-only gate, auth detection,
      pre-flight checks, and re-review label support
    - Mark Phase 1 items complete in the plan (except docs)
    
    * fix: use explicit draft == false instead of ! operator in workflow if
    
    The ! operator in a >- YAML block may cause parsing issues. Use
    explicit comparison instead.
    
    * fix: address review feedback + simplify if condition for debugging
    
    - Fix: only re-review label triggers on labeled events (greptile)
    - Fix: use printf instead of echo -e for prompt assembly (greptile)
    - Debug: simplify if condition to isolate why job is skipping
    
    * debug: set if to true to test runner connectivity
    
    * debug: add job to dump event context for if-condition debugging
    
    * fix: use collaborator API for permission check instead of author_association
    
    author_association in webhook payloads reports NONE when org membership
    is private, causing the job to skip even for members. Replace with a
    gate job that checks collaborator permissions via the API, which works
    regardless of org visibility settings.
    
    * fix: disable prompt caching and skip posting on review failure
    
    - Set DISABLE_PROMPT_CACHING=1 for Bedrock-backed endpoints that don't
      support cache_control parameters
    - Don't post a comment when the review file isn't produced, just emit
      a warning annotation on the workflow run
    
    * fix: rename label to agent-review, remove synchronize trigger
    
    - Rename re-review -> agent-review for clarity
    - Remove synchronize from trigger types so reviews are opt-in on
      subsequent pushes (use the agent-review label to retrigger)
    - Reviews still auto-run on PR open and draft -> ready transitions
    
    * fix: validate PR number input and remove unused auth mode step
    
    * fix: address review feedback - quoting, checkout ordering, stale docs
    
    - Pass all step outputs through env vars instead of direct expression
      injection in shell (PR number, model name)
    - Resolve head SHA before checkout so dispatch doesn't clone at wrong ref
    - Use set -o pipefail + continue-on-error instead of || true
    - Remove stale synchronize references from plan doc
    
    * fix: add specific review guidance for plan docs
    
    * fix: check labeler permission for agent-review on external PRs
    
    For labeled events, check the sender (who added the label) instead of
    the PR author. This lets maintainers authorize agent reviews on PRs
    from external contributors by adding the agent-review label.
    andreatgretel authored Apr 8, 2026
    6b92351
  2. 5f04e5d
  3. docs: add async engine dev note (#490)

    * fix: address review feedback on async engine dev note
    
    - Fix wall-clock claim: 41% -> 22% to match benchmark table
    - Fix dual-model speedup rounding: 1.7x -> 1.6x (10.0/6.1 = 1.64)
    - Fix run_config API: use dd.set_run_config() instead of passing to create()
    
    * docs: add async engine dev note
    
    Add "Async All the Way Down" dev note covering the async task-queue
    scheduler built across PRs #356, #378, #404, #429, #456. Includes
    benchmark results, architecture diagrams, and DAG shape illustrations.
    
    * feat: add docs preview workflow for PRs
    
    Build MkDocs site on PRs that touch docs and deploy to Cloudflare
    Pages. Each PR gets a browseable preview URL posted as a comment.
    Notebook tutorials use placeholder stubs since they require API
    keys to execute.
    
    Requires CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID repo secrets.
    
    * fix: update speedup chart alt text from 1.7x to 1.6x
    
    * docs: improve timeline figure context and labeling
    
    Add DAG subtitle to sync-vs-async timeline figure and bridge the
    surrounding text to explain which workload shape is being shown.
    
    * edits+additions to async-all-the-way-down dev notes
    
    * clarify two semaphore dance
    
    * remove dead link
    
    * replace hero image
    
    * docs: update scale figures with nginx-accurate data and adjust sizing
    
    Regenerate scale-model-timeline and scale-boxplot from nginx access
    logs (column_progress.csv, sync/summary.json) instead of buffered
    execution logs. Optimize both PNGs to palette mode. Adjust figure
    widths and update model timeline commentary.
    
    * add link from owning-the-model-stack to async-dev-note
    
    * docs: address review feedback on async blog post
    
    - Tighten intro to a concise abstract, move pipeline narrative into
      "The Bottleneck Was Structural" section
    - Remove multi-column generators / seed readers paragraph (TMI)
    - Clarify sync engine ran columns sequentially within each batch
    
    ---------
    
    Co-authored-by: Nabin Mulepati <nmulepati@nvidia.com>
    andreatgretel and nabinchha authored Apr 8, 2026
    0e90ea6
  4. fix: use non-blocking dispatch to prevent pipeline starvation (#504) (#505)
    
    Replace blocking semaphore acquire in the dispatch loop with a
    non-blocking try_acquire that breaks out when the semaphore is full.
    This causes the outer loop to re-query the frontier, picking up
    newly-ready downstream tasks instead of draining a stale snapshot.
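
The dispatch pattern described above can be sketched as follows. This is an assumption-laden illustration, not the project's scheduler: it uses `threading.Semaphore.acquire(blocking=False)` as a stand-in for the `try_acquire` mentioned in the commit, and `dispatch_once`, `get_frontier`, and `start_task` are hypothetical names.

```python
import threading
from typing import Callable, Iterable


def dispatch_once(
    get_frontier: Callable[[], Iterable[str]],
    slots: threading.Semaphore,
    start_task: Callable[[str], None],
) -> int:
    """Start ready tasks until the frontier is exhausted or no slot is free.

    Returns the number of tasks started in this pass.
    """
    started = 0
    for task in get_frontier():
        # Non-blocking acquire: returns False immediately when no slot is free.
        if not slots.acquire(blocking=False):
            # Stop here so the caller re-queries the frontier on the next
            # pass instead of waiting on a stale snapshot.
            break
        start_task(task)
        started += 1
    return started


if __name__ == "__main__":
    slots = threading.Semaphore(2)
    started: list[str] = []
    print(dispatch_once(lambda: ["a", "b", "c"], slots, started.append))
    print(started)
```

Because each pass calls `get_frontier()` afresh, a task that became ready while the semaphore was full can be picked up on the next pass ahead of leftovers from the earlier snapshot.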
    
    Fixes #504
    
    Made-with: Cursor
    nabinchha authored Apr 8, 2026
    c27ad62

Commits on Apr 9, 2026

  1. feat: add Pi Coding Agent rollout seed source (#513) (#514)

    Add support for ingesting Pi Coding Agent session artifacts as an agent
    rollout seed source. Pi sessions are tree-structured JSONL files; the
    handler resolves the active conversation path by walking from the last
    entry back to the root via id/parentId links.
    
    Key points:
    - Tree-structured sessions with automatic active-path resolution
    - Entry-level types: model_change, compaction, branch_summary,
      custom_message, thinking_level_change
    - Message roles: user, assistant (inline ToolCall/ThinkingContent/
      TextContent blocks), toolResult, bashExecution (synthesized as
      tool-call pairs), custom, compactionSummary, branchSummary
    - Extract shared normalize_message_content to utils.py (was duplicated
      in Hermes handler)
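
The active-path resolution described above (walk from the last entry back to the root via `id`/`parentId`) can be sketched like this. Only the `id` and `parentId` field names come from the commit message; `resolve_active_path` is a hypothetical function and the rest of the entry schema is assumed for illustration.

```python
import json
from typing import Any


def resolve_active_path(jsonl_text: str) -> list[dict[str, Any]]:
    """Return entries on the active path, ordered root first.

    The active path is found by starting at the last entry in the file
    and following parentId links until an entry has no known parent.
    """
    entries = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    if not entries:
        return []
    by_id = {entry["id"]: entry for entry in entries}

    path: list[dict[str, Any]] = []
    seen: set[Any] = set()
    current = entries[-1]
    # The seen set guards against malformed files that contain a cycle.
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        path.append(current)
        parent_id = current.get("parentId")
        current = by_id.get(parent_id) if parent_id is not None else None

    path.reverse()
    return path


if __name__ == "__main__":
    session = "\n".join(
        json.dumps(entry)
        for entry in [
            {"id": "a", "parentId": None},
            {"id": "b", "parentId": "a"},  # abandoned branch
            {"id": "c", "parentId": "a"},
            {"id": "d", "parentId": "c"},
        ]
    )
    print([entry["id"] for entry in resolve_active_path(session)])
```

Entries on abandoned branches (`b` in the example) are never visited, since the walk only follows parents of the final entry.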
    johnnygreco authored Apr 9, 2026
    fdd5ebb
  2. fix: always return ISO-8601 from datetime postproc (#484) (#512)

    * fix: always return ISO-8601 from datetime postproc (#484)
    
    The DatetimeFormatMixin.postproc heuristics inferred output format from
    value distribution, silently stripping date/time components for small
    datasets or narrow date ranges. Replace with deterministic ISO-8601
    output via vectorized strftime. Users who need custom formats can still
    set convert_to on the SamplerColumnConfig.
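
A minimal sketch of the deterministic behavior described above, assuming second-level precision. It uses a plain Python loop over standard-library `datetime` values rather than the vectorized `strftime` the commit mentions, and `to_iso8601` is a hypothetical helper, not the library's API.

```python
from datetime import datetime

# Fixed output format: the same full timestamp shape for every value,
# regardless of how many records there are or how narrow the range is.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def to_iso8601(values: list[datetime]) -> list[str]:
    """Format every value as a full ISO-8601 timestamp."""
    return [value.strftime(ISO_FORMAT) for value in values]


if __name__ == "__main__":
    # A single record on a year boundary still keeps its date and time parts.
    single = to_iso8601([datetime(2024, 1, 1, 0, 0, 0)])
    print(single)
    # The output round-trips through the standard-library parser.
    print(datetime.fromisoformat(single[0]))
```

Since the format no longer depends on the value distribution, a one-row preview and a large dataset produce strings of the same shape.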
    
    * docs: update convert_to docstring and add DatetimeFormatMixin docstring
    
    The SamplerColumnConfig.convert_to docstring incorrectly stated that
    only "float", "int", or "str" are accepted. Datetime/timedelta samplers
    accept strftime format strings. Also document the ISO-8601 default.
    
    * test: add regression test for #484 via DataDesigner.preview API
    
    Captures the exact reproducer from the issue: a single-record datetime
    preview through the public DataDesigner.preview() interface must return
    a full ISO-8601 timestamp, not a bare year string.
    
    * test: trim redundant datetime tests, align reproducer with issue #484
    
    - Remove postproc_same_day_records (subsumed by same_month + no_convert_to)
    - Remove postproc_always_parseable (subsumed by stdlib_fromisoformat)
    - Remove all_same_month integration test (subsumed by narrow_range_single_day)
    - Update single_record test to use unit="h" matching the issue reproducer
    
    * fix: address review nits — move datetime import to module scope, drop redundant isinstance
    johnnygreco authored Apr 9, 2026
    4a28136
  3. fix: include multi_modal_context columns in required_columns (#520) (#522)
    
    `LLMTextColumnConfig.required_columns` and `ImageColumnConfig.required_columns`
    only extracted dependencies from Jinja2 prompt templates, missing columns
    referenced via `multi_modal_context`. This caused the async engine's execution
    graph to dispatch LLM tasks before their multi-modal seed columns were loaded,
    resulting in KeyError failures under DATA_DESIGNER_ASYNC_ENGINE=1.
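
The fix described above amounts to taking the union of two dependency sources. The sketch below is illustrative only: it approximates Jinja2 variable extraction with a simple regular expression for `{{ name }}` references, and the `required_columns` function and its parameters are hypothetical, not the library's actual signature.

```python
import re
from typing import Iterable

# Simplified stand-in for Jinja2 variable discovery: matches the first
# identifier after an opening "{{".
_TEMPLATE_VAR = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def required_columns(
    prompt: str, multi_modal_context_columns: Iterable[str] = ()
) -> list[str]:
    """Return columns that must exist before this column can be generated.

    Combines columns referenced in the prompt template with columns
    referenced through multi-modal context, preserving first-seen order.
    """
    columns = list(dict.fromkeys(_TEMPLATE_VAR.findall(prompt)))
    for column in multi_modal_context_columns:
        if column not in columns:
            columns.append(column)
    return columns


if __name__ == "__main__":
    print(required_columns("Describe {{ caption }} about {{ topic }}", ["image"]))
```

If the multi-modal columns were left out of this list, a dependency-driven scheduler would have no edge forcing them to load first, which matches the failure the commit describes.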
    nabinchha authored Apr 9, 2026
    fd477a6
  4. docs: add LiteLLM supply-chain incident notice to README (#516)

    * docs: add LiteLLM supply-chain incident notice to README
    
    * Update README.md
    
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    johnnygreco and greptile-apps[bot] authored Apr 9, 2026
    6505ce4