feat: support simple conditional column generation #479

@nabinchha

Problem

DataDesigner's DAG executes every column for every row unconditionally. In multi-stage synthesis pipelines, expensive downstream generation (LLM calls, segmentation, etc.) runs even when an earlier gate column indicates the row should be filtered out.

Today the only workarounds are:

  1. Generate all columns unconditionally and post-filter — wasting LLM calls on rows that will be discarded
  2. Split into multiple DataDesigner.create() calls with intermediate filtering — losing single-pipeline ergonomics

Proposed Feature

Add a skip_when field to column configs that accepts a Jinja2 expression. When the expression evaluates to a truthy value for a row, generation of that column is skipped for the row and the cell is set to None. Skips should auto-propagate through the DAG: any downstream column that depends on a skipped column is also skipped for that row, with no explicit configuration required.

Example Use Case

config_builder.add_column(
    name="complexity_score", column_type="llm-structured", ...
)
config_builder.add_column(
    name="categories",
    column_type="llm-structured",
    skip_when="{{ complexity_score.overall_complexity_score < 6 }}",
    ...
)
# Everything downstream of categories auto-skips — no extra config needed
config_builder.add_column(name="instances", ...)
config_builder.add_column(name="multi_hop_query", ...)
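
To make the intended semantics concrete, here is a minimal, self-contained sketch of per-row skip evaluation and propagation. It is not DataDesigner code: `Column`, `run_row`, `tracked`, and `build_pipeline` are hypothetical names, the `skip_when` value is a plain Python callable standing in for the Jinja2 expression, and the dependency edges (`instances` on `categories`, `multi_hop_query` on `instances`) are assumed for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class Column:
    """Hypothetical stand-in for a column config; not the DataDesigner API."""
    name: str
    generate: Callable[[Row], Any]
    depends_on: Tuple[str, ...] = ()
    # Stand-in for the proposed Jinja2 skip_when expression.
    skip_when: Optional[Callable[[Row], bool]] = None


def run_row(columns: List[Column]) -> Row:
    """Generate one row; `columns` is assumed to be in topological order."""
    row: Row = {}
    skipped = set()
    for col in columns:
        # Propagation: skip if any upstream dependency was skipped.
        upstream_skipped = any(dep in skipped for dep in col.depends_on)
        # Explicit skip: only evaluated when upstream values are present.
        explicit = (
            not upstream_skipped
            and col.skip_when is not None
            and bool(col.skip_when(row))
        )
        if upstream_skipped or explicit:
            row[col.name] = None  # skipped cells are set to None
            skipped.add(col.name)
            continue
        row[col.name] = col.generate(row)
    return row


calls: List[str] = []  # records which "expensive" generators actually ran


def tracked(name: str, value: Any) -> Callable[[Row], Any]:
    """Build a fake generator that logs its invocation and returns `value`."""
    def generate(row: Row) -> Any:
        calls.append(name)
        return value
    return generate


def build_pipeline(score: int) -> List[Column]:
    """Mirror the example above with placeholder generators."""
    return [
        Column("complexity_score",
               lambda row: {"overall_complexity_score": score}),
        Column(
            "categories",
            tracked("categories", ["placeholder"]),
            depends_on=("complexity_score",),
            skip_when=lambda row: (
                row["complexity_score"]["overall_complexity_score"] < 6
            ),
        ),
        Column("instances", tracked("instances", "placeholder"),
               depends_on=("categories",)),
        Column("multi_hop_query", tracked("multi_hop_query", "placeholder"),
               depends_on=("instances",)),
    ]


low = run_row(build_pipeline(3))   # gate fails: downstream columns skipped
high = run_row(build_pipeline(8))  # gate passes: every column is generated
```

In this sketch a row with a score of 3 never invokes the three downstream generators, while a row with a score of 8 runs all of them. One open design question the sketch sidesteps is how a rendered Jinja2 template is turned into a boolean, since a plain string render of a false comparison yields the non-empty string "False".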

Labels: enhancement (New feature or request)