Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing #629

@nabinchha

Describe the bug

When a Jinja prompt template references a field that is None, an empty string, or missing from the row's record, the user sees a cryptic and unactionable error:

🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Two distinct render-time failures collapse into the same unhelpful message:

Bug 1 — empty render (the more common case): when {{ x }} renders to '' (because x is '', is missing from the record dict, or is a chain that resolves to empty), the SECURE Jinja path raises a sensible internal error ("User template renders to empty text."). sanitize_user_exceptions then strips that detail and replaces it with the generic "User provided prompt generation template is invalid." The user has no way to know:

  • which field caused the issue,
  • which row caused it,
  • or what to do about it.

Bug 3 — raw UndefinedError for missing nested attributes: when {{ person.address.street }} is evaluated and address is missing from the person dict, a raw Jinja UndefinedError (e.g. 'dict object' has no attribute 'address') leaks all the way up because the sanitizer only catches UserTemplateError / TemplateSyntaxError. The root cause and the right fix are the same as for Bug 1, but the failure currently surfaces differently and is just as confusing.
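The split between the two failure modes can be seen with stock Jinja2 alone. The sketch below uses a plain jinja2.Environment, not the project's SECURE sandbox, and try_render is a hypothetical helper written for this illustration. It only shows why one missing level renders silently to '' while a second missing level raises.

```python
from jinja2 import Environment
from jinja2.exceptions import UndefinedError

# Stock Jinja2 environment; an assumption standing in for the SECURE engine.
env = Environment()


def try_render(source: str, record: dict) -> str:
    """Render `source` against `record` and describe what happened."""
    try:
        text = env.from_string(source).render(**record)
    except UndefinedError as exc:
        return f"UndefinedError: {exc}"
    return f"rendered {text!r}"


# One missing level: the lookup yields an Undefined, which renders as ''.
missing_key = try_render(
    "{{ person.preferred_english_name }}",
    {"person": {"first_name": "John", "last_name": "Doe"}},
)

# Two missing levels: attribute access on that Undefined raises.
missing_nested = try_render("Hi {{ person.address.street }}", {"person": {}})

print(missing_key)
print(missing_nested)
```

In the first case nothing raises inside Jinja itself, so the empty-text check is the only place the problem can be detected. In the second case Jinja raises before any text is produced.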

This is common in two real-world patterns:

  • Person sampler with non-en_SG locales — fields like preferred_english_name only exist for en_SG personas (see PII_FIELDS in packages/data-designer-config/src/data_designer/config/utils/constants.py). Templates that reference them break for any other locale.
  • Seed datasets with sparse columns — real-world tabular data has empty cells everywhere; any template referencing a sparse column hits this on those rows.

Steps/Code to reproduce bug

from data_designer.engine.processing.ginja.environment import WithJinja2UserTemplateRendering
from data_designer.config.run_config import JinjaRenderingEngine


class Demo(WithJinja2UserTemplateRendering):
    def __init__(self):
        self._jinja_rendering_engine = JinjaRenderingEngine.SECURE


# Bug 1 — empty render via missing key
demo = Demo()
demo.prepare_jinja2_template_renderer(
    "{{ person.preferred_english_name }}",
    dataset_variables=["person"],
)
demo.render_template({"person": {"first_name": "John", "last_name": "Doe"}})
# UserTemplateError: User provided prompt generation template is invalid.

# Bug 3 — raw UndefinedError on missing nested key
demo = Demo()
demo.prepare_jinja2_template_renderer(
    "Hi {{ person.address.street }}",
    dataset_variables=["person"],
)
demo.render_template({"person": {}})
# UndefinedError: 'dict object' has no attribute 'address'

End-to-end inside _run_batch, the user sees:

DataDesignerGenerationError: 🛑 Error generating preview dataset: 🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Expected behavior

The error should be actionable and identify:

  • the column being processed (already happens via _run_batch),
  • which referenced field(s) in the row were None, empty, or missing,
  • the recommended remedies: Jinja conditional fallback and SkipConfig.

Example shape we should aim for:

🛑 Failed to process column 'preferred_english_name':
Template rendered to empty text. This usually happens when one or more referenced fields are None, empty, or missing.

Likely culprits in this row:
  - person.preferred_english_name = None

To handle missing values, you can:

  1. Provide a fallback in your template using a Jinja conditional:
       {{ person.preferred_english_name if person.preferred_english_name else 'N/A' }}

  2. Skip rows where required fields are missing using SkipConfig:
       skip=SkipConfig(when="{{ person.preferred_english_name is none }}")
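As a quick check of the first remedy, the conditional fallback can be exercised with stock Jinja2 against the cases named above (missing, None, empty, present). This is a sketch under assumptions: it uses a plain jinja2.Environment rather than the SECURE engine, and render_name is a hypothetical helper.

```python
from jinja2 import Environment

env = Environment()  # stock Jinja2; the SECURE engine may restrict more

# The fallback expression from the proposed error message.
FALLBACK = env.from_string(
    "{{ person.preferred_english_name if person.preferred_english_name else 'N/A' }}"
)


def render_name(person: dict) -> str:
    """Render the fallback template for one person record."""
    return FALLBACK.render(person=person)


results = {
    "missing": render_name({"first_name": "John"}),
    "none": render_name({"preferred_english_name": None}),
    "empty": render_name({"preferred_english_name": ""}),
    "present": render_name({"preferred_english_name": "Johnny"}),
}
print(results)
```

The conditional works because a missing key, None, and '' are all falsy in a Jinja expression, so every one of them takes the else branch.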

Proposed fix (high level)

  1. New EmptyTemplateRenderError(UserTemplateError) subclass that bypasses sanitize_user_exceptions (mirrors the existing UserTemplateUnsupportedFiltersError pattern).
  2. AST helper that extracts dotted/bracketed access chains from the parsed template and pairs each chain with the value it would resolve to in the current row.
  3. _assert_rendered_text_not_empty and a new UndefinedError branch in UserTemplateSandboxEnvironment.safe_render use the helper to build an actionable, copy-pasteable error message naming the offending chain(s).
  4. Sanitizer bypass for EmptyTemplateRenderError so the message survives.
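A minimal sketch of what the AST helper in step 2 might look like, assuming stock Jinja2 node types (nodes.Name, nodes.Getattr, nodes.Getitem). The function names extract_chains, resolve, and likely_culprits are hypothetical and are not the project's API. The sketch resolves chains by subscripting only and does not account for loop variables or {% set %} targets, which a real implementation would need to exclude.

```python
from __future__ import annotations

from typing import Any

from jinja2 import Environment, nodes

_MISSING = object()  # sentinel: the chain could not be resolved at all


def _chain_parts(node: nodes.Node) -> list[Any] | None:
    """Return ['person', 'address', 'street'] for person.address.street, else None."""
    if isinstance(node, nodes.Name):
        return [node.name]
    if isinstance(node, nodes.Getattr):
        head = _chain_parts(node.node)
        return None if head is None else head + [node.attr]
    if isinstance(node, nodes.Getitem) and isinstance(node.arg, nodes.Const):
        head = _chain_parts(node.node)
        return None if head is None else head + [node.arg.value]
    return None


def extract_chains(source: str) -> list[tuple[Any, ...]]:
    """Collect the outermost access chains in template order, without duplicates."""
    chains: list[tuple[Any, ...]] = []

    def visit(node: nodes.Node) -> None:
        parts = _chain_parts(node)
        if parts is not None:
            if tuple(parts) not in chains:
                chains.append(tuple(parts))
            return  # do not descend: inner links belong to this chain
        for child in node.iter_child_nodes():
            visit(child)

    visit(Environment().parse(source))
    return chains


def resolve(parts: tuple[Any, ...], record: dict) -> Any:
    """Follow the chain through nested dicts/lists; _MISSING if any link is absent."""
    current: Any = record
    for part in parts:
        try:
            current = current[part]
        except (KeyError, IndexError, TypeError):
            return _MISSING
    return current


def likely_culprits(source: str, record: dict) -> list[str]:
    """List chains whose value in this row is missing, None, or ''."""
    culprits = []
    for parts in extract_chains(source):
        value = resolve(parts, record)
        if value is _MISSING or value is None or value == "":
            shown = "<missing>" if value is _MISSING else repr(value)
            culprits.append(f"{'.'.join(map(str, parts))} = {shown}")
    return culprits


print(likely_culprits(
    "{{ person.preferred_english_name }}",
    {"person": {"first_name": "John", "last_name": "Doe"}},
))
```

Recording only the outermost chain keeps the message focused on what the template author actually wrote, rather than on every intermediate lookup.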

End-to-end propagation already prepends the column name via _run_batch, so no other layers need changes.
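The bypass in steps 1 and 4, together with the column-name prefix, can be sketched in plain Python. Everything below is a stand-in written for illustration: the real sanitize_user_exceptions, UserTemplateError, and _run_batch live in the project and may have different signatures. Here the sanitizer is assumed to be a decorator, and run_column is a hypothetical substitute for the outer layer.

```python
import functools

GENERIC = "User provided prompt generation template is invalid."


class UserTemplateError(Exception):
    """Stand-in for the project's base user-template error."""


class EmptyTemplateRenderError(UserTemplateError):
    """Proposed subclass whose message is already safe to show to the user."""


def sanitize_user_exceptions(func):
    """Stand-in sanitizer: hides details unless the error opts out."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except EmptyTemplateRenderError:
            raise  # bypass: keep the actionable message intact
        except UserTemplateError:
            raise UserTemplateError(GENERIC) from None

    return wrapper


@sanitize_user_exceptions
def render(text: str, *, actionable: bool) -> str:
    """Pretend renderer: fails when the rendered text is empty."""
    if text.strip():
        return text
    if actionable:
        raise EmptyTemplateRenderError(
            "Template rendered to empty text. Likely culprits in this row: "
            "person.preferred_english_name = None"
        )
    raise UserTemplateError("User template renders to empty text.")


def run_column(column: str, text: str, *, actionable: bool) -> str:
    """Hypothetical outer layer that prepends the column name."""
    try:
        return render(text, actionable=actionable)
    except UserTemplateError as exc:
        return f"Failed to process column {column!r}: {exc}"


print(run_column("preferred_english_name", "", actionable=False))
print(run_column("preferred_english_name", "", actionable=True))
```

Because the new error subclasses UserTemplateError, an outer handler that catches the base class keeps working unchanged, which is consistent with the claim that no other layers need changes.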

Affected files
