docs: add structured outputs SDG dev notes #338

Merged: dhruvnathawani merged 18 commits into main from dhruv/devnotes/structured-outputs on Feb 25, 2026

dhruvnathawani commented Feb 19, 2026

Summary

Add a dev note documenting the structured outputs SDG pipeline used to generate training data for Nemotron Nano v3's structured output capabilities.

What's in the post

  • Motivation: Why structured output reliability matters for agentic AI applications (14.81% error rate on baseline)
  • Benchmark results: JSONSchemaBench (80.2% → 86.9%) and StructEval-Text (64.5% → 72.1%) with per-format breakdown (CSV, JSON, TOML, XML, YAML)
  • Pipeline walkthrough: Seed data → diversity samplers → schema generation → conversation → structured output → rejection sampling (3x rollouts) → programmatic validation
  • ASCII pipeline diagram showing the 4-stage flow
  • Screenshot of display_sample_record() output showing a complete generated record
  • Discussion of LLMStructuredColumnConfig vs dynamic per-record schemas
  • Published dataset: nvidia/Nemotron-RL-instruction_following-structured_outputs (9,949 samples, CC BY 4.0)
  • Caveats: TOML/XML challenges, schema depth diminishing returns
  • Collapsible demo script using default DD config (pip install + run)

Files changed

  • docs/devnotes/posts/structured-outputs.md (new)
  • docs/devnotes/posts/images/structured-outputs-sample-record.png (new)
  • docs/devnotes/.authors.yml (added dnathawani)

dhruvnathawani changed the title from "devnotes: Add Structured Outputs SDG Blog Post" to "docs: Add Structured Outputs SDG dev notes" on Feb 19, 2026
dhruvnathawani changed the title from "docs: Add Structured Outputs SDG dev notes" to "docs: add structured outputs SDG dev notes" on Feb 19, 2026
dhruvnathawani marked this pull request as ready for review on February 19, 2026, 07:41
dhruvnathawani requested a review from a team as a code owner on February 19, 2026, 07:41

greptile-apps Bot commented Feb 19, 2026

Greptile Summary

Adds comprehensive documentation for the structured outputs SDG pipeline used to generate training data for Nemotron Nano v3's structured output capabilities.

The post includes:

  • Benchmark results showing significant improvements (JSONSchemaBench: 80.2% → 86.9%, StructEval-Text: 64.5% → 72.1%)
  • Detailed 4-stage pipeline architecture with ASCII diagram
  • Technical walkthrough of schema generation, diversity sampling, multi-rollout generation, and rejection sampling
  • Working demo code with installation instructions
  • Link to published dataset on HuggingFace (9,949 samples, CC BY 4.0)
  • Discussion of design choices and caveats

The documentation is well-structured, technically accurate, and follows the existing devnotes format established in other posts.

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • Documentation-only PR with three well-formed files: a properly formatted markdown devnote, a valid screenshot image, and a clean author registry update. No code changes or functional modifications.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/devnotes/.authors.yml | Adds new author dnathawani to the documentation authors list with proper formatting |
| docs/devnotes/posts/images/structured-outputs-sample-record.png | Screenshot showing sample Data Designer output with seed columns, schema, conversation, and validation results |
| docs/devnotes/posts/structured-outputs.md | Comprehensive dev note documenting structured output SDG pipeline with benchmarks, architecture diagram, and working demo code |

Last reviewed commit: f3765f3

┌─────────────────────────────────────────────────────────────────┐
│ STAGE 2: DIVERSITY CONTROLS │

Contributor:
I would suggest asking your AI to make the boxes a bit wider to avoid the warping. That would make them look nicer.

Contributor (author):
Good suggestion, done


---

## **Step 1: Seed Data and Schema Generation**

Contributor:
Many headings use ##. Maybe not all of them need to be headings; some could just be bold.

Contributor (author):
Changed

3. **Diversity at every level.** Diverse topics, diverse schemas (depth/width/rigidity), diverse formats, diverse prompts. Each dimension independently improves robustness.
4. **Rejection sampling is cheap insurance.** 3x rollouts push per-record validity from ~80% to >95%. The marginal token cost is small compared to the quality gain.
5. **Validation must be programmatic.** LLM judges assess *design quality* but cannot reliably detect *schema violations*. `jsonschema` + format parsers are non-negotiable.
6. **The hardest formats need the most data.** TOML and XML lag behind JSON and YAML. The pipeline makes it easy to oversample hard formats.
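
To make points 4 and 5 concrete, here is a minimal Python sketch of rejection sampling behind a programmatic well-formedness check. It is not the pipeline's actual code: `PARSERS`, `first_valid`, and `expected_validity` are hypothetical names, only standard-library parsers are used (YAML would need a third-party parser), and the independence assumption in `expected_validity` is a simplification. Schema conformance, the `jsonschema` step named above, would be layered on top of the parse check.

```python
import json
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, Optional

try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    tomllib = None


def _well_formed(parse: Callable[[str], object]) -> Callable[[str], bool]:
    """Wrap a parser so it reports success or failure instead of raising."""
    def check(text: str) -> bool:
        try:
            parse(text)
            return True
        except (ValueError, ET.ParseError):
            # JSONDecodeError and TOMLDecodeError both subclass ValueError.
            return False
    return check


# Hypothetical registry: one well-formedness check per output format.
PARSERS: Dict[str, Callable[[str], bool]] = {
    "json": _well_formed(json.loads),
    "xml": _well_formed(ET.fromstring),
}
if tomllib is not None:
    PARSERS["toml"] = _well_formed(tomllib.loads)


def first_valid(candidates: Iterable[str], fmt: str) -> Optional[str]:
    """Rejection sampling: keep the first rollout that parses, else None."""
    check = PARSERS[fmt]
    for text in candidates:
        if check(text):
            return text
    return None


def expected_validity(per_rollout: float, rollouts: int) -> float:
    """Chance that at least one rollout is valid, ASSUMING independence."""
    return 1.0 - (1.0 - per_rollout) ** rollouts


# Three rollouts for one record; the first is truncated and gets rejected.
rollouts = ['{"name": "a"', '{"name": "a", "tags": []}', '{"name": "b"}']
print(first_valid(rollouts, "json"))
print(round(expected_validity(0.80, 3), 3))
```

Under the independence assumption, an 80% per-rollout validity rate gives about 99% with three rollouts, which is consistent with the ">95%" figure; rollouts from the same prompt are likely correlated, so real gains can be smaller. Extending the check to another format amounts to registering one more parser in the dictionary.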

Contributor:
Your demo script only does JSON, right? Maybe add a brief note here on what is needed to extend this to TOML/XML, etc.

Contributor (author):
Good point, made a note


The stakes are high. When an LLM serves as a backend for tool-calling agents, a single malformed JSON response doesn't just produce a bad answer; it crashes the entire agentic pipeline. The function call fails, the agent can't recover, and the user sees an error. OpenAI, Anthropic, and Google have all invested heavily in structured output guarantees for exactly this reason.

When we measured our baseline model, roughly 1 in 5 structured outputs was malformed. For an API serving thousands of requests, that's hundreds of failures per hour. Our goal was to reduce this as much as possible through targeted synthetic data.

Contributor:
On JSONSchemaBench and 35% on StructEval-Text, right?

Contributor (author):
Clarified

mvansegbroeck previously approved these changes Feb 19, 2026

mvansegbroeck left a comment:
Great blog post. A few minor changes, but approving already.

greptile-apps Bot left a comment:
3 files reviewed, 1 comment

Comment thread docs/devnotes/posts/structured-outputs.md Outdated

config.with_seed_dataset(
    dd.DataFrameSeedSource(df=seed_df),
    sampling_strategy=SamplingStrategy.SHUFFLE,

nabinchha, Feb 23, 2026:
nit: this can just be dd.SamplingStrategy.SHUFFLE? Then you wouldn't need to explicitly import SamplingStrategy up top.

Contributor (author):
Changed, thanks


**Key Resources:**

- **Dataset (download):** [nvidia/Nemotron-RL-instruction_following-structured_outputs](https://huggingface.co/datasets/nvidia/Nemotron-RL-instruction_following-structured_outputs) (CC BY 4.0)

Contributor:
This dataset doesn't yet have a datadesigner tag. Can it be added?

Contributor:
Nor a link in references

Contributor (author):
Good point, will reach out to add these

nabinchha previously approved these changes Feb 23, 2026
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

nabinchha left a comment:
🚀

dhruvnathawani merged commit f07624b into main on Feb 25, 2026
47 checks passed

3 participants