
docs: consolidated seed reader documentation #481

Merged
eric-tramel merged 1 commit into main from worktree-docs+consolidated-seed-reader-docs
Mar 31, 2026


@eric-tramel

Contributor

Summary

  • Add DirectorySeedSource, FileContentsSeedSource, and AgentRolloutSeedSource sections to the seed datasets concept page with code examples, exposed column lists, and cross-links
  • Add FileSystemSeedReader plugin authoring guide (docs/plugins/filesystem_seed_reader.md) covering the manifest/hydration contract, inline reader pattern, selection semantics, and packaging guidance
  • Add Markdown Section Seed Reader recipe (runnable single-file 1:N filesystem reader example)
  • Update plugin overview and example docs to reference FileSystemSeedReader and the new guide
  • Fix designer.preview() → data_designer.preview() bug in the complete example

Supersedes #425 and #452.

… rollout sources

Add comprehensive documentation for DirectorySeedSource, FileContentsSeedSource,
and AgentRolloutSeedSource to the seed datasets concept page. Add FileSystemSeedReader
plugin authoring guide and Markdown section seed reader recipe. Supersedes #425 and #452.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@eric-tramel eric-tramel requested a review from a team as a code owner March 31, 2026 15:12
@eric-tramel eric-tramel changed the title from "docs: consolidated seed reader documentation for filesystem and agent rollout sources" to "docs: consolidated seed reader documentation" Mar 31, 2026
@greptile-apps

greptile-apps Bot commented Mar 31, 2026

Contributor

Greptile Summary

This is a documentation-only PR that consolidates seed reader documentation for filesystem and agent rollout sources, adds a new FileSystemSeedReader plugin authoring guide, and introduces a runnable Markdown Section Seed Reader recipe. It also fixes a real bug where designer.preview() was incorrectly used in the Complete Example instead of data_designer.preview().

Key changes:

  • docs/concepts/seed-datasets.md: Adds DirectorySeedSource, FileContentsSeedSource, and AgentRolloutSeedSource sections with column listings, code examples, and cross-links. Fixes the designer.preview() → data_designer.preview() bug in the Complete Example section.
  • docs/plugins/filesystem_seed_reader.md (new): Step-by-step guide covering the build_manifest/hydrate_row contract, inline reader pattern (no packaging required), manifest-based selection semantics, and the optional packaging path. Accurately reflects the engine implementation.
  • docs/assets/recipes/plugin_development/markdown_seed_reader.py (new): Self-contained, runnable uv run recipe that fans each Markdown file into one row per ATX heading section. Logic, output_columns schema, and manifest-based IndexRange selection all check out correctly against the framework source.
  • docs/recipes/plugin_development/markdown_seed_reader.md (new): Landing page for the recipe with correct relative paths, --8<-- snippet include, and download link.
  • docs/plugins/overview.md, docs/plugins/example.md, docs/recipes/cards.md, mkdocs.yml: Navigation and cross-link updates to surface the new content.
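
The ATX-heading fanout described for the recipe can be illustrated with a minimal, standalone sketch. This is not the recipe's code: the function name split_markdown_sections and the record keys (level, heading, content) are hypothetical, and the sketch deliberately ignores edge cases such as '#' lines inside fenced code blocks or closing '#' sequences.

```python
from __future__ import annotations

import re

# ATX heading: 1-6 '#' characters, then whitespace, then the heading text.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)\s*$")


def split_markdown_sections(text: str) -> list[dict]:
    """Split Markdown text into one record per ATX heading section.

    Hypothetical helper for illustration; lines that appear before the
    first heading are dropped in this sketch.
    """
    sections: list[dict] = []
    current: dict | None = None
    for line in text.splitlines():
        match = ATX_HEADING.match(line)
        if match:
            # A new heading closes the section that was being collected.
            if current is not None:
                sections.append(current)
            current = {
                "level": len(match.group(1)),
                "heading": match.group(2),
                "lines": [],
            }
        elif current is not None:
            current["lines"].append(line)
    if current is not None:
        sections.append(current)
    return [
        {
            "level": s["level"],
            "heading": s["heading"],
            "content": "\n".join(s["lines"]).strip(),
        }
        for s in sections
    ]
```

One file with N headings therefore yields N records, which is the 1:N shape the recipe is described as producing.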

Confidence Score: 5/5

Safe to merge — documentation-only changes with a correct bug fix; all links, snippet paths, and Python logic verified against the engine source.

All changed files are documentation. Every cross-link and relative path resolves correctly against the mkdocs base paths. The new Python recipe was verified against the FileSystemSeedReader engine source: output_columns assignment is correct (base class declares ClassVar), hydrate_row return type matches the framework signature, and output_columns in the recipe matches the emitted record schema exactly. The only code change is the designer.preview() → data_designer.preview() bug fix, which is clearly correct given the variable name on line 304. No P0 or P1 findings.
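
As a rough illustration of the two checks mentioned above (a subclass overriding a ClassVar-declared output_columns, and emitted records matching that schema), here is a standalone sketch. BaseReader, SectionReader, validate_records, the column names, and the shape of manifest_row are all hypothetical stand-ins; the framework's actual classes and signatures are not reproduced here.

```python
from typing import Any, ClassVar


class BaseReader:
    """Hypothetical stand-in for a reader base class (not the library's class)."""

    # Declared as a ClassVar, so subclasses override it with a plain assignment.
    output_columns: ClassVar[list[str]] = []

    def hydrate_row(self, manifest_row: dict[str, Any]) -> list[dict[str, Any]]:
        raise NotImplementedError


class SectionReader(BaseReader):
    # Assumed column names, chosen only for this illustration.
    output_columns = ["relative_path", "section_index", "heading", "content"]

    def hydrate_row(self, manifest_row: dict[str, Any]) -> list[dict[str, Any]]:
        # Fan one manifest row out into one record per (heading, content) pair.
        return [
            {
                "relative_path": manifest_row["relative_path"],
                "section_index": index,
                "heading": heading,
                "content": content,
            }
            for index, (heading, content) in enumerate(manifest_row["sections"])
        ]


def validate_records(reader: BaseReader, records: list[dict[str, Any]]) -> None:
    """Raise ValueError if any record's keys differ from output_columns."""
    expected = set(reader.output_columns)
    for record in records:
        if set(record) != expected:
            raise ValueError(
                f"columns {sorted(record)} do not match {sorted(expected)}"
            )
```

The subclass assignment replaces the attribute on SectionReader only, so the base class default stays untouched.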

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/concepts/seed-datasets.md | Adds DirectorySeedSource, FileContentsSeedSource, and AgentRolloutSeedSource sections with column listings and code examples; also fixes the designer.preview() → data_designer.preview() bug in the Complete Example. |
| docs/plugins/filesystem_seed_reader.md | New guide covering the FileSystemSeedReader plugin contract (build_manifest/hydrate_row), inline reader pattern, manifest-based selection semantics, and packaging step; accurate against the engine implementation. |
| docs/assets/recipes/plugin_development/markdown_seed_reader.py | New self-contained runnable recipe demonstrating a 1:N FileSystemSeedReader with ATX-heading-based Markdown section fanout; logic is correct and output_columns match the hydrated record schema exactly. |
| docs/recipes/plugin_development/markdown_seed_reader.md | New recipe landing page with correct relative paths for the download button, the --8<-- snippet include, and the cross-link back to the plugin guide. |
| mkdocs.yml | Adds the Plugin Development recipe section and the FileSystemSeedReader Plugins page to the nav; paths are correct and ordered consistently. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User calls DataDesigner.preview / create] --> B{Seed source type?}
    B -->|LocalFile / HuggingFace / DataFrame| C[Built-in SeedReader]
    B -->|DirectorySeedSource / FileContentsSeedSource| D[FileSystemSeedReader]
    B -->|AgentRolloutSeedSource| E[AgentRolloutSeedReader]
    B -->|DirectorySeedSource + custom reader| F[Custom FileSystemSeedReader\ne.g. MarkdownSectionDirectorySeedReader]
    D --> G[build_manifest\nenumerate matching files]
    F --> G
    G --> H[IndexRange / PartitionBlock / shuffle\noperates on manifest rows]
    H --> I[hydrate_row per manifest row\n1:1 or 1:N fanout]
    I --> J[output_columns schema validation]
    J --> K[DuckDB registration\n→ seed columns available in Jinja2 templates]
    E --> K
    C --> K
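
The ordering in the flowchart (manifest first, selection on manifest rows, then hydration with possible fanout) can be sketched in plain Python. Everything here is a hypothetical stand-in: build_manifest, select_slice, and hydrate_all are illustrative functions rather than the framework's methods, and select_slice uses Python's half-open slicing, which is an assumption and not a statement about IndexRange bounds.

```python
from __future__ import annotations

from typing import Callable, Iterable


def build_manifest(paths: Iterable[str]) -> list[dict]:
    """One manifest row per matching file, in a stable order."""
    return [{"relative_path": path} for path in sorted(paths)]


def select_slice(manifest: list[dict], start: int, stop: int) -> list[dict]:
    """Select manifest rows [start, stop); selection happens before hydration."""
    return manifest[start:stop]


def hydrate_all(
    manifest: list[dict],
    hydrate_row: Callable[[dict], list[dict]],
) -> list[dict]:
    """Hydrate each selected manifest row; each may yield one or many records."""
    records: list[dict] = []
    for manifest_row in manifest:
        records.extend(hydrate_row(manifest_row))
    return records


# Toy data: file name -> headings found in that file.
files = {"a.md": ["Intro", "Usage"], "b.md": ["Overview"], "c.md": ["X", "Y", "Z"]}

manifest = build_manifest(files)
selected = select_slice(manifest, 0, 2)  # two manifest rows: a.md and b.md
rows = hydrate_all(
    selected,
    lambda m: [
        {"relative_path": m["relative_path"], "heading": heading}
        for heading in files[m["relative_path"]]
    ],
)
```

Because selection is applied to manifest rows, picking two files here produces three output rows: the row count after hydration depends on the fanout, not on the size of the selection.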

Reviews (1): Last reviewed commit: "docs: consolidated seed reader documenta..."

@eric-tramel eric-tramel self-assigned this Mar 31, 2026
@eric-tramel eric-tramel added the documentation Improvements or additions to documentation label Mar 31, 2026