
docs: prepare Fern docs workflow #622

Merged
andreatgretel merged 28 commits into main from andreatgretel/chore/fern-generated-artifacts
May 12, 2026

@andreatgretel

@andreatgretel andreatgretel commented May 8, 2026

Contributor

Summary

This PR prepares DataDesigner's Fern docs workflow for local validation, preview, and publishing while keeping the existing MkDocs path active. It also removes generated Fern artifacts from git, adds version/dev-note handling, and hardens preview/publish automation before the eventual docs cutover.

Related Issue

N/A

Changes

Added

Changed

  • Generated Fern notebook artifacts and API reference output are gitignored and regenerated locally/CI from source.
  • Notebook conversion now prefers executed notebooks in docs/notebooks/ when present and falls back to docs/notebook_source/*.py without outputs.
  • Fern preview URL is parsed from Fern CLI output instead of hard-coded.
  • Hosted preview deploys remain gated to same-repo PRs; fork PRs still build/check without receiving secrets.
  • Dev Notes are modeled as rolling on latest, with release YAMLs including only posts available at that release point.
  • Older Fern versions route now uses /nemo/datadesigner/older-versions, with the previous doubled route redirected.
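
The notebook fallback rule in the list above can be sketched as a small selection function. This is only an illustration, not the code in the conversion script: the function name `pick_notebook_input`, the `.ipynb` extension for executed notebooks, and the matching of files by stem are all assumptions.

```python
from pathlib import Path


def pick_notebook_input(stem: str, executed_dir: Path, source_dir: Path) -> Path:
    """Return the executed notebook for `stem` if present, else the .py source.

    Hypothetical helper: assumes executed notebooks are stored as
    <executed_dir>/<stem>.ipynb and sources as <source_dir>/<stem>.py.
    """
    executed = executed_dir / f"{stem}.ipynb"
    if executed.is_file():
        return executed  # executed notebook, carries cell outputs
    source = source_dir / f"{stem}.py"
    if source.is_file():
        return source  # source file, converted without outputs
    raise FileNotFoundError(f"no notebook input found for {stem!r}")
```

Under these assumptions, calling it with `Path("docs/notebooks")` and `Path("docs/notebook_source")` would decide per file, which matches the "uses executed notebooks per file when present" behavior described for `make generate-fern-notebooks`.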

Fixed

  • Fern publish now checks that the release version exists before deploying.
  • Fern publish release tag selection handles release events, manual input, and latest-release fallback.
  • Stale preview runs are canceled.
  • Fern preview failure no longer blocks MkDocs preview deploy/comment updates.
  • Generated MDX quoting for MCP provider docstrings.
  • Dev Notes author metadata rendering.
  • Recipe cards and VLM recipe pages missing from the Fern port.
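
The release tag selection order mentioned in the list above (release event, then manual input, then latest release) might look roughly like the following. The workflow itself implements this in CI; the function and parameter names here are hypothetical and only illustrate the precedence.

```python
from typing import Optional


def select_release_tag(
    event_name: str,
    release_event_tag: Optional[str] = None,
    manual_input: Optional[str] = None,
    latest_release_tag: Optional[str] = None,
) -> str:
    """Pick the tag to publish, in the order described in the PR summary."""
    # 1. A release event carries its own tag.
    if event_name == "release" and release_event_tag:
        return release_event_tag
    # 2. A manual run may pass an explicit tag.
    if manual_input and manual_input.strip():
        return manual_input.strip()
    # 3. Otherwise fall back to the latest published release.
    if latest_release_tag:
        return latest_release_tag
    raise ValueError("no release tag available from event, input, or latest release")
```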

Fern Make Targets

The intended user-facing surface is small. Full command descriptions also live in fern/README.md and make help.

Primary local commands:

Command What it does
make serve-fern-docs-locally Install docs dependencies, generate Fern artifacts, then serve Fern docs locally with fern docs dev. Does not publish.
make check-fern-docs-locally Install docs dependencies, generate Fern artifacts, then run fern check.
make generate-fern-notebooks-with-outputs Execute notebooks, refresh Colab notebooks, then convert executed notebooks to Fern format. Requires provider API keys for notebooks that call models.
make prepare-fern-release VERSION=X.Y.Z Prepare Fern release metadata for a new version train.
make check-fern-release-version VERSION=X.Y.Z Verify fern/docs.yml and fern/versions/vX.Y.Z.yml are prepared before release/manual Fern publish.

Support and CI targets:

Command What it does
make install-docs-deps Install docs and notebook dependency groups with DOCS_PYTHON_VERSION (default 3.13).
make generate-fern-api-reference Generate fern/code-reference/data-designer/ with py2fern from packages/data-designer-config/src/data_designer/config. No Fern login/token required.
make generate-fern-api-reference-native Generate API reference with Fern CLI via fern docs md generate. Requires Fern auth.
make generate-fern-notebooks Convert docs/notebook_source/*.py into gitignored fern/components/notebooks/*.{json,ts}. Uses executed notebooks per file when present.
make prepare-fern-docs Generate all local Fern artifacts.
make check-fern-docs Run prepare-fern-docs, then fern check. CI-oriented; assumes dependencies are already installed.

Attention Areas

Reviewers: Please pay special attention to the following:

  • docs-preview-deploy.yml - new workflow_run deploy workflow. GitHub only fully activates new workflow_run workflows once the file exists on the default branch, so the actual handoff can only be tested after merge.
  • docs-preview.yml - PR build/check workflow now uploads the prepared site/, fern/, and metadata artifact without secrets.
  • Makefile - local Fern targets use py2fern==0.1.6 so contributors can generate API reference locally without Fern login.
  • MkDocs should remain published in parallel for the next few releases while Fern output, CI, release publishing, and contributor workflows settle.

Testing

  • .venv/bin/ruff check --fix .
  • .venv/bin/ruff format .
  • git diff --check
  • workflow YAML parse
  • .venv/bin/mkdocs build
  • make check-fern-docs
  • make serve-fern-docs-locally with browser spot-checks for versions, Dev Notes, notebooks, recipes, API reference, and older-version routing
  • npx -y fern-api@4.106.0 generate --docs
  • PR CI is passing
  • Docs preview build workflow is passing
  • workflow_run deploy handoff tested after merge
  • Unit tests added/updated - N/A, docs workflow/content changes
  • E2E tests added/updated - N/A, docs workflow/content changes

Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO) - DCO Assistant is passing, but branch commits do not include Signed-off-by trailers
  • Architecture docs updated - N/A; docs workflow notes were added instead

Description updated with AI

@github-actions

github-actions Bot commented May 8, 2026

Contributor

MkDocs preview: https://4ee2ac0b.dd-docs-preview.pages.dev

Fern preview: https://nvidia-preview-pr-622.docs.buildwithfern.com/nemo/datadesigner

Notebook tutorials are rendered without execution outputs in previews.

@andreatgretel andreatgretel marked this pull request as ready for review May 8, 2026 20:08
@andreatgretel andreatgretel requested a review from a team as a code owner May 8, 2026 20:08
@github-actions

github-actions Bot commented May 8, 2026

Contributor

PR #622 Review — docs: prepare Fern docs workflow

Summary

This PR lands the scaffolding needed to run the Fern docs stack alongside MkDocs for the next few releases: a Fern CI/publish workflow, a Dev-Notes-only publish workflow, Fern preview publishing in the existing docs-preview workflow, local-build Make targets (prepare-fern-docs, check-fern-docs, serve-fern-docs-locally), and the content updates (v0.5.9 version entry, VLM long-doc recipes + dev note, older-versions redirect page). Generated Fern notebook artifacts move from committed snapshots to gitignored outputs regenerated locally/in CI. The only non-docs production code change is a two-line cosmetic tweak to packages/data-designer-config/src/data_designer/config/mcp.py (docstring quotes) and a py2fern==0.1.6 pin in pyproject.toml. Scope is well-contained and matches the PR description.

55 files changed; ~7k additions are dominated by new MDX recipe pages and regenerated notebook JSON/TS. Non-docs code is small.

Findings

Workflows

  • .github/workflows/build-fern-docs.yml (new): Clean structure. The reusable build-notebooks.yml produces the notebooks artifact; the publish job downloads it and runs make check-fern-docs → fern generate --docs. Top-level permissions: {} plus narrow per-job perms is correct. The explicit FERN_TOKEN empty-guard before npx fern-api is good practice. Version is resolved from fern.config.json with jq so the CLI stays pinned, which is nice.
  • .github/workflows/publish-fern-devnotes.yml (new) — Reuses notebook artifacts from the last successful build-fern-docs.yml or build-docs.yml run. Two things to verify:
    1. Fallback target build-docs.yml — make sure a workflow by that exact filename exists in main. If not, the fallback silently contributes nothing and the step errors only after exhausting both, which is still safe but the loop would be redundant. Consider simplifying to just build-fern-docs.yml, or confirm build-docs.yml is a real workflow on main.
    2. gh run download ... --name notebooks exits non-zero when the artifact is missing/expired (artifacts default to 90 days). A freshly-cut release window that relies on a 90+ day old build would fail. Not blocking, but worth flagging so someone can trigger build-fern-docs.yml manually if this ever happens.
  • .github/workflows/docs-preview.yml — Preview URL is hand-constructed: https://nvidia-preview-pr-${PR}.docs.buildwithfern.com/nemo/datadesigner. If Fern ever changes its preview URL scheme this silently goes stale. Not a blocker since you can confirm from a real preview run, but a future-proofing option would be to capture the URL from fern generate --docs --preview stdout. The same-repo gate (github.event.pull_request.head.repo.full_name == github.repository) is applied consistently to Fern deploy, Cloudflare deploy, and both comment steps — good.
  • Concurrency group docs-preview-${{ github.event.pull_request.number }} with cancel-in-progress: true is the right pattern for PR-push churn.
  • build-notebooks.yml adds an explicit uv sync --all-packages --group notebooks --group docs step; matches the Makefile's shift to .venv/bin/* invocations. Consistent.
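
The suggestion above to capture the preview URL from CLI output instead of constructing it by hand could be sketched as follows. The exact text the Fern CLI prints is not shown in this thread, so the sample output line and the `extract_preview_url` helper are assumptions; only the `docs.buildwithfern.com` host suffix comes from the URL quoted above.

```python
import re
from typing import Optional

# Matches the first https URL on a *.docs.buildwithfern.com host.
PREVIEW_URL_RE = re.compile(
    r"https://[A-Za-z0-9.-]+\.docs\.buildwithfern\.com[^\s\"'<>]*"
)


def extract_preview_url(cli_output: str) -> Optional[str]:
    """Return the first preview URL found in captured CLI output, if any."""
    match = PREVIEW_URL_RE.search(cli_output)
    if match is None:
        return None
    # Drop trailing punctuation that may follow the URL in a log sentence.
    return match.group(0).rstrip(".,)")
```

Returning `None` rather than raising lets a caller decide whether a missing URL should fail the step or only skip the Fern line in the preview comment.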

Makefile

  • Replacing uv run --python $(DOCS_PYTHON) with .venv/bin/<tool> paths is a meaningful behavioral shift: the docs targets now require make install-dev-notebooks (or equivalent uv sync --group docs --group notebooks) to have been run first. The build_notebooks_cached.sh precondition check ([ ! -x "$DOCS_JUPYTEXT" ]) catches this for the caching path, but generate-fern-notebooks, generate-fern-api-reference, convert-execute-notebooks (non-cache branch), and generate-colab-notebooks do not have an equivalent guard — they'll fail with a generic "No such file" if the venv is missing or stale. Minor: consider adding a one-line check or at least documenting the precondition in the help text.
  • convert-execute-notebooks (non-cache branch) now exits with status 1 on any notebook failure, a change from the previous "warn and continue" behavior. The SKILL.md and CONTRIBUTING.md note this, but contributors missing an API key for a single image notebook will now see the whole target fail. Intended behavior per the changelog, but worth flagging for anyone reading the diff looking for regressions.
  • rm -rf docs/notebooks inside convert-execute-notebooks and inside build_notebooks_cached.sh are both destructive-by-design — reasonable for deterministic outputs, but if a contributor was hand-editing a notebook under docs/notebooks/ that work will be blown away. Not a real concern given the source-of-truth is docs/notebook_source/*.py.
  • FERN ?= npx -y fern-api@$(FERN_VERSION) with FERN_VERSION resolved via jq from fern/fern.config.json — nicely keeps CI and local in sync.

Notebook pipeline

  • fern/scripts/ipynb-to-fern-json.py: The is_colab_injected_cell generalization is good. The primary check is the nemo_colab_inject metadata flag set by generate_colab_notebooks.py, with regex heuristics as a fallback for notebooks that weren't tagged (e.g., older Colab snapshots). The regex patterns are narrow enough they shouldn't false-positive on legitimate cells. Handling source=None in _join_source is a reasonable defensive change.
  • docs/scripts/generate_colab_notebooks.py: The mark_colab_injected helper is clean and localized. Good.
  • The one purely-cosmetic reformat of _MD assignment and the collapsed f-string in write_ts_export are noise in the diff but harmless.

Fern content & routing

  • fern/docs.yml — The ~34 version redirects to nvidia-nemo.github.io/DataDesigner/<ver>/ all use permanent: false (302), which is correct during migration. The :path* parameter pass-through is idiomatic. The self-redirect /older-versions/older-versions → /older-versions defends against Fern's slug doubling when a section's skip-slug: true doesn't fully suppress the nav slug — sensible guard.
  • fern/versions/latest.yml and v0.5.9.yml — Both reuse ./v0.5.8/pages/ for almost everything, with ./v0.5.9/pages/ only for recipe cards and VLM content. This matches the SKILL.md's "reuse shared pages, diverge only when needed" story. Watch for drift: any future edit to a shared page will land in both latest and v0.5.9, which is the intended behavior but worth calling out in contributor docs (SKILL.md does address this).
  • Older Dev Notes nested section in both latest.yml and v0.5.8.yml — matches the description goal of trimming the sidebar on latest while keeping discoverability via the Dev Notes index.
  • VLM recipe MDX pages — 8 new recipe pages + an overview card page. I did not line-by-line check each (4.7k lines of MDX content) but spot checks show consistent frontmatter, no leftover position:/authors: keys that would trip Fern's runtime, and cross-links use absolute paths as required.

Non-docs production code

  • packages/data-designer-config/src/data_designer/config/mcp.py: The only change is Literal["sse", "streamable_http"] → Literal['sse', 'streamable_http'] (and the stdio equivalent) in the docstrings. The stated reason is MDX-quoting issues from py2fern. This is benign at runtime (the typed annotation on the class field is unchanged) and scoped to two docstrings. One small consistency nit: grep the rest of data_designer.config for other Literal["..."] docstring patterns and either normalize all of them or leave a short # py2fern: single-quote Literal args comment so a future edit doesn't silently revert this. Non-blocking.

pyproject / dependencies

  • py2fern==0.1.6 pinned exactly in the docs group. Pin is appropriate for a known-temporary dependency, and the PR description commits to revisiting this once Fern ships a login-free local build. Lock updates in uv.lock are consistent.

Risks

  1. Fern preview URL format is hard-coded; relies on Fern's buildwithfern.com preview scheme staying stable. Low risk during migration, worth a TODO to capture from CLI output later.
  2. Devnotes publish workflow depends on reusable-workflow artifacts being available within the 90-day retention window. Low risk under normal release cadence.
  3. Makefile targets require make install-dev-notebooks; a contributor running make check-fern-docs on a clean clone without the notebooks group installed will hit opaque failures. Consider a minimal precondition guard or doc update.
  4. No new tests — correct, since changes are docs workflow/content only.
  5. No structural impact analysis file found at /tmp/structural-impact-622.md; section omitted.

Verdict

Approve (with optional follow-ups). This is a well-scoped docs-infrastructure PR. Workflow permissions are minimal, fork secrets are properly gated, the local-build Make targets are consistent with what CI runs, and the py2fern dependency is explicitly acknowledged as a temporary pin. The one non-docs production change (mcp.py docstring quotes) is cosmetic and limited. The optional follow-ups above are all quality-of-life improvements, not blockers.

@greptile-apps

greptile-apps Bot commented May 8, 2026

Contributor

Greptile Summary

This PR sets up a parallel Fern docs publishing workflow alongside the existing MkDocs path, including preview, release, and rolling Dev Notes automation. It also removes generated Fern artifacts from git tracking, adds version guard scripts, and ports VLM long-document content into the Fern structure.

  • New CI workflows: build-fern-docs.yml handles release/manual Fern publish with version validation; publish-fern-devnotes.yml republishes only when Dev Notes content changes; docs-preview.yml is hardened so Fern check/preview failures do not block MkDocs preview or PR comments.
  • fern-release-version.py: A new script automates preparing and validating Fern version entries (prepare/check subcommands), used by both local make targets and CI pre-publish checks.
  • Notebook pipeline: generate-fern-notebooks now prefers executed notebooks in docs/notebooks/ and falls back to source .py files; generated .json/.ts artifacts are gitignored and regenerated in CI.

Confidence Score: 5/5

This PR is safe to merge; it adds new CI workflows and docs tooling without touching application logic.

All changes are scoped to docs workflows, Fern configuration, notebook conversion scripts, and a single docstring fix in mcp.py. The previous review findings have all been addressed. The new fern-release-version.py script's regex patterns align correctly with the actual docs.yml format, the bash exit-0-in-function pattern in publish-fern-devnotes.yml is correct, and the workflow permission model is sound.

No files require special attention. The docs-preview-deploy.yml workflow_run handoff is noted as untestable until after merge, which is a GitHub platform limitation acknowledged in the PR description.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/build-fern-docs.yml | New workflow for release and manual Fern publish; correctly gates on version validation and downloaded notebooks, with Fern version pinned from fern.config.json. |
| .github/workflows/docs-preview.yml | Significantly reworked preview workflow; Fern check and deploy both use continue-on-error, and a terminal step re-raises the Fern check failure so broken docs are still visible while MkDocs preview always posts. |
| .github/workflows/publish-fern-devnotes.yml | Rolling Dev Notes workflow; reuses notebooks from the last successful docs run via a bash function that calls exit 0 on first successful download, a correct bash idiom for early exit within a loop. |
| fern/scripts/fern-release-version.py | Well-structured prepare/check script with correct regex patterns aligned to the actual docs.yml YAML format; version slug normalization handles refs/tags/ prefix and bare version strings. |
| fern/scripts/ipynb-to-fern-json.py | Adds GitHub Light syntax-highlight style with CSS variable shiki-light/shiki-dark mapping; Colab cell detection broadened from badge-only to metadata flag + content patterns. |
| Makefile | Adds Fern local targets (prepare-fern-docs, check-fern-docs, serve-fern-docs-locally, generate-fern-api-reference, prepare-fern-release); convert-execute-notebooks switches from soft-fail to hard-fail on notebook errors. |
| packages/data-designer-config/src/data_designer/config/mcp.py | Docstring-only change: switches double-quoted Literal strings to single-quoted to fix MDX rendering of the generated API reference. |
| .gitignore | Correctly ignores generated Fern notebook artifacts (*.json, *.ts) under fern/components/notebooks/ that are now regenerated in CI. |
| docs/scripts/generate_colab_notebooks.py | Adds nemo_colab_inject metadata flag to all Colab-injected cells so ipynb-to-fern-json.py can strip them by metadata rather than by content heuristics alone. |
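
The version slug normalization noted for fern-release-version.py (handling a refs/tags/ prefix and bare version strings) could be approximated like this. The helper name and the choice of a `vX.Y.Z` slug are assumptions, inferred only from the `fern/versions/vX.Y.Z.yml` naming used in the Make target descriptions.

```python
def normalize_version_slug(ref: str) -> str:
    """Map 'refs/tags/v0.5.9', 'v0.5.9', or '0.5.9' to the slug 'v0.5.9'.

    Hypothetical helper; the actual script may normalize differently.
    """
    tag = ref.strip()
    prefix = "refs/tags/"
    if tag.startswith(prefix):
        tag = tag[len(prefix):]  # drop the Git ref prefix
    return tag if tag.startswith("v") else f"v{tag}"
```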

Reviews (16): Last reviewed commit: "ci: align fern release docs behavior"

Comment thread .github/workflows/docs-preview.yml Outdated
Comment thread .github/workflows/docs-preview-deploy.yml Outdated
@andreatgretel

Contributor Author

Created follow-up issue #637 to reintroduce the Curator-style split for docs preview build/check vs hosted preview deploy after this PR lands: #637

For this PR, we kept deploy inline so the hosted preview can update on the current branch.

Comment thread .github/workflows/docs-preview.yml
@johnnygreco

Contributor

Thanks for putting this together, @andreatgretel.

Summary

This PR moves Fern docs toward a generated-artifact workflow, adds local/CI Make targets, wires Fern preview/publish automation, and keeps MkDocs previews active during the migration. The implementation matches the stated intent overall, but I found two docs-CI edge cases worth fixing before merge.

Findings

Warnings — Worth addressing

.github/workflows/publish-fern-devnotes.yml:50 — Dev-note publishing can miss the latest notebook artifacts

  • What: The dev-notes workflow looks for prior notebook artifacts with gh run list --workflow "$workflow" --status success --branch main .... The successful build-docs.yml release runs in this repo have tag refs like v0.5.9 as their headBranch, not main, so this filter skips the primary release artifacts and can fall back to an older manual main run or fail once that artifact expires.
  • Why: A dev-note-only Fern publish depends on these downloaded notebooks before make check-fern-docs. If the lookup selects stale artifacts, previews/publishes can use old tutorial output; if no matching manual main run is retained, the workflow fails even though recent release artifacts exist.
  • Suggestion: Query successful release runs first without the main branch filter, then optionally fall back to a manual main run. For example, use gh api/gh run list with an event=release filter for build-fern-docs.yml and build-docs.yml, and only use --branch main for the workflow-dispatch fallback.
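
The lookup order in the suggestion above (release runs first, then a manual run on main) can be sketched as a pure selection step over run records. The field names follow the JSON fields that `gh run list --json` exposes (`conclusion`, `event`, `headBranch`, `createdAt`, `databaseId`); the function itself is hypothetical and leaves out the actual `gh` invocation and artifact download.

```python
from typing import Any, Dict, List, Optional


def pick_artifact_run(runs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Choose the run whose notebook artifacts should be reused.

    Prefers the newest successful release run regardless of headBranch,
    then falls back to the newest successful workflow_dispatch run on main.
    """
    successful = [run for run in runs if run.get("conclusion") == "success"]
    # ISO 8601 UTC timestamps sort correctly as plain strings.
    newest_first = sorted(successful, key=lambda run: run["createdAt"], reverse=True)
    for run in newest_first:
        if run.get("event") == "release":
            return run
    for run in newest_first:
        if run.get("event") == "workflow_dispatch" and run.get("headBranch") == "main":
            return run
    return None
```

With this ordering, a release run whose headBranch is a tag such as v0.5.9 is selected even when a manual run on main also exists, which is the case the warning describes.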

fern/scripts/fern-release-version.py:15 — Release guard rejects existing rc tag format

  • What: VERSION_RE accepts X.Y.Z, vX.Y.Z, and separator-based suffixes like 0.6.0-rc1, but the repo’s release tags use PEP 440-style rc suffixes such as v0.5.8rc1 and v0.6.0rc5. Running the new guard with v0.6.0rc1 currently fails as an invalid version before it can even check Fern metadata.
  • Why: build-fern-docs.yml runs this guard on release events and manual release_tag input. If prerelease GitHub releases should publish or validate docs, the workflow breaks for the project’s existing tag convention; if they should not publish, the workflow should explicitly skip prereleases instead of failing with an invalid-version error.
  • Suggestion: Either extend the regex to accept the repo’s rc tag format, e.g. \d+\.\d+\.\d+(?:(?:a|b|rc)\d+|[-.][0-9A-Za-z]+)*, or add an explicit prerelease skip in the workflow via github.event.release.prerelease.
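
As a quick check of the regex proposed above, here is a minimal sketch that wraps it with an optional leading `v` and full-string anchoring. The original `VERSION_RE` in fern-release-version.py is not reproduced in this thread, so this is an illustration of the suggested pattern, not the script's code.

```python
import re

# Suggested pattern from the review, with an optional leading "v".
VERSION_RE = re.compile(
    r"v?\d+\.\d+\.\d+(?:(?:a|b|rc)\d+|[-.][0-9A-Za-z]+)*"
)


def is_valid_version(tag: str) -> bool:
    """True if the whole tag matches the extended version pattern."""
    return VERSION_RE.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("v0.5.9", "0.6.0-rc1", "v0.6.0rc5", "v0.5", "v0.6.0rc"):
        print(tag, is_valid_version(tag))
```

The first three tags are accepted under this pattern and the last two are rejected, so PEP 440-style tags such as v0.6.0rc5 would pass validation while a truncated suffix like `rc` without a number would not.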

What Looks Good

The same-repo gating in docs-preview.yml keeps deployment secrets away from fork PRs while still running build/check validation.

The preview workflow now isolates Fern check/preview failures well enough to still post MkDocs preview status, which addresses the earlier review thread without turning Fern instability into a silent missing-comment failure.

The generated Fern artifacts are treated consistently as build outputs: gitignored, regenerated through Make targets, and documented in fern/README.md.

Verdict

Needs changes: the release-version guard and dev-note artifact lookup should be adjusted so the new Fern CI paths behave reliably across release and dev-note publishing workflows.


This review was generated by an AI assistant.

@andreatgretel

Contributor Author

Thanks, this was helpful. Addressed both in c02b685.

  • publish-fern-devnotes.yml now checks successful release runs first, without the main branch filter, then falls back to manual workflow_dispatch runs on main. I verified this selects the latest release docs run for v0.5.9 instead of the older manual main run.
  • For rc tags, we kept Fern aligned with MkDocs behavior: rc tags exist, but we do not publish prerelease docs. build-fern-docs.yml now explicitly skips GitHub prerelease release events instead of failing later in the version guard.

CI is green again, including docs preview and Greptile.

@andreatgretel andreatgretel merged commit 46dc8b2 into main May 12, 2026
50 checks passed
@andreatgretel andreatgretel deleted the andreatgretel/chore/fern-generated-artifacts branch May 14, 2026 01:13