chore: regenerate only changed colab notebooks in CI and make target #413

@andreatgretel

Context

The make generate-colab-notebooks target and the check-colab-notebooks CI workflow currently regenerate every Colab notebook on each run, even when only one source file changed. As a result, PRs pick up cell-ID-only diffs in unrelated notebooks (e.g. PR #403 touched only notebook 4's source, but its diff includes 188 lines of cell-ID changes across notebooks 1-3 and 5-6).

The CI already filters cell-ID diffs to avoid false failures, but the unnecessary regeneration still creates noisy commits.

Proposal

1. CI workflow: regenerate only notebooks whose source changed

In .github/workflows/check-colab-notebooks.yml, detect which docs/notebook_source/*.py files changed and pass them to the script's existing --files flag:

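- name: Get changed notebook sources
  id: changed
  run: |
    # Needs the base commit locally; with a shallow checkout the diff fails and FILES ends up empty.
    # The pathspec is quoted so git (not the shell) matches it; --diff-filter=d skips deleted sources.
    # The trailing xargs joins the basenames onto one line, since a multi-line value would break
    # the key=value format of $GITHUB_OUTPUT.
    FILES=$(git diff --name-only --diff-filter=d ${{ github.event.pull_request.base.sha || 'HEAD~1' }} -- 'docs/notebook_source/*.py' | xargs -r -n1 basename | xargs || true)
    echo "files=$FILES" >> "$GITHUB_OUTPUT"

- name: Generate Colab notebooks
  run: |
    if [ -n "${{ steps.changed.outputs.files }}" ]; then
      make generate-colab-notebooks FILES="${{ steps.changed.outputs.files }}"
    else
      make generate-colab-notebooks
    fi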
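
The path-to-list conversion in the first step can be tried locally without git or a runner. The sketch below is only an illustration: to_file_list is a hypothetical helper name and the file names are made up. It reads newline-separated paths, as git diff --name-only would print them, and emits a single space-separated line of basenames, the shape a files=... entry in $GITHUB_OUTPUT needs.

```shell
# Hypothetical helper (not part of the repo): read newline-separated paths
# on stdin and print their basenames on one space-separated line.
to_file_list() (
  out=""
  while IFS= read -r path || [ -n "$path" ]; do
    [ -n "$path" ] || continue          # ignore blank lines
    out="${out:+$out }${path##*/}"      # strip the directory, append with a space
  done
  printf '%s\n' "$out"
)

# Made-up file names, standing in for `git diff --name-only` output.
printf '%s\n' docs/notebook_source/a.py docs/notebook_source/b.py | to_file_list
```

With no input the helper prints an empty line, so the [ -n ... ] check in the second step would take the else branch.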

2. Makefile: add a FILES parameter

generate-colab-notebooks:
	@echo "📓 Generating Colab-compatible notebooks..."
ifdef FILES
	uv run --group docs python docs/scripts/generate_colab_notebooks.py --files $(FILES)
else
	uv run --group docs python docs/scripts/generate_colab_notebooks.py
endif
	@echo "✅ Colab notebooks created in docs/colab_notebooks/"
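
One detail worth keeping in mind: GNU make's ifdef treats a variable with an empty value as undefined, so make generate-colab-notebooks FILES="" falls through to the regenerate-everything branch. The shell sketch below mirrors that conditional so both branches can be inspected without invoking uv. colab_command is a hypothetical name used only for this illustration, and the file name passed to it is made up.

```shell
# Hypothetical mirror of the Makefile conditional, for illustration only.
# Prints the command the target would run for a given FILES value.
colab_command() {
  base="uv run --group docs python docs/scripts/generate_colab_notebooks.py"
  if [ -n "$1" ]; then
    printf '%s --files %s\n' "$base" "$1"   # FILES non-empty: only these sources
  else
    printf '%s\n' "$base"                   # FILES empty or unset: all notebooks
  fi
}

colab_command "example_notebook.py"   # made-up file name
colab_command ""
```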

3. Remove the cell-ID diff filter

Once only changed notebooks are regenerated, the cell-ID filtering hack in the CI diff check becomes unnecessary and can be removed (or kept as a safety net).
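
If the filter is kept as a safety net, it could look roughly like the sketch below. This is not the existing CI filter, whose implementation is not shown in this issue. It assumes cell IDs appear in the notebook JSON as lines of the form "id": "...", and non_id_changes is a hypothetical name. The function reads a unified diff and prints only the changed lines that are not cell-ID lines, so empty output means the diff is cell-ID-only.

```shell
# Hypothetical safety-net filter (an assumption, not the repo's current check).
# Reads a unified diff on stdin and prints changed lines other than cell IDs.
non_id_changes() {
  grep -E '^[+-]' \
    | grep -Ev '^(\+\+\+ |--- )' \
    | grep -Ev '^[+-][[:space:]]*"id":[[:space:]]*"[^"]*",?[[:space:]]*$' \
    || true   # grep exits 1 when nothing is left; treat that as "no changes"
}

# Made-up diff: one cell-ID change plus one real source change.
printf '%s\n' \
  '--- a/nb.ipynb' \
  '+++ b/nb.ipynb' \
  '-   "id": "aaa111",' \
  '+   "id": "bbb222",' \
  '-    "print(1)"' \
  '+    "print(2)"' | non_id_changes
```

The header filter is deliberately simple: a removed line whose content itself starts with "-- " would be mistaken for a file header, which is acceptable for a fallback check but worth knowing.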

Benefits

  • Cleaner PR diffs (no unrelated notebook churn)
  • Faster CI (regenerate 1 notebook instead of 6)
  • Simpler diff check (no need to filter cell IDs)
