Add CI check to detect unused media files in docs #19099
Merged
harupy merged 5 commits into mlflow:master on Dec 1, 2025
Pull request overview
This PR introduces tooling to automatically detect and remove unused documentation images, successfully cleaning up approximately 100 unused image files totaling 30.60 MB. The solution includes a bash script for detecting unused images and integrates it into the CI workflow to prevent future accumulation of unused assets.
- Added a bash script (dev/remove-unused-images.sh) to identify and optionally remove unused documentation images
- Created a GitHub Actions composite action to install ripgrep as a dependency
- Integrated the unused image check into the docs CI workflow to run on every PR
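
As a rough illustration of the detection approach summarized above, the sketch below lists media files whose file name appears nowhere under a docs tree. It is not the script from this PR: the function names `name_is_referenced` and `find_unused_media`, the basename-matching heuristic, and the `grep` fallback used when ripgrep is absent are all assumptions made for illustration. The extension list is the one given in the PR description.

```shell
# Hypothetical sketch, not the script added in this PR.

# name_is_referenced NAME DIR: succeed if NAME occurs as a literal string
# in any text file under DIR. Uses ripgrep when available, else grep.
name_is_referenced() {
  if command -v rg >/dev/null 2>&1; then
    rg --fixed-strings --quiet -- "$1" "$2"
  else
    # -r recursive, -q quiet, -I skip binary files, -F literal match
    grep -rqIF -- "$1" "$2"
  fi
}

# find_unused_media DIR: print every media file under DIR whose file name
# is not mentioned anywhere under DIR (assumption: references use the
# plain file name, so matching on the basename is sufficient).
find_unused_media() {
  find "$1" -type f \( -name '*.png' -o -name '*.jpg' -o -name '*.gif' \
    -o -name '*.webp' -o -name '*.ico' -o -name '*.avif' -o -name '*.mp4' \) |
    sort |
    while IFS= read -r file; do
      if ! name_is_referenced "$(basename "$file")" "$1"; then
        printf '%s\n' "$file"
      fi
    done
}
```

Matching on the bare file name keeps the sketch short, but it can miss unused files that share a name with a referenced file in another directory, so a real check may need to match on paths instead.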
Reviewed changes
Copilot reviewed 3 out of 106 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| dev/remove-unused-images.sh | New bash script that scans for image files in docs and identifies those not referenced in the codebase, with support for both check-only and removal modes |
| .github/workflows/docs.yml | Adds CI check to detect unused images during PR validation, preventing future accumulation of unreferenced documentation assets |
| .github/actions/setup-ripgrep/action.yml | Reusable composite action to install ripgrep (text search tool) used by the unused images detection script |
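
The table mentions check-only and removal modes. One way such a split could look is sketched below: a function that reads unused file paths from standard input and either reports them with a failing status (suitable for CI) or deletes them. The function name `handle_unused` and the mode names `check` and `remove` are hypothetical; the actual script's interface is not shown in this conversation.

```shell
# Hypothetical sketch of a check-only vs. removal mode.

# handle_unused MODE: read unused file paths on stdin, one per line.
#   check  -> report each path on stderr; return 1 if any were given
#   remove -> delete each path and report it on stdout
handle_unused() {
  local mode status path
  mode="$1"
  status=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    case "$mode" in
      check)  printf 'unused: %s\n' "$path" >&2; status=1 ;;
      remove) rm -f -- "$path" && printf 'removed: %s\n' "$path" ;;
      *)      printf 'unknown mode: %s\n' "$mode" >&2; return 2 ;;
    esac
  done
  return "$status"
}
```

In a CI step the `check` mode would be the natural choice, since a non-zero status fails the job without modifying the checkout.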
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
…d mp4 files

- Add mp4 support to the unused media detection script
- Rename script from find-unused-images.sh to find-unused-media.sh
- Update pre-commit hook configuration
- Remove 18 unused mp4 files from docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
harupy
commented
Dec 1, 2025
% git ls-tree -r -l HEAD | awk '{sum += $4} END {printf "%.2f MB\n", sum/1024/1024}'
255.33 MB
This PR reduces the repo size by about 20%.
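
To make the size measurement above reusable for comparing two revisions, here is a minimal sketch that wraps the same `awk` summation in a function and adds a percentage helper. The function names `sum_blob_mb` and `percent_reduction` are hypothetical, and the comment above does not say which two revisions were compared, so the choice of refs is left to the caller.

```shell
# Hypothetical helpers built around the command shown above.

# sum_blob_mb: read `git ls-tree -r -l <ref>` output on stdin and print
# the total blob size in MB (the size is the fourth column).
sum_blob_mb() {
  awk '{sum += $4} END {printf "%.2f MB\n", sum/1024/1024}'
}

# percent_reduction BEFORE AFTER: print how much smaller AFTER is than
# BEFORE, as a percentage of BEFORE. Both arguments are plain numbers
# in the same unit.
percent_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN {printf "%.1f%%\n", (before - after) / before * 100}'
}
```

A typical use would be `git ls-tree -r -l <ref> | sum_blob_mb` for each of the two refs, then passing the two totals to `percent_reduction`.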
- Remove unused-media pre-commit hook from .pre-commit-config.yaml
- Add setup-ripgrep composite action with configurable version
- Add unused media check step to lint.yml workflow
- Update find-unused-media.sh to use system rg with helpful error message

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
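
The last item in this commit message describes relying on a system-installed `rg` and failing with a helpful message when it is missing. A guard of that kind could look like the sketch below. The function name `require_command` and the wording of the message are assumptions, not the text used in find-unused-media.sh.

```shell
# Hypothetical sketch of a dependency guard with an install hint.

# require_command NAME HINT: succeed if NAME is on PATH; otherwise print
# an error and the given hint to stderr and return 1.
require_command() {
  if ! command -v "$1" >/dev/null 2>&1; then
    printf 'error: %s is required but was not found on PATH.\n%s\n' "$1" "$2" >&2
    return 1
  fi
}
```

A script would typically call it once near the top, for example `require_command rg "Install ripgrep and re-run this script." || exit 1`, so that a missing tool is reported before any scanning starts.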
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Signed-off-by: Harutaka Kawamura <hkawamura0130@gmail.com>
What changes are proposed in this pull request?
This PR adds a CI check to detect unused media files (images and videos) in the documentation directories and removes existing unused files.
- dev/find-unused-media.sh script that detects unused images (png, jpg, gif, webp, ico, avif) and videos (mp4) in docs/
- setup-ripgrep composite action with configurable version
- lint.yml workflow to run the unused media detection

How is this PR tested?
Verified that the removed files are not referenced in the documentation.
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
How should the PR be classified in the release notes? Choose one:
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section

Should this PR be included in the next patch release?
🤖 Generated with Claude Code