CICD bug fix: ensure data/ symlinks exist before jit-cache AOT compilation#3158
CICD bug fix: ensure data/ symlinks exist before jit-cache AOT compilation#3158aleozlx merged 1 commit intoflashinfer-ai:mainfrom
Conversation
📝 WalkthroughWalkthroughThe JIT-cache compilation now runs a repository-level Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/bot run |
There was a problem hiding this comment.
Code Review
This pull request updates the JIT cache build process to ensure that necessary data directory symlinks are created. A critical issue was identified where the import of the main build backend would fail due to a name collision with the current file; a suggestion was provided to use importlib for a path-based import.
The jit-cache wheel build runs AOT compilation without installing the main flashinfer package first, so the flashinfer/data/ symlinks (including data/cccl for vendored CCCL headers) don't exist. This causes <cuda/cmath> to fall through to the CTK copy which may lack cuda::fast_mod_div on older toolkits. Call the main build_backend._create_data_dir() before importing flashinfer.aot to ensure all symlinks are in place. Made-with: Cursor
b87d66b to
aec03ab
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@flashinfer-jit-cache/build_backend.py`:
- Around line 85-87: The current import build_backend picks up the jit-cache
module from sys.modules causing AttributeError when calling _create_data_dir;
replace that import with an explicit load using
importlib.util.spec_from_file_location to load the root backend by its file path
under a unique module name (e.g., "root_build_backend"), execute the spec to
create the module object, and then call
root_module._create_data_dir(use_symlinks=True) so you avoid name collision with
the existing build_backend module in sys.modules.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 09da80e6-75c0-4c6f-8bd7-308c9fed9294
📥 Commits
Reviewing files that changed from the base of the PR and between d454492 and b87d66b87318c78e57459b42320bf9f7857e7d13.
📒 Files selected for processing (1)
flashinfer-jit-cache/build_backend.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
flashinfer-jit-cache/build_backend.py (1)
82-93: LGTM — correctly loads the root backend without module-name collision.Using
importlib.util.spec_from_file_locationhere is the right fix: a plainimport build_backendwould resolve to this same-named jit-cache module (already insys.modulesas the active PEP 517 backend) and_create_data_dirwould not be found. Placing this beforefrom flashinfer import aotalso ensuresflashinfer/data/{cccl,cutlass,spdlog,csrc,include}exist beforeflashinfer.jit.envsnapshots its path constants at import time (seeflashinfer/jit/env.py:146-158), which is exactly what's needed to prevent the CCCL fallback described in the PR.One small robustness nit:
specandspec.loaderare used without aNoneguard. If the path ever becomes wrong (e.g., layout change), you'd get a confusingAttributeErroronNone.exec_moduleinstead of a clear error. Optional tightening:Optional hardening
spec = importlib.util.spec_from_file_location( "main_build_backend", Path(__file__).parent.parent / "build_backend.py" ) + if spec is None or spec.loader is None: + raise RuntimeError( + "Unable to load root build_backend.py for flashinfer data-dir setup" + ) main_build_backend = importlib.util.module_from_spec(spec) spec.loader.exec_module(main_build_backend) main_build_backend._create_data_dir(use_symlinks=True)Also worth confirming on CI: when this runs inside
python -m build --wheelwithout--no-isolation, the parent3rdparty/ccclsubmodule must already be checked out on the build host, otherwise_create_data_dirwill happily create a symlink to a non-existent target and AOT will still fail — just later and with a different error. Given the PR description mentions this runs with the CCCL submodule present, this should be fine, but a pre-flight check that3rdparty/ccclis non-empty before symlinking would make failures more actionable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@flashinfer-jit-cache/build_backend.py` around lines 82 - 93, Guard against a missing spec or loader and validate the CCCL submodule before calling main_build_backend._create_data_dir: after creating spec via importlib.util.spec_from_file_location (the spec variable) check that spec is not None and spec.loader is not None and raise a clear RuntimeError explaining the failed import path (the Path used to locate "build_backend.py") if either is missing; then exec_module on main_build_backend and before calling main_build_backend._create_data_dir(use_symlinks=True) verify the expected 3rdparty/cccl directory (relative to the same parent Path) exists and is non-empty, and raise a descriptive error if it is missing so the symlink creation cannot point to a non-existent target.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@flashinfer-jit-cache/build_backend.py`:
- Around line 82-93: Guard against a missing spec or loader and validate the
CCCL submodule before calling main_build_backend._create_data_dir: after
creating spec via importlib.util.spec_from_file_location (the spec variable)
check that spec is not None and spec.loader is not None and raise a clear
RuntimeError explaining the failed import path (the Path used to locate
"build_backend.py") if either is missing; then exec_module on main_build_backend
and before calling main_build_backend._create_data_dir(use_symlinks=True) verify
the expected 3rdparty/cccl directory (relative to the same parent Path) exists
and is non-empty, and raise a descriptive error if it is missing so the symlink
creation cannot point to a non-existent target.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 62aafd33-de88-4d24-87a5-eefa51ff6dc3
📥 Commits
Reviewing files that changed from the base of the PR and between b87d66b87318c78e57459b42320bf9f7857e7d13 and aec03ab.
📒 Files selected for processing (1)
flashinfer-jit-cache/build_backend.py
|
/bot run |
<!-- .github/pull_request_template.md --> ## 📌 Description Follow up to #3158. This adds git submodule update --init --recursive in the jit-cache build_backend.py before AOT compilation begins, ensuring all submodules are populated. ## 🔍 Related Issues #3159 ## 🚀 Pull Request Checklist Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete. ### ✅ Pre-commit Checks - [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method). - [x] I have installed the hooks with `pre-commit install`. - [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues. > If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/). ## 🧪 Tests - [x] Tests have been added or updated as needed. - [x] All tests are passing (`unittest`, etc.). ## Reviewer Notes <!-- Optional: anything you'd like reviewers to focus on, concerns, etc. --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Improved Git submodule handling in the build system to enhance reliability during JIT cache compilation. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Alex Yang <aleyang@nvidia.com>
📌 Description
The jit-cache wheel build (
scripts/build_flashinfer_jit_cache_whl.sh) runspython -m build --wheelwithout first installing the main flashinfer package. The AOT compilation imports flashinfer directly from the source tree, but theflashinfer/data/symlinks are never created because the mainbuild_backend._create_data_dir()is never called.This wasn't a problem before the CCCL submodule because CUTLASS/spdlog headers are self-contained — there's no CTK-bundled copy to shadow them. CCCL is different: the CTK also ships CCCL headers at
$cuda_home/include/, so when the vendoreddata/ccclsymlink is missing,#include <cuda/cmath>silently falls through to the CTK copy, which on older toolkits doesn't havecuda::fast_mod_div.Call the main build_backend._create_data_dir() before importing flashinfer.aot to ensure all symlinks are in place.
Made-with: Cursor
🔍 Related Issues
#3159
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
pre-commitby runningpip install pre-commit(or used your preferred method).pre-commit install.pre-commit run --all-filesand fixed any reported issues.🧪 Tests
unittest, etc.).Reviewer Notes
Summary by CodeRabbit