
CICD bug fix: ensure data/ symlinks exist before jit-cache AOT compilation #3158

Merged
aleozlx merged 1 commit into flashinfer-ai:main from kahyunnam:knam/cccl-fix
Apr 24, 2026
@kahyunnam
Member

@kahyunnam kahyunnam commented Apr 23, 2026

📌 Description

The jit-cache wheel build (scripts/build_flashinfer_jit_cache_whl.sh) runs python -m build --wheel without first installing the main flashinfer package. The AOT compilation imports flashinfer directly from the source tree, but the flashinfer/data/ symlinks are never created because the main build_backend._create_data_dir() is never called.
This wasn't a problem before the CCCL submodule because CUTLASS/spdlog headers are self-contained — there's no CTK-bundled copy to shadow them. CCCL is different: the CTK also ships CCCL headers at $cuda_home/include/, so when the vendored data/cccl symlink is missing, #include <cuda/cmath> silently falls through to the CTK copy, which on older toolkits doesn't have cuda::fast_mod_div.

Call the main build_backend._create_data_dir() before importing flashinfer.aot to ensure all symlinks are in place.
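The shadowing described above can be illustrated with a small model of include-path lookup. This is only a sketch: `resolve_header` is a hypothetical helper, the directory names are stand-ins, and a real compiler applies more rules than a simple ordered search.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional, Tuple


def resolve_header(name: str, include_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing match for `name`, searching dirs in order."""
    for directory in include_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        vendored = root / "data" / "cccl"         # stands in for flashinfer/data/cccl
        toolkit = root / "cuda_home" / "include"  # stands in for $cuda_home/include

        # Only the toolkit copy exists at first, as when the symlink is missing.
        (toolkit / "cuda").mkdir(parents=True)
        (toolkit / "cuda" / "cmath").write_text("// toolkit copy\n")
        search = [vendored, toolkit]
        before = resolve_header("cuda/cmath", search)

        # Once the vendored directory exists, it wins because it is listed first.
        (vendored / "cuda").mkdir(parents=True)
        (vendored / "cuda" / "cmath").write_text("// vendored copy\n")
        after = resolve_header("cuda/cmath", search)

        # Report which include directory supplied the header in each case.
        return before.parent.parent.name, after.parent.parent.name
```

With the vendored directory absent, the lookup raises no error and quietly returns the toolkit copy, which is why the failure only surfaces later as a missing symbol.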

Made-with: Cursor

🔍 Related Issues

#3159

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Bug Fixes
    • Ensure the JIT cache build now initializes required filesystem layout and data directories before compilation so builds succeed even when the package hasn't been previously installed.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 23, 2026

📝 Walkthrough

The JIT-cache compilation now runs a repository-level build_backend.py via importlib.util and calls _create_data_dir(use_symlinks=True) at the start of _compile_jit_cache to ensure the flashinfer/data/ symlink layout exists before importing flashinfer.aot and continuing compilation.
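A path-based load of this kind might look roughly like the sketch below. It is not the code from the PR: `load_module_from_path` is a hypothetical helper, and the stand-in `build_backend.py` written to a temporary directory has invented contents so that the example runs on its own.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(unique_name: str, path: Path) -> ModuleType:
    """Load a Python file as a module under `unique_name`, bypassing sys.path."""
    spec = importlib.util.spec_from_file_location(unique_name, path)
    if spec is None or spec.loader is None:
        raise RuntimeError(f"Unable to load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the repository-level build_backend.py (contents invented).
        root_backend = Path(tmp) / "build_backend.py"
        root_backend.write_text(
            "def _create_data_dir(use_symlinks=True):\n"
            "    return 'symlinks' if use_symlinks else 'copies'\n"
        )
        backend = load_module_from_path("main_build_backend", root_backend)
        return backend._create_data_dir(use_symlinks=True)
```

Because the module is loaded under its own name and never looked up by the name `build_backend`, it cannot be confused with another module of that name that is already imported.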

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| JIT-Cache Build Initialization: `flashinfer-jit-cache/build_backend.py` | Added dynamic execution of the repo build_backend.py and an explicit _create_data_dir(use_symlinks=True) call at the start of _compile_jit_cache, ensuring required flashinfer/data/ symlinked directories exist prior to importing flashinfer.aot and compiling the JIT cache. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

run-ci

Suggested reviewers

  • yzh119
  • yongwww
  • bkryu
  • jimmyzho
  • nvmbreughe

Poem

🐰 I hopped into build scripts, light and quick,
Ran root setup so symlinks click-click,
A tiny dir, a nimble link,
Now the cache compiles in just a blink — ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main bug fix: ensuring data/ symlinks exist before jit-cache AOT compilation, which matches the changeset's core objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description comprehensively covers what the change does, why it's needed, the underlying problem with CCCL symlinks, and the solution implemented. All required checklist items are marked complete. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@kahyunnam
Member Author

/bot run

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the JIT cache build process to ensure that necessary data directory symlinks are created. A critical issue was identified where the import of the main build backend would fail due to a name collision with the current file; a suggestion was provided to use importlib for a path-based import.

Comment thread flashinfer-jit-cache/build_backend.py Outdated
The jit-cache wheel build runs AOT compilation without installing the
main flashinfer package first, so the flashinfer/data/ symlinks
(including data/cccl for vendored CCCL headers) don't exist. This
causes <cuda/cmath> to fall through to the CTK copy which may lack
cuda::fast_mod_div on older toolkits.

Call the main build_backend._create_data_dir() before importing
flashinfer.aot to ensure all symlinks are in place.

Made-with: Cursor
@flashinfer-bot
Collaborator

GitLab MR !592 has been created, and the CI pipeline #49310192 is currently running. I'll report back once the pipeline job completes.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@flashinfer-jit-cache/build_backend.py`:
- Around line 85-87: The current import build_backend picks up the jit-cache
module from sys.modules causing AttributeError when calling _create_data_dir;
replace that import with an explicit load using
importlib.util.spec_from_file_location to load the root backend by its file path
under a unique module name (e.g., "root_build_backend"), execute the spec to
create the module object, and then call
root_module._create_data_dir(use_symlinks=True) so you avoid name collision with
the existing build_backend module in sys.modules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 09da80e6-75c0-4c6f-8bd7-308c9fed9294

📥 Commits

Reviewing files that changed from the base of the PR and between d454492 and b87d66b87318c78e57459b42320bf9f7857e7d13.

📒 Files selected for processing (1)
  • flashinfer-jit-cache/build_backend.py

Comment thread flashinfer-jit-cache/build_backend.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
flashinfer-jit-cache/build_backend.py (1)

82-93: LGTM — correctly loads the root backend without module-name collision.

Using importlib.util.spec_from_file_location here is the right fix: a plain import build_backend would resolve to this same-named jit-cache module (already in sys.modules as the active PEP 517 backend) and _create_data_dir would not be found. Placing this before from flashinfer import aot also ensures flashinfer/data/{cccl,cutlass,spdlog,csrc,include} exist before flashinfer.jit.env snapshots its path constants at import time (see flashinfer/jit/env.py:146-158), which is exactly what's needed to prevent the CCCL fallback described in the PR.
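The collision can be reproduced in isolation. The sketch below uses a made-up module name, `collision_demo_backend`, so that no real module is touched. It shows only the caching behaviour of a by-name import and is not the jit-cache code itself.

```python
import importlib
import sys
import types


def demo() -> bool:
    # Pretend a backend module is already imported under this (invented) name.
    name = "collision_demo_backend"
    active = types.ModuleType(name)
    sys.modules[name] = active
    try:
        # A by-name import returns the cached module object; no file on disk
        # is consulted, so a different file with the same name is never seen.
        resolved = importlib.import_module(name)
        return resolved is active and not hasattr(resolved, "_create_data_dir")
    finally:
        # Clean up so the demo leaves sys.modules unchanged.
        del sys.modules[name]
```

Since the cached module has no `_create_data_dir` attribute, calling it would raise `AttributeError`, which matches the failure described in the earlier review comment.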

One small robustness nit: spec and spec.loader are used without a None guard. If the path ever becomes wrong (e.g., layout change), you'd get a confusing AttributeError on None.exec_module instead of a clear error. Optional tightening:

Optional hardening
     spec = importlib.util.spec_from_file_location(
         "main_build_backend", Path(__file__).parent.parent / "build_backend.py"
     )
+    if spec is None or spec.loader is None:
+        raise RuntimeError(
+            "Unable to load root build_backend.py for flashinfer data-dir setup"
+        )
     main_build_backend = importlib.util.module_from_spec(spec)
     spec.loader.exec_module(main_build_backend)
     main_build_backend._create_data_dir(use_symlinks=True)

Also worth confirming on CI: when this runs inside python -m build --wheel without --no-isolation, the parent 3rdparty/cccl submodule must already be checked out on the build host, otherwise _create_data_dir will happily create a symlink to a non-existent target and AOT will still fail — just later and with a different error. Given the PR description mentions this runs with the CCCL submodule present, this should be fine, but a pre-flight check that 3rdparty/cccl is non-empty before symlinking would make failures more actionable.
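A pre-flight check along these lines could be sketched as follows. `require_populated_dir` is a hypothetical helper, not part of the repository, and the dangling-symlink part of the demo assumes a POSIX system where symlinks can be created without special privileges.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def require_populated_dir(path: Path) -> None:
    """Raise a clear error if `path` is missing or empty."""
    if not path.is_dir() or not any(path.iterdir()):
        raise RuntimeError(
            f"{path} is missing or empty; was the submodule checked out?"
        )


def demo() -> Tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        target = root / "3rdparty" / "cccl"  # intentionally never created
        link = root / "cccl_link"

        # Creating a symlink to a missing target succeeds without complaint.
        os.symlink(target, link)
        dangling = link.is_symlink() and not link.exists()

        # The pre-flight check rejects the same target with an explicit error.
        try:
            require_populated_dir(target)
            rejected = False
        except RuntimeError:
            rejected = True
        return dangling, rejected
```

Running the check before symlinking turns a late compile error into an immediate message that names the missing directory.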

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@flashinfer-jit-cache/build_backend.py` around lines 82 - 93, Guard against a
missing spec or loader and validate the CCCL submodule before calling
main_build_backend._create_data_dir: after creating spec via
importlib.util.spec_from_file_location (the spec variable) check that spec is
not None and spec.loader is not None and raise a clear RuntimeError explaining
the failed import path (the Path used to locate "build_backend.py") if either is
missing; then exec_module on main_build_backend and before calling
main_build_backend._create_data_dir(use_symlinks=True) verify the expected
3rdparty/cccl directory (relative to the same parent Path) exists and is
non-empty, and raise a descriptive error if it is missing so the symlink
creation cannot point to a non-existent target.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 62aafd33-de88-4d24-87a5-eefa51ff6dc3

📥 Commits

Reviewing files that changed from the base of the PR and between b87d66b87318c78e57459b42320bf9f7857e7d13 and aec03ab.

📒 Files selected for processing (1)
  • flashinfer-jit-cache/build_backend.py

@kahyunnam
Member Author

/bot run

@flashinfer-bot
Collaborator

GitLab MR !592 has been updated with latest changes, and the CI pipeline #49310596 is currently running. I'll report back once the pipeline job completes.

@aleozlx aleozlx mentioned this pull request Apr 23, 2026
@aleozlx aleozlx added the v0.6.10 label (release blocker label for 0.6.10) Apr 23, 2026
@aleozlx aleozlx enabled auto-merge (squash) April 23, 2026 23:16
@aleozlx aleozlx merged commit 8eedd64 into flashinfer-ai:main Apr 24, 2026
77 of 115 checks passed
aleozlx added a commit that referenced this pull request Apr 30, 2026
<!-- .github/pull_request_template.md -->

## 📌 Description

Follow up to #3158. This
adds git submodule update --init --recursive in the jit-cache
build_backend.py before AOT compilation begins, ensuring all submodules
are populated.
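As a rough illustration of that step (not the actual follow-up code), the submodule update could be invoked from Python as in the sketch below. `init_submodules` and its injectable `run` parameter are assumptions made for the example. The `git` command itself is the one named above.

```python
import subprocess
from pathlib import Path
from typing import Callable, List

# The command named in the description above.
SUBMODULE_CMD: List[str] = ["git", "submodule", "update", "--init", "--recursive"]


def init_submodules(repo_root: Path, run: Callable = subprocess.run) -> None:
    """Populate all git submodules under `repo_root`.

    With the default runner, check=True makes a failing git call raise
    subprocess.CalledProcessError instead of being silently ignored.
    """
    run(SUBMODULE_CMD, cwd=str(repo_root), check=True)
```

Passing the runner as a parameter keeps the function testable without a real git checkout.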

## 🔍 Related Issues

#3159 

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Reviewer Notes

<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Improved Git submodule handling in the build system to enhance
reliability during JIT cache compilation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Alex Yang <aleyang@nvidia.com>

Labels

v0.6.10 (release blocker label for 0.6.10)


3 participants