Fix #1641: Use /usr/local/cuda as default CUDA_HOME if possible, like torch.utils.cpp_extension.CUDA_HOME #1948
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Changes

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Summary of Changes

Hello @netanel-haber, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the robustness of CUDA environment detection by implementing a fallback mechanism for CUDA_HOME.

Highlights
Actionable comments posted: 0
🧹 Nitpick comments (1)
flashinfer/jit/cpp_ext.py (1)
50-59: Implementation looks good; consider enhancing the error message.

The fallback logic correctly implements the intended behavior of matching PyTorch's default CUDA path. The existence check prevents using a non-existent path, and the conditional structure is sound.
Minor suggestion: The error message could be more actionable:
```diff
     if not os.path.exists(cuda_home):
         raise RuntimeError(
-            f"Could not find nvcc and default {cuda_home=} doesn't exist"
+            f"Could not find nvcc and default {cuda_home=} doesn't exist. "
+            f"Please set CUDA_HOME or CUDA_PATH environment variable, or install CUDA at {cuda_home}"
         )
```

Note on static analysis: Ruff flags TRY003 (long exception message), but the current message length is reasonable for a RuntimeError with important context, so this can be safely ignored.
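For illustration only (not part of the suggested change): if one ever did want to satisfy TRY003, a minimal sketch would move the long message into a dedicated exception class. `CudaHomeNotFoundError` below is a hypothetical name, not something defined in flashinfer:

```python
import os


class CudaHomeNotFoundError(RuntimeError):
    """Hypothetical exception type; keeps the long message out of the raise site (TRY003)."""

    def __init__(self, cuda_home: str) -> None:
        super().__init__(
            f"Could not find nvcc and default {cuda_home=} doesn't exist. "
            "Please set CUDA_HOME or CUDA_PATH, or install CUDA at that path."
        )


# Usage sketch: raise the dedicated exception instead of a RuntimeError with a long inline message.
cuda_home = "/usr/local/cuda"
if not os.path.exists(cuda_home):
    raise CudaHomeNotFoundError(cuda_home)
```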
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
flashinfer/jit/cpp_ext.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.0)
flashinfer/jit/cpp_ext.py
57-59: Avoid specifying long messages outside the exception class
(TRY003)
Code Review
This pull request correctly implements a fallback to /usr/local/cuda for CUDA_HOME when nvcc is not found in the system's PATH, which aligns with PyTorch's behavior. This is a good improvement for environments where CUDA_HOME is not explicitly set. I have added one suggestion to make the nvcc discovery logic more robust by handling an edge case.
```python
if nvcc_path.returncode == 0:
    cuda_home = os.path.dirname(
        os.path.dirname(nvcc_path.stdout.decode("utf-8").strip())
    )
else:
    cuda_home = "/usr/local/cuda"  # This default value is from: https://github.com/pytorch/pytorch/blob/ceb11a584d6b3fdc600358577d9bf2644f88def9/torch/utils/cpp_extension.py#L115
    if not os.path.exists(cuda_home):
        raise RuntimeError(
            f"Could not find nvcc and default {cuda_home=} doesn't exist"
        )
```
The current logic does not handle the edge case where which nvcc succeeds (returns exit code 0) but produces no output to stdout. In this scenario, cuda_home would become an empty string, which could lead to unexpected behavior in downstream functions like get_cuda_version that expect a valid path.
To make the logic more robust, it's best to also check that nvcc_path.stdout is not empty before attempting to determine cuda_home from it. I've also slightly improved the error message for better clarity.
```diff
-if nvcc_path.returncode == 0:
+if nvcc_path.returncode == 0 and nvcc_path.stdout:
     cuda_home = os.path.dirname(
         os.path.dirname(nvcc_path.stdout.decode("utf-8").strip())
     )
 else:
     cuda_home = "/usr/local/cuda"  # This default value is from: https://github.com/pytorch/pytorch/blob/ceb11a584d6b3fdc600358577d9bf2644f88def9/torch/utils/cpp_extension.py#L115
     if not os.path.exists(cuda_home):
         raise RuntimeError(
-            f"Could not find nvcc and default {cuda_home=} doesn't exist"
+            f"Could not find nvcc and default cuda home '{cuda_home}' does not exist"
         )
```
Hello @yzh119, is there anything else I'm missing here?
Fix #1641: Use /usr/local/cuda as default CUDA_HOME if possible, like torch.utils.cpp_extension.CUDA_HOME

📌 Description

Before #1641, torch.utils.cpp_extension.CUDA_HOME was used, and it sets a default of /usr/local/cuda if it cannot find nvcc. This PR re-integrates that logic back into flashinfer, so in environments where CUDA_HOME and CUDA_PATH aren't defined explicitly, said default is used.

See:
#1641 (comment)
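For readers skimming the thread, here is a rough, simplified sketch of the resolution order this PR restores. It is not the actual flashinfer code: the function name `find_cuda_home` is made up, `shutil.which` stands in for the subprocess call to `which nvcc` used in flashinfer/jit/cpp_ext.py, and the env-var handling is condensed.

```python
import os
import shutil


def find_cuda_home() -> str:
    """Simplified sketch of the intended CUDA_HOME lookup order; illustrative only."""
    # 1. Explicitly set environment variables win.
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if cuda_home:
        return cuda_home
    # 2. Otherwise derive it from nvcc's location on PATH (<cuda_home>/bin/nvcc).
    nvcc = shutil.which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))
    # 3. Finally fall back to PyTorch's default location, failing loudly if it doesn't exist.
    cuda_home = "/usr/local/cuda"
    if not os.path.exists(cuda_home):
        raise RuntimeError(
            f"Could not find nvcc and default {cuda_home=} doesn't exist"
        )
    return cuda_home
```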
🚀 Pull Request Checklist
✅ Pre-commit Checks
- I have installed pre-commit by running pip install pre-commit (or used your preferred method).
- I have installed the hooks with pre-commit install.
- I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

🧪 Tests
- All tests are passing (unittest, etc.).

Summary by CodeRabbit
Release Notes