Fix autotuner crash when input tensor is None #2756

samuellees merged 5 commits into flashinfer-ai:main from
Conversation
📝 Walkthrough

Preserve None-valued optional inputs in autotuner input preparation and relax fused MoE routing validations by deriving token counts from hidden_states.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: 4 passed, 1 failed (warning).
Actionable comments posted: 1
📥 Commits
Reviewing files that changed from the base of the PR and between fe06b91 and af6cb712bf09e609ae590e65672a290eff2b2e42.
📒 Files selected for processing (1)
flashinfer/autotuner.py
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug in the autotuner's input preparation logic that caused a crash when optional tensor inputs were None.
Code Review
The pull request effectively addresses a critical crash by correctly handling None inputs in the _prepare_input_tensors function. The change ensures that optional tensors passed as None are preserved as-is, preventing _create_tensor_like from attempting to access attributes on a None object. This improves the robustness of the autotuner's input preparation process.
```python
            # Some callers pass None for optional tensors (e.g. routing_logits
            # in non-routed MoE). Preserve None as-is.
            tensors.append(None)
        elif any(isinstance(d, DynamicDim) for d in p):
```
The current fix correctly handles `None` inputs. However, the `_create_tensor_like` function expects `origin_tensor` to be a `torch.Tensor`. If `inputs[i]` is not `None` but also not a `torch.Tensor` (e.g., a Python scalar like an `int` or `float`), and `p` contains `DynamicDim`, calling `_create_tensor_like` with a non-tensor object will still lead to a crash (e.g., when trying to access `.dtype`). To ensure robustness and align with the pattern in `_prepare_input_tensors_with_batches`, which uses `isinstance(t, torch.Tensor)` checks for non-tensor inputs, the `elif` condition should explicitly check whether `inputs[i]` is a `torch.Tensor` before attempting to create a tensor-like object.
```diff
-        elif any(isinstance(d, DynamicDim) for d in p):
+        elif isinstance(inputs[i], torch.Tensor) and any(isinstance(d, DynamicDim) for d in p):
```
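A runnable sketch of the pattern under discussion, with the suggested `isinstance` guard folded in. The names mirror the thread, but the bodies are simplified stand-ins, not flashinfer's actual implementation (the real `_create_tensor_like` also handles dynamic dims via the tuning profile):

```python
import torch

def _create_tensor_like(origin_tensor: torch.Tensor) -> torch.Tensor:
    # Simplified stand-in: allocate a fresh tensor with the same
    # shape/dtype/device. The real helper also resizes dynamic dims.
    return torch.empty_like(origin_tensor)

def prepare_input_tensors(inputs):
    tensors = []
    for t in inputs:
        if t is None:
            # Optional inputs (e.g. routing_logits in non-routed MoE)
            # are preserved as-is; cloning None would crash on .dtype.
            tensors.append(None)
        elif isinstance(t, torch.Tensor):
            tensors.append(_create_tensor_like(t))
        else:
            # Non-tensor inputs (Python scalars, etc.) pass through,
            # matching _prepare_input_tensors_with_batches.
            tensors.append(t)
    return tensors
```

With the guard in place, a `None` or a scalar in the input list no longer reaches `_create_tensor_like`.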
Hi @he-yufeng I tried your PR, but now I get this error: |
Thanks for testing @trevor-m! The crash is fixed in 9043934:
Could you try again with the latest commit? |
@he-yufeng Thanks, it's working now |
samuellees
left a comment
Could you please add a smoke test for the fix?
```python
    )
    # topk_ids/expert_weights can be empty(0) when routing_logits is provided,
    # or real tensors when pre-computed routing is used.
    if topk_ids.numel() > 0:
```
`expert_weights` is checked like `if expert_weights is not None and expert_weights.numel() > 0:`. Could you keep a similar check style for `topk_ids`, please?
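The consistent guard style being requested, as a sketch (function name and shapes are illustrative, not the PR's exact validation code):

```python
import torch

def validate_routing(topk_ids, expert_weights, num_tokens, top_k):
    # Both tensors are optional and may be empty(0) depending on the
    # routing mode, so both use the same None-and-nonempty guard style.
    if topk_ids is not None and topk_ids.numel() > 0:
        assert topk_ids.shape == (num_tokens, top_k)
    if expert_weights is not None and expert_weights.numel() > 0:
        assert expert_weights.shape == (num_tokens, top_k)
```

Either `None` or an empty tensor skips the shape assertion; only a non-empty tensor with the wrong shape fails.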
Good catch, updated the `topk_ids` check to match the `expert_weights` style.
Thanks @he-yufeng! Could you add a smoke test for your code path, please? I believe the PR will move forward very quickly once the test is ready ^ ^
Added two smoke tests in |
samuellees
left a comment
LGTM. @he-yufeng Could you please resolve the conflict with the main branch? Thanks
/bot run

[FAILED] Pipeline #46957861: 1/20 passed

/bot run

[FAILED] Pipeline #47022144: 1/20 passed
`trtllm_fp8_block_scale_routed_moe` passes `routing_logits=None` for non-routed calls, but `_prepare_input_tensors` assumes all inputs are tensors and crashes with `AttributeError: 'NoneType' has no attribute 'dtype'` in `_create_tensor_like`. Skip `None` inputs and preserve them as-is, matching the existing pattern in `_prepare_input_tensors_with_batches`, which already handles non-tensor inputs gracefully. Fixes flashinfer-ai#2749
`get_valid_tactics()` and `forward()` both accessed `routing_logits.shape[0]` to get `num_tokens`, but `routing_logits` is `None` when pre-computed routing is used (`trtllm_fp8_block_scale_routed_moe` passes `routing_logits=None`). Use `hidden_states.shape[0]` instead, which is always available. Also guard the shape assertions for `topk_ids`/`expert_weights`, which can be empty(0) tensors depending on the routing mode.
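The token-count change described in this commit can be sketched as follows (the function name is hypothetical; in the PR the same expression replaces `routing_logits.shape[0]` inside `get_valid_tactics()` and `forward()`):

```python
import torch

def num_tokens_for_tuning(hidden_states: torch.Tensor, routing_logits=None) -> int:
    # routing_logits is None when pre-computed routing is used, so the
    # token count must come from hidden_states, which is always present
    # and shares its leading (token) dimension with routing_logits.
    return hidden_states.shape[0]
```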
Cover the `_prepare_input_tensors` and `choose_one` paths when an optional tensor (e.g. `routing_logits` in non-routed MoE) is `None`, which previously caused `AttributeError` on `.dtype`/`.shape`.
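A smoke test for this path might look like the following sketch (the stub mimics the fixed behavior; the PR's actual tests exercise the real autotuner in flashinfer's test suite):

```python
import torch

def prepare_inputs_stub(inputs):
    # Stand-in for the fixed helper: None (and any non-tensor) passes
    # through untouched instead of triggering AttributeError on .dtype.
    return [torch.empty_like(t) if isinstance(t, torch.Tensor) else t
            for t in inputs]

def test_none_optional_input_is_preserved():
    # Previously crashed: 'NoneType' has no attribute 'dtype'.
    out = prepare_inputs_stub([torch.ones(4, 8), None])
    assert out[1] is None
    assert out[0].shape == (4, 8)

test_none_optional_input_is_preserved()
```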
Force-pushed 1f2473b to 260ee5e
/bot run

[FAILED] Pipeline #47045960: 12/20 passed

/bot run

[FAILED] Pipeline #47074819: 13/20 passed

/bot run

[FAILED] Pipeline #47092245: 11/20 passed
Hi @he-yufeng, the CI seems to have passed; some errors are unrelated to this PR. This blocks some other test cases. You can run pre-commit this way: Line 20 in 31b63bc. Please let me know if you have any questions~
/bot run

[FAILED] Pipeline #47225408: 11/20 passed
Fixes #2749.
`trtllm_fp8_block_scale_routed_moe` passes `routing_logits=None` for non-routed calls, but `_prepare_input_tensors` assumes all inputs are tensors and crashes in `_create_tensor_like` trying to access `.dtype` on `None`.

Fix: skip `None` inputs and preserve them as-is. This matches the existing pattern in `_prepare_input_tensors_with_batches`, which already handles non-tensor inputs with `isinstance(t, torch.Tensor)` checks.

Summary by CodeRabbit
Bug Fixes
Tests