fix: Skipping attention sink Blackwell test outside of Blackwell (#1978)
Conversation
Walkthrough

Modified the GPU compatibility check in the attention sink tests to restrict execution to SM100/SM103 GPUs, changing the skip condition from excluding SM110/SM120/SM121 to rejecting all compute capabilities except `compute_capability[0] == 10`.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/attention/test_attention_sink_blackwell.py (1)
144-145: LGTM! Consider harmonizing skip messages.

The skip condition fix is correct and consistent with the first test. However, the skip message differs slightly from line 44 ("trtllm-gen only supports" vs. "These tests are only guaranteed to work on"). Consider using consistent wording across both tests.
Optional: Harmonize the message with line 44 for consistency:
```diff
- pytest.skip("These tests are only guaranteed to work on SM100 and SM103 GPUs.")
+ pytest.skip("trtllm-gen only supports SM100 and SM103 GPUs.")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/attention/test_attention_sink_blackwell.py (2 hunks)
🔇 Additional comments (1)
tests/attention/test_attention_sink_blackwell.py (1)
43-44: LGTM! Skip logic correctly fixed.

The change from a blacklist approach (`in [11, 12]`) to a whitelist approach (`!= 10`) correctly ensures tests only run on SM100 and SM103 GPUs (compute capability 10.x), preventing execution on incompatible architectures like Hopper SM90.
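The whitelist check discussed above can be sketched as a small predicate. This is a minimal illustration, not the PR's exact code: the function name is hypothetical, and the `(major, minor)` tuple format mirrors `torch.cuda.get_device_capability()` as commonly used in such tests.

```python
# Hedged sketch of the whitelist skip condition: skip everything that is not
# major compute capability 10 (Blackwell SM100/SM103). The helper name is an
# assumption for illustration, not the identifier used in the test file.

def is_unsupported_for_trtllm_gen(compute_capability: tuple) -> bool:
    """Return True when the GPU should be skipped (anything but SM10x)."""
    # Whitelist: only major version 10 (SM100/SM103) runs the tests.
    return compute_capability[0] != 10

# Hopper SM90 is now skipped, while Blackwell SM100/SM103 still runs:
print(is_unsupported_for_trtllm_gen((9, 0)))   # Hopper H200 -> True (skip)
print(is_unsupported_for_trtllm_gen((10, 0)))  # Blackwell B200 -> False (run)
print(is_unsupported_for_trtllm_gen((12, 1)))  # SM121 -> True (skip)
```

In the test itself the predicate would feed `pytest.skip(...)`; the old blacklist (`in [11, 12]`) returned False for SM90 and let the tests run, which is the bug this PR fixes.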
…shinfer-ai#1978)

`test_attention_sink_blackwell.py` checks `flashinfer.prefill.trtllm_batch_context_with_kv_cache` and `flashinfer.decode.trtllm_batch_decode_with_kv_cache`, which are only supported on Blackwell SM100 and SM103. The existing check only skips testing on SM 11x or 12x, which causes failures on Hopper SM90.

Test outputs:

* H200:
  * Before fix: `144 failed, 1 warning in 9.20s`
  * After fix: `144 skipped, 1 warning in 0.42s`
* B200:
  * After fix: `144 passed in 34.64s`

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

* **Tests**
  * Updated GPU compatibility checks for attention sink tests to target specific GPU architectures (SM100/SM103). Tests now run exclusively on supported GPU models with updated filtering criteria.
📌 Description
`test_attention_sink_blackwell.py` checks `flashinfer.prefill.trtllm_batch_context_with_kv_cache` and `flashinfer.decode.trtllm_batch_decode_with_kv_cache`, which are only supported on Blackwell SM100 and SM103. The existing check only skips testing on SM 11x or 12x, which causes failures on Hopper SM90.
Test outputs:
* H200:
  * Before fix: `144 failed, 1 warning in 9.20s`
  * After fix: `144 skipped, 1 warning in 0.42s`
* B200:
  * After fix: `144 passed in 34.64s`

🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

🧪 Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

Reviewer Notes
Summary by CodeRabbit

* **Tests**
  * Updated GPU compatibility checks for attention sink tests to target specific GPU architectures (SM100/SM103). Tests now run exclusively on supported GPU models with updated filtering criteria.