[Bugfix] Fix ROCm UVA CPU weight offloading broken by #32993 (#34543)
vllm-bot merged 1 commit into vllm-project:main
Conversation
…cpu_tensor Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request addresses a bug that prevented ROCm from using Unified Virtual Addressing (UVA) for CPU weight offloading. The fix is correct and straightforward, adding the necessary platform check for ROCm. I've included one suggestion to use an existing helper function, which will improve code maintainability.
```diff
 assert cpu_tensor.is_pinned(), "CPU tensor must be pinned"
 return torch.ops._C.get_xpu_view_from_cpu_tensor(cpu_tensor)
-elif current_platform.is_cuda():
+elif current_platform.is_cuda() or current_platform.is_rocm():
```
For better maintainability and to align with existing patterns in this file (e.g., in `aux_stream`), it's preferable to use the `is_cuda_alike()` helper method. This method encapsulates the check for both CUDA and ROCm platforms, making the code cleaner and easier to update if more CUDA-like platforms are supported in the future.
```diff
-elif current_platform.is_cuda() or current_platform.is_rocm():
+elif current_platform.is_cuda_alike():
```
This is more declarative. I think it's better.
@AndreasKaratzas let's follow the Gemini suggestion and use `elif current_platform.is_cuda_alike():`. It is a known convention in vLLM: if something works for both CUDA and ROCm, we use `current_platform.is_cuda_alike()`.
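For illustration, the convention being discussed can be sketched with a minimal standalone class; this `Platform` class is hypothetical and only mirrors the shape of vLLM's `current_platform` interface, it is not the real implementation:

```python
# Hypothetical sketch of the platform-check convention under discussion.
# vLLM's real current_platform object is more involved; this only shows
# why is_cuda_alike() is preferred over spelling out both checks.
class Platform:
    def __init__(self, device_type: str):
        self.device_type = device_type

    def is_cuda(self) -> bool:
        return self.device_type == "cuda"

    def is_rocm(self) -> bool:
        return self.device_type == "rocm"

    def is_cuda_alike(self) -> bool:
        # ROCm behaves like CUDA for dispatch purposes, so both count
        # as "CUDA-like"; future CUDA-like platforms would be added here
        # once, instead of at every call site.
        return self.is_cuda() or self.is_rocm()
```

With this helper, call sites stay a single condition even as the set of CUDA-like platforms grows.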
njhill left a comment
Thanks @AndreasKaratzas!
I was also wondering about is_cuda_alike() since we use that elsewhere. But don't feel too strongly and would be good to fix the CI asap!
Yeah, I don't feel that strongly either tbh 😅 It's just that so far I've sometimes found it more useful to have more explicitly defined conditionals.
@njhill
(vllm-project#34543) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
#32993 broke the ROCm path in `get_accelerator_view_from_cpu_tensor`, causing a `ValueError` on ROCm. `cuda_view.cu` is hipified automatically and the op registration is not ROCm-guarded; only the Python-level platform check was missing. The fix adds `current_platform.is_rocm()` to the existing CUDA branch, since ROCm shares the same device type and dispatch key.

Test plan
- tests/kernels/core/test_uva.py
- tests/basic_correctness/test_cpu_offload.py
- tests/quantization/test_cpu_offload.py
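The dispatch covered by these tests can be sketched as follows; `uva_view_dispatch` and its string return values are illustrative stand-ins for vLLM's actual `torch.ops._C` calls, not the real implementation:

```python
def uva_view_dispatch(platform: str) -> str:
    """Hypothetical stand-in for the platform check inside
    get_accelerator_view_from_cpu_tensor: returns the name of the
    custom op that would be invoked. Before this fix, "rocm" fell
    through to the ValueError even though the hipified op exists.
    """
    if platform == "xpu":
        return "get_xpu_view_from_cpu_tensor"
    elif platform in ("cuda", "rocm"):  # the one-line fix adds "rocm" here
        return "get_cuda_view_from_cpu_tensor"
    raise ValueError(f"UVA CPU weight offloading not supported on {platform!r}")
```

Because the CUDA kernel source is hipified automatically, ROCm can simply share the CUDA branch, which is exactly what the one-line patch does.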