[PD] Bump Nixl version#39797
Conversation
| @@ -1,3 +1,3 @@ | |||
| lmcache >= 0.3.9 | |||
| nixl[cu13] >= 0.7.1, < 0.10.0 # Required for disaggregated prefill | |||
| nixl[cu13] >= 0.7.1, <= 1.0.0 # Required for disaggregated prefill. Use nixl[cu12] for CUDA 12. | |||
There was a problem hiding this comment.
Why the upper bound? If 1.0.1 comes out tomorrow, you don't want to adopt it without another PR?
I see #35495 but the reason back then feels very temporary
Only 11 out of the 56 deps in requirements/common.txt have an upper bound ... so I think we'd only apply an upper bound for nixl if we had some specific reason to be wary of new releases
There was a problem hiding this comment.
@markmc tbh I don't have a strong opinion here, we're quite happy with nixl perf. Let me re-purposes PR
There was a problem hiding this comment.
actually let me hold on this, I don't get why nixl[cu13] is pulling in cu12
There was a problem hiding this comment.
uv pip install nixl[cu13]==1.0
Using Python 3.12.9 environment at: /home/NickLucche/llmd/.venv
Resolved 30 packages in 48ms
Installed 3 packages in 217ms
+ nixl==1.0.0
+ nixl-cu12==1.0.1 <== why does this need to be pulled?
+ nixl-cu13==1.0.0
cat ../.venv/lib/python3.12/site-packages/nixl-1.0.0.dist-info/METADATA | grep Requires-Dist
Requires-Dist: nixl-cu12>=1.0.0 <==
Requires-Dist: nixl-cu12==1.0.0; extra == "cu12"
Requires-Dist: nixl-cu13==1.0.0; extra == "cu13"
There was a problem hiding this comment.
Well I think that is the fundamental packaging issue in nixl wheel files... They should have packed the wheel files with +cu12 and +cu13 local version tags like pytorch, and the Requires-Dist should be simply itself (can be removed entirely if cuda runtime were in local version).
|
This pull request has merge conflicts that must be resolved before it can be |
|
I want to figure out 2 things before merging here.
part2: I'd like that nixl-cu12 dependency to be an exact == not >= or we're always going to have a moving target
|
|
If we do not pursue a mental health, we can just leave the In
|
|
@cjackal thanks for taking a look. I think that solution works for Dockerfile, but not for local installation with Should we just patch the requirements file with sed in the Dockerfile as I believe you also initially suggested? |
I can't come up with a better solution yet; an excuse for the sed yoga is that there already exists several if branches wrt cuda runtime (e.g. https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L735-L752 ) so why not another? Or we may vender vllm fork of nixl, though I think it doesn't sound tempting in terms of ROI. |
We can work with the NIXL team here. I have highlighted both these issues to them. We will do our best to resolve quickly. |
|
NIXL 1.1.0 has been released, with this you can drop the explicit cu12/cu13 dependencies and use directly |
|
This pull request has merge conflicts that must be resolved before it can be |
|
we landed 1.1.0 |
Address #39521.
Mind that we're pinning cu13 on vllm right now.