Skip to content

[PD] Bump Nixl version#39797

Closed
NickLucche wants to merge 1 commit into
vllm-project:mainfrom
NickLucche:nixl-1.0
Closed

[PD] Bump Nixl version#39797
NickLucche wants to merge 1 commit into
vllm-project:mainfrom
NickLucche:nixl-1.0

Conversation

@NickLucche

Copy link
Copy Markdown
Member

Address #39521.

Mind that we're pinning cu13 on vllm right now.

@NickLucche NickLucche changed the title [Misc] Bump Nixl version [PD] Bump Nixl version Apr 14, 2026
@NickLucche NickLucche added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 14, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the version constraint for the nixl[cu13] package in requirements/kv_connectors.txt from < 0.10.0 to <= 1.0.0 and adds a documentation note regarding CUDA 12 compatibility. I have no feedback to provide.

@@ -1,3 +1,3 @@
lmcache >= 0.3.9
nixl[cu13] >= 0.7.1, < 0.10.0 # Required for disaggregated prefill
nixl[cu13] >= 0.7.1, <= 1.0.0 # Required for disaggregated prefill. Use nixl[cu12] for CUDA 12.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the upper bound? If 1.0.1 comes out tomorrow, you don't want to adopt it without another PR?

I see #35495 but the reason back then feels very temporary

Only 11 out of the 56 deps in requirements/common.txt have an upper bound ... so I think we'd only apply an upper bound for nixl if we had some specific reason to be wary of new releases

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markmc tbh I don't have a strong opinion here, we're quite happy with nixl perf. Let me re-purposes PR

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually let me hold on this, I don't get why nixl[cu13] is pulling in cu12

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 uv pip install nixl[cu13]==1.0
Using Python 3.12.9 environment at: /home/NickLucche/llmd/.venv
Resolved 30 packages in 48ms
Installed 3 packages in 217ms
 + nixl==1.0.0
 + nixl-cu12==1.0.1 <== why does this need to be pulled?
 + nixl-cu13==1.0.0

cat ../.venv/lib/python3.12/site-packages/nixl-1.0.0.dist-info/METADATA | grep Requires-Dist

Requires-Dist: nixl-cu12>=1.0.0 <==
Requires-Dist: nixl-cu12==1.0.0; extra == "cu12"
Requires-Dist: nixl-cu13==1.0.0; extra == "cu13"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjackal cjackal Apr 15, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I think that is the fundamental packaging issue in nixl wheel files... They should have packed the wheel files with +cu12 and +cu13 local version tags like pytorch, and the Requires-Dist should be simply itself (can be removed entirely if cuda runtime were in local version).

Signed-off-by: NickLucche <nlucches@redhat.com>
@mergify

mergify Bot commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @NickLucche.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@NickLucche

Copy link
Copy Markdown
Member Author

I want to figure out 2 things before merging here.

  • If there's a better way to install nixl so that we can avoid pulling in both -cu12 and -cu13 for systems that only need 13. This will require changes to nixl package so we'll eventually pick it up (if a better way to do it is found)
 uv pip install nixl[cu13]==1.0
Using Python 3.12.9 environment at: /home/NickLucche/llmd/.venv
Resolved 30 packages in 48ms
Installed 3 packages in 217ms
 + nixl==1.0.0
 + nixl-cu12==1.0.1 <== why does this need to be pulled?
 + nixl-cu13==1.0.0

cat ../.venv/lib/python3.12/site-packages/nixl-1.0.0.dist-info/METADATA | grep Requires-Dist

Requires-Dist: nixl-cu12>=1.0.0 <==
Requires-Dist: nixl-cu12==1.0.0; extra == "cu12"
Requires-Dist: nixl-cu13==1.0.0; extra == "cu13"

part2: I'd like that nixl-cu12 dependency to be an exact == not >= or we're always going to have a moving target

@cjackal

cjackal commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

If we do not pursue a mental health, we can just leave the nixl_cu1x variant in the kv_connectors.txt requirement file and post-install the nixl metapackage with --no-deps option like:

In kv_connectors.txt

 ...
-nixl[cu13] >= 0.7.1, <= 1.0.0
-nixl-cu12 >= 0.7.1, <= 1.0.0
 nixl-cu13 >= 0.7.1, <= 1.0.0

In Dockerfile

             apt-get update -y && \
             apt-get install -y --no-install-recommends --allow-change-held-packages ${BUILD_PKGS} && \
             uv pip install --system -r /tmp/kv_connectors.txt --no-build-isolation && \
+            uv pip install --system 'nixl>= 0.7.1,<= 1.0.0' --no-deps && \
             apt-get purge -y ${BUILD_PKGS} && \

As @NickLucche showed in the comment above, nixl has some packaging issues. vllm does not put nixl into the package dependency, so nixl packaging issue only matters in docker build stage. While a dirty hotfix, it can save ~200Mi of grand total container image size by removing duplicated nixl-packaged shared objects.

@NickLucche

Copy link
Copy Markdown
Member Author

@cjackal thanks for taking a look. I think that solution works for Dockerfile, but not for local installation with uv pip install requirements/kv_connector.txt (as the api isn't pulled) which might leave some people confused.

Should we just patch the requirements file with sed in the Dockerfile as I believe you also initially suggested?

@cjackal

cjackal commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

@cjackal thanks for taking a look. I think that solution works for Dockerfile, but not for local installation with uv pip install requirements/kv_connector.txt (as the api isn't pulled) which might leave some people confused.

Should we just patch the requirements file with sed in the Dockerfile as I believe you also initially suggested?

I can't come up with a better solution yet; an excuse for the sed yoga is that there already exists several if branches wrt cuda runtime (e.g. https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L735-L752 ) so why not another?

Or we may vender vllm fork of nixl, though I think it doesn't sound tempting in terms of ROI.

@alec-flowers

Copy link
Copy Markdown
Contributor

I want to figure out 2 things before merging here.

  • If there's a better way to install nixl so that we can avoid pulling in both -cu12 and -cu13 for systems that only need 13. This will require changes to nixl package so we'll eventually pick it up (if a better way to do it is found)
 uv pip install nixl[cu13]==1.0
Using Python 3.12.9 environment at: /home/NickLucche/llmd/.venv
Resolved 30 packages in 48ms
Installed 3 packages in 217ms
 + nixl==1.0.0
 + nixl-cu12==1.0.1 <== why does this need to be pulled?
 + nixl-cu13==1.0.0

cat ../.venv/lib/python3.12/site-packages/nixl-1.0.0.dist-info/METADATA | grep Requires-Dist

Requires-Dist: nixl-cu12>=1.0.0 <==
Requires-Dist: nixl-cu12==1.0.0; extra == "cu12"
Requires-Dist: nixl-cu13==1.0.0; extra == "cu13"

part2: I'd like that nixl-cu12 dependency to be an exact == not >= or we're always going to have a moving target

We can work with the NIXL team here. I have highlighted both these issues to them. We will do our best to resolve quickly.

@ovidiusm

Copy link
Copy Markdown
Contributor

NIXL 1.1.0 has been released, with this you can drop the explicit cu12/cu13 dependencies and use directly nixl.

@mergify

mergify Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @NickLucche.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label May 23, 2026
@NickLucche

Copy link
Copy Markdown
Member Author

we landed 1.1.0

@NickLucche NickLucche closed this May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/build kv-connector needs-rebase ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants