Skip to content

nvidia-runtime: create symlinks dynamically#153

Merged
danzatt merged 1 commit intomainfrom
remove-nvidia-runtime-symlink
May 20, 2025
Merged

nvidia-runtime: create symlinks dynamically#153
danzatt merged 1 commit intomainfrom
remove-nvidia-runtime-symlink

Conversation

@danzatt
Copy link
Copy Markdown
Contributor

@danzatt danzatt commented Apr 23, 2025

The sysext contains /opt/bin/nvidia-smi -> /usr/bin/nvidia-smi symlink. This does not play well with the prebuilt sysext, as it already ships the binaries in /usr/bin. Create the symlink dynamically, only when the prebuilt sysext is not loaded.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a dynamic approach for creating symlinks to avoid conflicts with the prebuilt sysext by replacing hard-coded link creation with helper scripts.

  • Replaces multiple ExecStartPre/ExecStartPost commands with calls to _nvidia-persistenced-helper in the systemd unit override file.
  • Removes static symlink creation commands from the create.sh script.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
nvidia-runtime.sysext/files/usr/lib/systemd/system/nvidia.service.d/10-persistenced.conf Replaces manual commands with helper script invocations to manage dynamic symlink creation and service restarts.
nvidia-runtime.sysext/create.sh Removes static symlink creation lines in favor of dynamically created symlinks at runtime.
Comments suppressed due to low confidence (1)

nvidia-runtime.sysext/create.sh:53

  • [nitpick] Since the symlink creation has been removed in favor of dynamic creation, consider updating the function's comment to document this change for improved maintainability.
ln -s /opt/nvidia "${sysextroot}/usr/local/nvidia"

ExecStartPost=mkdir -p /run/extensions
ExecStartPost=ln -sf /opt/nvidia/current /run/extensions/nvidia-driver
ExecStartPost=systemctl restart systemd-sysext
ExecStartPre=/usr/local/bin/_nvidia-persistenced-helper pre
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider adding an inline comment to explain that _nvidia-persistenced-helper handles the dynamic creation of symlinks when the prebuilt sysext is absent, improving future code clarity.

Copilot uses AI. Check for mistakes.
chcon -R -t container_file_t /dev/nvidia* || true

if [ "$SYSEXT_MODE" -eq 0 ]; then
/opt/bin/nvidia-persistenced || true
Copy link
Copy Markdown
Member

@jepio jepio May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've been meaning to drop launching nvidia-persistenced - it causes issues for some usecases. Lets drop this line and leave this to users.

ln -sf /opt/nvidia/current /run/extensions/nvidia-driver
systemctl restart systemd-sysext
else
systemctl enable --now nvidia-persistenced
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've been meaning to drop launching nvidia-persistenced - it causes issues for some usecases. Lets drop this line and leave this to users.

Copy link
Copy Markdown
Member

@jepio jepio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but drop the persistenced and i have that one question around the symlinks

Comment on lines +22 to +23
mkdir -p /opt/nvidia/current/usr/bin && ln -sf /opt/bin/nvidia-smi /opt/nvidia/current/usr/bin/nvidia-smi
mkdir -p /opt/nvidia/current/usr/local/ && ln -sf /opt/nvidia /opt/nvidia/current/usr/local/nvidia
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

were you planning on moving these lines to the nvidia-driver.service? Since it's only in that case that we mount /opt/nvidia/current as a sysext.

Second question: does sysext_mode ship /usr/local/nvidia?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, the /opt/nvidia/current is mounted as sysext only when using the nvidia-runtime sysext, but we can move it to nvidia.service.

Re sysext_mode: I've just checked, and it doesn't ship /usr/local/nvidia , but the tests (including GPU uperator) are passing even without it. Do we need to ship it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/usr/local/nvidia appears to be used when we request that the GPU -Operator install the container-toolkit. So something (maybe symlinking /usr/local/nvidia to /var/lib/nvidia?) would be good to have in Flatcar

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, returning the /usr/local/nvidia symlink to this sysext for now. We should investigate a more systematic approch.

@danzatt danzatt force-pushed the remove-nvidia-runtime-symlink branch 2 times, most recently from 6ed20c8 to 89e264b Compare May 16, 2025 13:37
danzatt added a commit to flatcar/scripts that referenced this pull request May 16, 2025
Add prebuilt NVIDIA drivers in a sysext

- Add capability to specify per-sysext USE flags and compile different versions of upstream portage nvidia-drivers (including open and non-open variants).
- Allow architecture-specific OS-dependent sysexts
- Pull `nvidia-drivers` from portage and build sysexts from the package

Related PRs:
NVIDIA tests using sysext: [mantle #598](flatcar/mantle#598)
NVIDIA runtime modifications to remove `nvidia-smi` symlink: [sysext-bakery #153](flatcar/sysext-bakery#153)
The sysext contains /opt/bin/nvidia-smi -> /usr/bin/nvidia-smi symlink.
This does not play well with the prebuilt sysext, as it already ships
the binaries in /usr/bin. Create the symlink dynamically, only when the
prebuilt sysext is not loaded.

Signed-off-by: Daniel Zatovic <daniel.zatovic@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants