Add nvidia-drivers sysext #2798

Merged
danzatt merged 13 commits into main from danzatt/nvidia-drivers-signing
May 16, 2025
Contributor

@danzatt danzatt commented Mar 31, 2025

Add prebuilt NVIDIA drivers in a sysext

  • Add the capability to specify per-sysext USE flags and compile different versions of the upstream Portage nvidia-drivers package (including open and non-open variants).
  • Allow architecture-specific, OS-dependent sysexts.
  • Pull nvidia-drivers from Portage and build sysexts from the package.

Related PRs:
NVIDIA tests using sysext: mantle #598
NVIDIA runtime modifications to remove nvidia-smi symlink: sysext-bakery #153

How to use

Testing done

Created a build in Jenkins and tested using kola on Azure instances with NVIDIA GPUs

  • [*] Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 7a108d6 to 283f117 Compare April 1, 2025 19:36
Comment on lines +88 to +89
=x11-drivers/nvidia-drivers-550.144.03 ~arm64
=x11-drivers/nvidia-drivers-535.230.02 ~arm64
Member

So slight complication here... I believe only 570 builds for arm64 with the kernel+gcc combination we're using - at least that's what I found when adding support for it to nvidia drivers recently.

How about disabling the nvidia sysexts for arm64 for the time being, to focus on getting amd64 in shape?

Contributor Author

@danzatt danzatt Apr 3, 2025

Ah, right. I was wondering why the ARM builds always fail with compilation errors. I'll disable it for now, which reminds me that there currently isn't a way to make a sysext for amd64 only. I might need to add another field to the sysext specifier, or maybe just skip building the sysext if the package is masked.
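
The "another field" idea above can be sketched as an architecture allow-list checked before a sysext is built. This is only an illustration in Python, not the code from this PR: the function name `should_build`, the comma-separated field format, and the convention that an empty field means "build everywhere" are all assumptions.

```python
def should_build(arches_field: str, target_arch: str) -> bool:
    """Return True if a sysext should be built for target_arch.

    Assumed conventions (not taken from the PR): the field is a
    comma-separated allow-list, and an empty field means no restriction.
    """
    allowed = [a.strip() for a in arches_field.split(",") if a.strip()]
    return not allowed or target_arch in allowed


# Hypothetical usage: an nvidia sysext restricted to amd64.
for arch in ("amd64", "arm64"):
    print(arch, should_build("amd64", arch))
```

The alternative mentioned in the comment, skipping the sysext when the package is masked, would need no extra field but would hide the restriction inside the package manager's configuration instead of the sysext list.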

@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 283f117 to edfa185 Compare April 3, 2025 12:45
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from edfa185 to 3f23656 Compare April 3, 2025 12:48
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 3f23656 to b2d0e80 Compare April 7, 2025 13:49
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from b2d0e80 to 4147f95 Compare April 10, 2025 12:58
@tormath1 tormath1 added the main label Apr 10, 2025
@tormath1 tormath1 moved this to ✅ Testing / in Review in Flatcar tactical, release planning, and roadmap Apr 10, 2025
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 4147f95 to 4dc74ff Compare April 23, 2025 11:15
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 4dc74ff to 692dd85 Compare April 23, 2025 12:38
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 692dd85 to f524441 Compare April 29, 2025 11:30
@danzatt danzatt requested a review from chewi April 29, 2025 11:32
Contributor Author

danzatt commented May 12, 2025

@chewi thanks for the review! I've just run shellcheck on this PR and hopefully fixed all instances of the bug.

Contributor

chewi commented May 12, 2025

It looks good, but something has gone wrong with your rebase.

Member

t-lo commented May 13, 2025

Happens to me too occasionally. Time for the good old "make a diff to main and add a single patch on top of a new pristine branch forked from main"?

danzatt added 12 commits May 13, 2025 11:26
  • From Gentoo commit 9fbc01bc73344c66498d3a3ccbf4fff5be10219b.
  • Accept the NVIDIA-r2 license and don't build the NVIDIA tools.
  • Don't build NVIDIA drivers when the flatcar-nvidia-drivers sysext is loaded, only load the prebuilt modules. Also make the nvidia.service run after the sysexts are merged. Otherwise, it might start building the modules and conflict with the prebuilt drivers sysext.
  • To be able to use the SLOT syntax, which uses ":", we need to change the sysext separator to "|".
  • Needed for the nvidia-persistenced daemon. It's from Gentoo commit e36ce47183552f9fc23556492d70ab4dc5f11e81.
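
The separator change in the commit notes can be illustrated with a small Python sketch. A package atom with a SLOT contains ":", so splitting an entry on ":" cuts the atom apart, while "|" leaves it whole. The `SysextSpec` type, the `parse_sysext_spec` helper, the `name|packages|USE flags` field layout, and the example entry are all hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class SysextSpec:
    """One sysext entry; this field layout is an assumption, not the PR's."""
    name: str
    packages: list[str]
    use_flags: list[str] = field(default_factory=list)


def parse_sysext_spec(spec: str, sep: str = "|") -> SysextSpec:
    """Split 'name|pkg[,pkg...]|USE flags' into its parts."""
    parts = spec.split(sep)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"malformed sysext spec: {spec!r}")
    name, packages = parts[0], parts[1].split(",")
    use_flags = parts[2].split() if len(parts) > 2 else []
    return SysextSpec(name, packages, use_flags)


# Hypothetical entry: the package atom carries a SLOT after ':'.
spec = parse_sysext_spec(
    "nvidia-drivers-535|x11-drivers/nvidia-drivers:0/535|-kernel-open"
)
print(spec.packages)  # the ':' stays inside the package atom

# With ':' as the separator, the same entry would split inside the atom.
print("nvidia-drivers-535:x11-drivers/nvidia-drivers:0/535".split(":"))
```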
@danzatt danzatt force-pushed the danzatt/nvidia-drivers-signing branch from 8e6cb31 to a96ba6d Compare May 13, 2025 09:30
Contributor Author

danzatt commented May 13, 2025

I just rebased on main and force-pushed, seems to be OK now.

@github-project-automation github-project-automation bot moved this from ⚒️ In Progress to ✅ Testing / in Review in Flatcar tactical, release planning, and roadmap May 13, 2025
@ader1990
Contributor

Hello, I have just one comment on the naming, specifically "Rename nvidia-drivers to nvidia-drivers-service": when I started reading, I expected to see a systemd service implementation in which a systemd nvidia-drivers unit was renamed to nvidia-drivers-service. Maybe rename it to something that does not contain the suffix "service"? nvidia-drivers seemed more suitable; is there any reason to rename it to nvidia-drivers-service?

@ader1990
Contributor

Okay, so I understand: the Flatcar package name was changed and not the underlying service name, which was left as is for backwards compatibility. It was a little confusing, as I expected a service name change too. I suppose the name can be left as nvidia-drivers-service or changed to nvidia-drivers-systemd-unit (to make the intent clearer).

Contributor Author

danzatt commented May 13, 2025

For reference, there are also related open PRs for kola and sysext-bakery.

@danzatt danzatt merged commit 8dae992 into main May 16, 2025
2 of 4 checks passed
@github-project-automation github-project-automation bot moved this from ✅ Testing / in Review to Implemented in Flatcar tactical, release planning, and roadmap May 16, 2025
8 participants