[Debian 13] vainfo fails: can't open /dev/dri/render128 - invalid argument #351
Description
I am running X11 on Debian Trixie (kernel 6.12.9) on my desktop with an RTX 3090 GPU and a very old AMD Ryzen Threadripper 1950X CPU.
lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux trixie/sid
Release: n/a
Codename: trixie
uname -a
Linux debian 6.12.9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.9-1 (2025-01-10) x86_64 GNU/Linux
echo $XDG_SESSION_TYPE
x11
I have installed Nvidia 565 drivers according to the following guides:
Nvidia driver installation: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/
Nvidia cuda installation: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/
The gist of the installation process is:
sudo apt-get install -V nvidia-open
sudo apt-get install -V cuda-drivers
sudo apt-get install -V cuda-toolkit
My Nvidia driver seems to be working: nvidia-smi prints normally, and here are my graphics settings:
inxi -Ga
Graphics:
Device-1: NVIDIA GA102 [GeForce RTX 3090] vendor: eVga.com. driver: nvidia
v: 565.57.01 alternate: nouveau,nvidia_drm non-free: 550.xx+ status: current
(as of 2024-09; EOL~2026-12-xx) arch: Ampere code: GAxxx
process: TSMC n7 (7nm) built: 2020-2023 pcie: gen: 1 speed: 2.5 GT/s
lanes: 8 link-max: gen: 4 speed: 16 GT/s lanes: 16 ports: active: none
off: HDMI-A-1 empty: DP-1,DP-2,DP-3 bus-ID: 42:00.0 chip-ID: 10de:2204
class-ID: 0300
Device-2: Logitech C922 Pro Stream Webcam driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 3-4:3
chip-ID: 046d:085c class-ID: 0102 serial: DD741E8F
Display: x11 server: X.Org v: 21.1.15 with: Xwayland v: 24.1.4
compositor: gnome-shell v: 47.2 driver: X: loaded: fbdev,nouveau
unloaded: modesetting,vesa alternate: nv dri: swrast
gpu: nvidia,nvidia-nvswitch display-ID: :1 screens: 1
Screen-1: 0 s-res: 3840x2160 s-dpi: 96 s-size: 1016x572mm (40.00x22.52")
s-diag: 1166mm (45.9")
Monitor-1: HDMI-A-1 mapped: default note: disabled model: Dell S2817Q
serial: MTKT17AK960I built: 2017 res: 3840x2160 gamma: 1.2
diag: 708mm (27.9") ratio: 16:9 modes: max: 3840x2160 min: 640x480
API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 2
drv: swrast surfaceless: drv: nvidia x11: drv: swrast
inactive: gbm,wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: mesa v: 24.3.3-1 glx-v: 1.4
direct-render: yes renderer: llvmpipe (LLVM 19.1.6 256 bits)
device-ID: ffffffff:ffffffff memory: 45.92 GiB unified: yes
ffmpeg -encoders 2>/dev/null | grep nvenc
V....D av1_nvenc NVIDIA NVENC av1 encoder (codec av1)
V....D h264_nvenc NVIDIA NVENC H.264 encoder (codec h264)
V....D hevc_nvenc NVIDIA NVENC hevc encoder (codec hevc)
One odd thing about my Nvidia installation: even though I only have one GPU, the Nvidia driver seems to be using Optimus-style (PRIME render offload) management. At least, I can't launch Firefox on the Nvidia GPU without setting __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia; otherwise it defaults to software rendering on the CPU with Mesa and llvmpipe. My 3090 is the only GPU I am using, so ideally I want everything (including X11) to run on it, but I can't get that to work.
Setting __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia also changes the inxi -Ga output so that the OpenGL API is reported as nvidia instead of Mesa/llvmpipe:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia inxi -Ga
Graphics:
Device-1: NVIDIA GA102 [GeForce RTX 3090] vendor: eVga.com. driver: nvidia
v: 565.57.01 alternate: nouveau,nvidia_drm non-free: 550.xx+ status: current
(as of 2024-09; EOL~2026-12-xx) arch: Ampere code: GAxxx
process: TSMC n7 (7nm) built: 2020-2023 pcie: gen: 1 speed: 2.5 GT/s
lanes: 8 link-max: gen: 4 speed: 16 GT/s lanes: 16 ports: active: none
off: HDMI-A-1 empty: DP-1,DP-2,DP-3 bus-ID: 42:00.0 chip-ID: 10de:2204
class-ID: 0300
Device-2: Logitech C922 Pro Stream Webcam driver: snd-usb-audio,uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 3-4:3
chip-ID: 046d:085c class-ID: 0102 serial: DD741E8F
Display: x11 server: X.Org v: 21.1.15 with: Xwayland v: 24.1.4
compositor: gnome-shell v: 47.2 driver: X: loaded: fbdev,nouveau
unloaded: modesetting,vesa alternate: nv gpu: nvidia,nvidia-nvswitch
display-ID: :1 screens: 1
Screen-1: 0 s-res: 3840x2160 s-dpi: 96 s-size: 1016x572mm (40.00x22.52")
s-diag: 1166mm (45.9")
Monitor-1: HDMI-A-1 mapped: default note: disabled model: Dell S2817Q
serial: MTKT17AK960I built: 2017 res: 3840x2160 gamma: 1.2
diag: 708mm (27.9") ratio: 16:9 modes: max: 3840x2160 min: 640x480
API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 2
drv: swrast surfaceless: drv: nvidia x11: drv: swrast
inactive: gbm,wayland,device-1
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 565.57.01
glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 3090/PCIe/SSE2
memory: 23.44 GiB
Anyway, I was trying to get hardware-accelerated video decoding working in Firefox. I installed nvidia-vaapi-driver v0.0.13, but performance is still poor. Launching Firefox with the recommended about:config settings and some additional environment variables I found online gives an EGL error and a VA-API error:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/10_nvidia.json LIBVA_DRIVER_NAME=nvidia LIBVA_DEVICE=/dev/dri/renderD128 VDPAU_DRIVER=nvidia NVD_BACKEND=direct NVD_GPU="/dev/dri/renderD128" MOZ_ENABLE_WAYLAND=0 MOZ_DISABLE_RDD_SANDBOX=1 MOZ_DRM_DEVICE=/dev/dri/renderD128 NVD_LOG=1 firefox
[GFX1-]: glxtest: libEGL no display
[GFX1-]: vaapitest: ERROR
[GFX1-]: vaapitest: VA-API test failed: failed to open renderDeviceFD.
So I don't think VA-API is working correctly. When I watch a video in Firefox, it's choppy, and while nvtop does list Firefox, it doesn't show any decoder (dec) utilization (which it does when I watch a local video with VLC).
So I installed vainfo and tried to debug it, but vainfo doesn't work either:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia NVD_LOG=1 LIBVA_DRIVER_NAME=nvidia VDPAU_DRIVER=nvidia vainfo
Trying display: wayland
Trying display: x11
libva info: VA-API version 1.22.0
libva error: vaGetDriverNames() failed with unknown libva error
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
45673.974193197 [68826-68826] ../src/vabackend.c:2187 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 10
45673.974200250 [68826-68826] ../src/vabackend.c:2196 __vaDriverInit_1_0 Now have 0 (0 max) instances
45673.974210940 [68826-68826] ../src/vabackend.c:2222 __vaDriverInit_1_0 Selecting Direct backend
45674.006006810 [68826-68826] ../src/direct/direct-export-buf.c: 68 direct_initExporter Searching for GPU: 0 0 128
45674.006151956 [68826-68826] ../src/direct/direct-export-buf.c: 72 direct_initExporter Unable to find NVIDIA GPU 0
45674.006161053 [68826-68826] ../src/vabackend.c:2247 __vaDriverInit_1_0 Exporter failed
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit
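Since the log says "Unable to find NVIDIA GPU 0" while searching for minor 128, and inxi lists nouveau among the loaded X drivers, one cheap check is which kernel driver is actually bound to each render node. The sketch below is my own helper, not part of nvidia-vaapi-driver. It assumes the usual sysfs layout where /sys/class/drm/renderD*/device/driver is a symlink to the bound driver; the function name render_node_drivers is made up for illustration.

```python
import glob
import os


def render_node_drivers(sysfs_root: str = "/sys/class/drm") -> dict[str, str]:
    """Map each renderD* node name to the kernel driver bound to its device."""
    drivers = {}
    for node in sorted(glob.glob(os.path.join(sysfs_root, "renderD*"))):
        link = os.path.join(node, "device", "driver")
        if os.path.exists(link):
            # 'driver' is a symlink to the bound driver's directory;
            # the last path component is the driver name.
            name = os.path.basename(os.path.realpath(link))
        else:
            name = "(none)"
        drivers[os.path.basename(node)] = name
    return drivers


for node, driver in render_node_drivers().items():
    print(f"/dev/dri/{node}: {driver}")
```

If renderD128 turns out to be bound to something other than the nvidia driver, or to nothing at all, that would be worth reporting alongside the log above.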
I modified the source code of direct-export-buf.c:72 to dig further, and can confirm a few things:
- node=/dev/dri/renderD128, which is the correct file path
- the file does exist. stat() returns the following struct: st_dev=6, st_ino=944, st_mode=8624, st_uid=0, st_gid=105, st_rdev=57984
- when open() returns fd == -1, this gives errno = 22 (Invalid argument)
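As a sanity check on those stat() values, here is a small sketch that decodes st_mode and st_rdev. The numbers are copied from my output above; the bit layout is the standard Linux/glibc dev_t encoding, and the helper names decode_rdev and describe are just for illustration.

```python
import stat

# Values copied from the stat() result reported above.
ST_MODE = 8624
ST_RDEV = 57984


def decode_rdev(rdev: int) -> tuple[int, int]:
    """Split a Linux dev_t into (major, minor) using the glibc encoding."""
    major = ((rdev >> 8) & 0xFFF) | ((rdev >> 32) & ~0xFFF)
    minor = (rdev & 0xFF) | ((rdev >> 12) & ~0xFF)
    return major, minor


def describe(mode: int, rdev: int) -> str:
    """Summarise file type, permission bits and device numbers."""
    major, minor = decode_rdev(rdev)
    kind = "character device" if stat.S_ISCHR(mode) else "not a character device"
    return f"{stat.filemode(mode)} {kind} {major}:{minor}"


print(describe(ST_MODE, ST_RDEV))  # crw-rw---- character device 226:128
```

So the node is a character device with major 226 (DRM) and minor 128, which matches the ls output and suggests the device node itself is not malformed.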
Here are the file properties of /dev/dri/renderD128:
ls -last /dev/dri/renderD128
0 crw-rw----+ 1 root render 226, 128 Jan 14 04:39 /dev/dri/renderD128
I searched online for Invalid Argument error and found this link: https://stackoverflow.com/questions/11055060/possible-reasons-of-linux-open-call-returning-einval
My current guess is that /dev/dri/renderD128 is created with some special async behavior that the glibc packaged with Debian 13 (kernel 6.12.9-amd64) is not able to open. So even though the file exists and stat() succeeds, open() fails. This might also explain why I can't get X11 to load with the Nvidia driver (the logs complained that it couldn't open some device properly).
I'm not sure where to go from here. How can we debug this behavior further? Any help would be appreciated. Firefox and Chrome are basically unusable for any video-related content, and I have probably spent over 40 hours on this already lol.
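To separate "open() fails for every process" from "open() fails only inside the VA-API driver", the same call can be reproduced outside the driver. This is a minimal sketch, not the driver's code: I have not checked which flags direct-export-buf.c passes, so the flag combinations below are assumptions, and try_open is a made-up helper name.

```python
import errno
import os


def try_open(path: str, flags: int) -> str:
    """Try to open path with the given flags; return 'ok' or the errno name."""
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


NODE = "/dev/dri/renderD128"

# Flag combinations are guesses, not taken from the driver source.
FLAG_SETS = {
    "O_RDONLY": os.O_RDONLY,
    "O_RDWR": os.O_RDWR,
    "O_RDWR|O_NONBLOCK": os.O_RDWR | os.O_NONBLOCK,
}

for label, flags in FLAG_SETS.items():
    print(f"{label:20s} -> {try_open(NODE, flags)}")
```

If this also reports EINVAL for every flag set, that would suggest the kernel driver behind the node is rejecting the open, rather than anything specific to nvidia-vaapi-driver. An EACCES instead would point at the render group or the ACL on the node.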