Summary
On Linux hosts with an AMD GPU present, each call into the public API leaks a single file descriptor that is never released. The following all leak one fd per call on the reporter's system:
- AllSmi::new()
- get_gpu_readers()
- AmdGpuReader::default() / AmdGpuReader::new()
With a typical per-process fd limit of 1024, repeatedly instantiating these (~1020 times) exhausts the limit and crashes the process. These descriptors should be released on drop.
Reported by @joshhansen with a minimal reproduction: https://github.com/joshhansen/allsmi-libamdgpu_top-bug (run cargo run --release). Reproduced on commit d5b678d.
Background
The leak originates in the upstream libamdgpu_top crate, not in all-smi's own code. DevicePath opened the DRM device via into_raw_fd(), which transfers ownership away from the RAII wrapper so the descriptor is never closed. Every new AmdGpuReader (constructed through get_gpu_readers() → AllSmi::new()) opens a fresh device handle, so file descriptors accumulate one per instantiation.
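The ownership problem described above can be illustrated with the standard library alone. This is a minimal sketch, not the upstream code: `open_fd_count`, `open_leaky`, and `open_owned` are hypothetical helper names, and counting entries in /proc/self/fd assumes a Linux host.

```rust
use std::fs::{self, File};
use std::io;
use std::os::fd::{IntoRawFd, OwnedFd};

/// Counts this process's open descriptors by listing /proc/self/fd (Linux only).
/// The directory handle used for the listing is included, but consistently so,
/// which makes differences between two calls meaningful.
fn open_fd_count() -> io::Result<usize> {
    Ok(fs::read_dir("/proc/self/fd")?.count())
}

/// Opens `path` `n` times and calls `into_raw_fd()` on each handle.
/// The raw integer is discarded, so nothing ever closes the descriptor.
fn open_leaky(path: &str, n: usize) -> io::Result<()> {
    for _ in 0..n {
        let _raw = File::open(path)?.into_raw_fd();
    }
    Ok(())
}

/// Opens `path` `n` times but keeps ownership in an `OwnedFd`,
/// which closes the descriptor when it is dropped at the end of each iteration.
fn open_owned(path: &str, n: usize) -> io::Result<()> {
    for _ in 0..n {
        let _fd: OwnedFd = File::open(path)?.into();
    }
    Ok(())
}
```

Comparing `open_fd_count()` before and after `open_leaky("/dev/null", 10)` should show ten additional descriptors, while `open_owned` should leave the count unchanged.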
all-smi currently exact-pins libamdgpu_top = "=0.11.4" (Cargo.toml:76). That pin was added in #207 (closing #205) because 0.11.4 renamed get_all_proc_usage → update_proc_usage in a patch release, violating semver and breaking caret resolution on fresh installs. The pin comment explicitly notes it must be re-evaluated when bumping.
Upstream issue: Umio-Yasuno/amdgpu_top#163
Proposed Solution
Bump the exact pin to libamdgpu_top = "=0.11.5".
Version 0.11.5 (published 2026-05-18) fixes the leak in upstream commit 8ade0d5: DevicePath now caches an Arc<OwnedFd> in a OnceLock<Arc<OwnedFd>> and get_fd() returns the cached RawFd, so the descriptor is owned and closed via RAII on drop instead of leaked.
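The caching pattern described above might look roughly like the following. This is an illustrative sketch under assumptions, not the upstream implementation: `CachedDevice` is a hypothetical stand-in type, and this `get_fd` returns io::Result<RawFd> so the sketch can surface open errors, which differs from the upstream signature mentioned in this issue.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsRawFd, OwnedFd, RawFd};
use std::path::PathBuf;
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a device-path type; not the upstream struct.
struct CachedDevice {
    path: PathBuf,
    fd: OnceLock<Arc<OwnedFd>>,
}

impl CachedDevice {
    fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into(), fd: OnceLock::new() }
    }

    /// Opens the file on first use and returns the cached raw descriptor afterwards.
    /// The descriptor stays owned by `self.fd`, so it is closed when the last
    /// `Arc` is dropped.
    fn get_fd(&self) -> io::Result<RawFd> {
        if let Some(fd) = self.fd.get() {
            return Ok(fd.as_raw_fd());
        }
        let owned: OwnedFd = File::open(&self.path)?.into();
        // If another thread initialised the cell first, the closure is never
        // run and `owned` is dropped (and closed) with it.
        let fd = self.fd.get_or_init(|| Arc::new(owned));
        Ok(fd.as_raw_fd())
    }
}
```

Calling `get_fd()` twice on the same value returns the same descriptor number, and dropping the value closes it.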
Implementation Notes
- In Cargo.toml:76, change libamdgpu_top = "=0.11.4" → "=0.11.5"; update the adjacent comment (currently explaining the 0.11.4 rename rationale) to also note the fd-leak fix and that 0.11.5 retains the update_proc_usage API.
- Regenerate the lockfile: cargo update -p libamdgpu_top --precise 0.11.5.
- No source changes expected:
  - all-smi never calls the changed get_fd() API (verified: no get_fd/into_raw_fd usage anywhere in src/), so the upstream signature change io::Result<RawFd> → RawFd does not affect our call sites.
  - The only AMD API call site, update_proc_usage at src/device/readers/amd.rs:572, was introduced in 0.11.4 and is retained in 0.11.5.
  - Confirm with cargo build --release and cargo clippy on a Linux glibc target.
- Severity context: all-smi's own view/api binaries construct readers once, outside the collection loop (src/api/collection_loop.rs:60) and behind the guarded one-time init (src/view/data_collection/local_collector.rs:139), so the CLI leaks at most one fd per AMD reader, once. The unbounded leak primarily affects consumers of the public library API (AllSmi::new, get_gpu_readers, AmdGpuReader) that re-instantiate repeatedly, which is the reporter's scenario.
- Build matrix scope: the dependency only applies to cfg(all(target_os = "linux", not(target_env = "musl"))); musl/static and non-Linux targets are unaffected.
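For library consumers who want to check the bump locally, a small helper along these lines could measure fd growth across repeated construction. It is a sketch under assumptions: `fd_growth` and `open_fd_count` are hypothetical names, the measurement relies on /proc/self/fd and so is Linux-only, and the constructor is passed as a closure because the all-smi import paths are not shown in this issue.

```rust
use std::fs;
use std::io;

/// Counts this process's open descriptors by listing /proc/self/fd (Linux only).
fn open_fd_count() -> io::Result<usize> {
    Ok(fs::read_dir("/proc/self/fd")?.count())
}

/// Runs `construct` `iterations` times, dropping each result immediately,
/// and returns how many more descriptors are open afterwards than before.
fn fd_growth<T>(iterations: usize, mut construct: impl FnMut() -> T) -> io::Result<isize> {
    let before = open_fd_count()? as isize;
    for _ in 0..iterations {
        drop(construct());
    }
    Ok(open_fd_count()? as isize - before)
}
```

Passing a closure that builds one of the affected types (for example one that calls AllSmi::new()) would be expected to return a growth close to the iteration count on 0.11.4 and a small constant on 0.11.5. Keep the iteration count below the process fd limit when testing the leaking version.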
Acceptance Criteria
- libamdgpu_top pinned to =0.11.5 in Cargo.toml, with the explanatory comment updated.
- Cargo.lock regenerated to 0.11.5.
- cargo build --release and cargo clippy pass on Linux glibc (AMD code path compiles; update_proc_usage call site unchanged).
- Repeatedly calling AllSmi::new() / get_gpu_readers() / AmdGpuReader::default() (>1020 iterations) no longer grows the process fd count without bound and no longer crashes, verified against the reporter's reproduction.
- view/api modes still report AMD GPU info correctly.
Original Suggestion
Title: AllSmi::new, get_gpu_readers, and AmdGpuReader::new leak file descriptors on Linux with AMD GPU present
A minimal reproduction can be seen here: https://github.com/joshhansen/allsmi-libamdgpu_top-bug
Just run cargo run --release.
This appears to stem ultimately from crate libamdgpu_top --- I've filed an issue there but wanted to make this project aware.
These three lines of code invoking the allsmi API each appear to leak a single file descriptor on my system:
AllSmi::new().unwrap();
get_gpu_readers();
AmdGpuReader::default()
Since my per-process file descriptor limit is 1024, all it takes to trigger this is to instantiate AllSmi 1020 times. (The process's own fds do the rest.)
It should be expected that these would be cleaned up on drop.
Thanks