Add extended hardware details (NUMA, GSP firmware, NvLink remote device) #132

## Problem / Background

The `nvml-wrapper` crate v0.12.0 introduced several new hardware detail APIs that are not yet exposed in all-smi:

- **NUMA node ID** (`numa_node_id`) — identifies the NUMA node a GPU is attached to
- **GSP firmware mode and version** (`gsp_firmware_mode`, `gsp_firmware_version`) — reports GPU System Processor firmware state
- **NvLink remote device type** (`remote_device_type`) — identifies what is connected on the other end of each NvLink
- **GPU Performance Monitoring (GPM)** — fine-grained SM occupancy and other performance counters

These details are valuable for topology-aware scheduling, firmware auditing, and interconnect diagnostics in multi-GPU / multi-node environments.

## Goal

Expose detailed hardware topology and firmware information so operators can inspect NUMA placement, GSP firmware health, NvLink interconnect topology, and GPU performance counters from a single tool.

## Scope

- [x] Read NUMA node ID per GPU for topology-aware monitoring
- [x] Read GSP firmware mode (enabled / disabled / default) and version string
- [x] Read NvLink remote device type for each active link (GPU, CPU/host bridge, NvSwitch, etc.)
- [x] Integrate GPU Performance Monitoring (GPM) metrics where available (e.g., SM occupancy, memory bandwidth utilization). Support detection and metric plumbing are complete; the two-sample handshake is deferred to a follow-up (see `collect_gpm_metrics` in `src/device/readers/nvidia_hardware.rs`)
- [x] Display hardware details in TUI info/detail view (e.g., a dedicated "Hardware Details" section or tab)
- [x] Export hardware detail metrics in Prometheus format (`all_smi_numa_node_id`, `all_smi_gsp_firmware_mode`, `all_smi_nvlink_remote_device_type`, GPM gauges, etc.)
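
To illustrate the export item above, here is a minimal sketch of rendering optional hardware details as Prometheus gauges. Only the metric names `all_smi_numa_node_id` and `all_smi_gsp_firmware_mode` come from this issue. The `HardwareDetails` struct, the `gpu_index` and `uuid` labels, the help strings, and the numeric encoding of the GSP mode are assumptions made for illustration and may not match the real exporter.

```rust
use std::fmt::Write;

/// Hypothetical per-GPU snapshot (not all-smi's real type).
pub struct HardwareDetails {
    pub index: u32,
    pub uuid: String,
    /// `None` when the platform does not report a NUMA node.
    pub numa_node_id: Option<u32>,
    /// Assumed encoding: 0 = disabled, 1 = enabled, 2 = default.
    pub gsp_firmware_mode: Option<u32>,
}

/// Escape a label value per the Prometheus text exposition format.
fn escape_label(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Write one gauge family; emit nothing at all when there are no samples.
fn push_gauge(out: &mut String, name: &str, help: &str, samples: &[(String, u32)]) {
    if samples.is_empty() {
        return;
    }
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} gauge");
    for (labels, value) in samples {
        let _ = writeln!(out, "{name}{{{labels}}} {value}");
    }
}

/// Render the metrics, skipping GPUs whose value is unavailable.
pub fn render_hardware_metrics(gpus: &[HardwareDetails]) -> String {
    let labels = |g: &HardwareDetails| {
        format!("gpu_index=\"{}\",uuid=\"{}\"", g.index, escape_label(&g.uuid))
    };
    let numa: Vec<(String, u32)> = gpus
        .iter()
        .filter_map(|g| g.numa_node_id.map(|v| (labels(g), v)))
        .collect();
    let gsp: Vec<(String, u32)> = gpus
        .iter()
        .filter_map(|g| g.gsp_firmware_mode.map(|v| (labels(g), v)))
        .collect();

    let mut out = String::new();
    push_gauge(&mut out, "all_smi_numa_node_id", "NUMA node the GPU is attached to", &numa);
    push_gauge(&mut out, "all_smi_gsp_firmware_mode", "GSP firmware mode", &gsp);
    out
}
```

Grouping samples per metric family keeps each `# HELP` / `# TYPE` header unique, and dropping empty families means an unsupported field simply does not appear in the scrape output.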

## Technical Considerations

- These APIs are NVIDIA-specific; guard behind the existing NVIDIA GPU reader path.
- `numa_node_id` may return an error on platforms without NUMA support — handle gracefully.
- GSP firmware APIs may not be available on older driver versions — degrade gracefully and omit the metric.
- NvLink enumeration should iterate over all supported links and skip inactive ones.
- GPM support depends on GPU architecture (Hopper+); detect availability before querying.
- Ensure mock server generates plausible values for new fields so TUI and API tests work without real hardware.
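
The degradation and NvLink points above can be sketched as follows. The `GpuProbe` trait is a hypothetical stand-in for the NVML calls so that the example compiles without the `nvml-wrapper` crate; its method names, `ProbeError`, `RemoteDevice`, and `FakeProbe` are illustrative and are not the crate's API. The idea is that each optional query turns an error into `None`, and link enumeration skips anything that is not reported as active.

```rust
/// Hypothetical error type standing in for the driver's error enum.
#[derive(Debug, Clone, PartialEq)]
pub enum ProbeError {
    NotSupported,
    Other(String),
}

/// Hypothetical classification of what sits on the far end of a link.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RemoteDevice {
    Gpu,
    Cpu,
    Switch,
}

/// Hypothetical abstraction over the per-device queries (assumed names).
pub trait GpuProbe {
    fn numa_node_id(&self) -> Result<u32, ProbeError>;
    fn gsp_firmware_version(&self) -> Result<String, ProbeError>;
    fn link_count(&self) -> u32;
    fn link_is_active(&self, link: u32) -> Result<bool, ProbeError>;
    fn link_remote_device(&self, link: u32) -> Result<RemoteDevice, ProbeError>;
}

#[derive(Debug, PartialEq)]
pub struct Details {
    pub numa_node_id: Option<u32>,
    pub gsp_firmware_version: Option<String>,
    pub links: Vec<(u32, RemoteDevice)>,
}

/// Collect details without failing: errors become `None`, inactive links are skipped.
pub fn collect(probe: &dyn GpuProbe) -> Details {
    let links = (0..probe.link_count())
        // Treat "inactive" and "query failed" the same way: skip the link.
        .filter(|&link| matches!(probe.link_is_active(link), Ok(true)))
        .filter_map(|link| probe.link_remote_device(link).ok().map(|dev| (link, dev)))
        .collect();
    Details {
        numa_node_id: probe.numa_node_id().ok(),
        gsp_firmware_version: probe.gsp_firmware_version().ok(),
        links,
    }
}

/// Fake device for tests: no NUMA support, link 1 inactive.
pub struct FakeProbe;

impl GpuProbe for FakeProbe {
    fn numa_node_id(&self) -> Result<u32, ProbeError> {
        Err(ProbeError::NotSupported)
    }
    fn gsp_firmware_version(&self) -> Result<String, ProbeError> {
        Ok("fake-1.0".to_string())
    }
    fn link_count(&self) -> u32 {
        3
    }
    fn link_is_active(&self, link: u32) -> Result<bool, ProbeError> {
        Ok(link != 1)
    }
    fn link_remote_device(&self, link: u32) -> Result<RemoteDevice, ProbeError> {
        match link {
            0 => Ok(RemoteDevice::Gpu),
            2 => Ok(RemoteDevice::Switch),
            _ => Err(ProbeError::NotSupported),
        }
    }
}
```

With `FakeProbe`, `collect` yields no NUMA node, the fake firmware version, and links 0 and 2 only. Errors are dropped silently here; a real reader might log the first failure once at debug level instead.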

## Acceptance Criteria

- [x] NUMA node ID is collected and displayed per GPU in both TUI and Prometheus output
- [x] GSP firmware mode and version are collected and displayed per GPU
- [x] NvLink remote device type is collected and displayed per link per GPU
- [x] GPM metrics are collected and exported when the GPU supports them (support detection and the mock, parser, and exporter paths are complete; live two-sample collection is a follow-up tracked in the `collect_gpm_metrics` docstring)
- [x] All new fields degrade gracefully (no crash, no error log spam) on unsupported hardware/drivers
- [x] Mock server provides representative test data for all new fields
- [x] Prometheus metric names follow existing naming conventions (`all_smi_*`)
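
For the deferred two-sample GPM collection mentioned above, the sketch below shows the general shape of a collector that needs two snapshots before it can report a value. It is not the `collect_gpm_metrics` implementation. Real GPM samples are opaque and interpreted by the driver; the `RawSample` fields and the `GpmCollector` type are hypothetical and only illustrate why the first poll cannot produce a metric.

```rust
use std::time::Duration;

/// Hypothetical snapshot of cumulative counters (illustrative fields only).
#[derive(Debug, Clone, Copy)]
pub struct RawSample {
    pub timestamp: Duration,
    pub sm_active_cycles: u64,
    pub sm_total_cycles: u64,
}

/// Derive an occupancy percentage from the delta between two snapshots.
/// Returns `None` if the samples are out of order or the counters regressed.
pub fn sm_occupancy_percent(prev: &RawSample, curr: &RawSample) -> Option<f64> {
    if curr.timestamp <= prev.timestamp {
        return None;
    }
    let active = curr.sm_active_cycles.checked_sub(prev.sm_active_cycles)?;
    let total = curr.sm_total_cycles.checked_sub(prev.sm_total_cycles)?;
    if total == 0 {
        return None;
    }
    Some(100.0 * active as f64 / total as f64)
}

/// Keeps the previous snapshot between polls.
pub struct GpmCollector {
    previous: Option<RawSample>,
}

impl GpmCollector {
    pub fn new() -> Self {
        GpmCollector { previous: None }
    }

    /// Feed a new snapshot; yields a value only once a usable pair exists.
    pub fn observe(&mut self, sample: RawSample) -> Option<f64> {
        let result = self
            .previous
            .as_ref()
            .and_then(|prev| sm_occupancy_percent(prev, &sample));
        self.previous = Some(sample);
        result
    }
}
```

Returning `None` on the first poll lets the exporter omit the gauge until a second sample arrives, which matches the degrade-gracefully criterion instead of reporting a misleading zero.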
