fix: stabilize GPU Freq display on idle Apple Silicon (#216) #217
The `Freq:` field in the `view` TUI's GPU List row flickered on near-idle Apple Silicon Macs, vanishing for a refresh cycle and shifting every field after it (`Pwr:` etc.) left. Two compounding bugs were at play.
Bug A: data layer (`src/device/macos_native/ioreport.rs`). `calc_gpu_freq_with_table` and `calc_freq_from_residencies` return `(0, 0.0)` whenever the GPU/cluster is present (`total_residency > 0`) but spent the entire 100 ms sampling window in `IDLE`/`OFF`/`DOWN` states. That is the normal condition for an idle Apple GPU: it isn't running at 0 Hz, it's parked at its lowest P-state (~350 MHz on an M1). The 0 then propagates all the way to `GpuInfo.frequency`.
Fix: when active residency is zero but the cluster is present, fall back to the parked clock instead of 0. `calc_gpu_freq_with_table` uses `freq_table[0]` (the lowest entry of the IOKit pmgr table, which is sorted ascending). `calc_freq_from_residencies` tracks the minimum parseable active-state frequency during its scan and uses that, covering both the GPU fallback path (no pmgr table available) and CPU clusters via `process_cpu_channel`. When neither signal is available (`freq_table` empty / no parseable active states), `(0, 0.0)` is still returned and the renderer substitutes N/A.
Bug B: render layer (`src/ui/renderers/gpu_renderer.rs`). The entire `Freq:` label+value block was gated on `info.frequency > 0`, so any sample where the data layer reported 0 made the field, and the layout of every field after it, vanish for that cycle.
Fix: mirror the existing `Util:` / `VRAM:` / `Temp:` patterns in the same function. Always print the `Freq:` label, and substitute `N/A` when the value is 0. This is cross-platform: readers that statically report `frequency: 0` (Rebellions, Intel Gaudi, AMD via WMI) will now show ` Freq: N/A` instead of hiding the field, which makes row layout consistent across all device types. NVIDIA / Jetson rarely hit 0 (NVML reports a stable idle graphics clock; sysfs `cur_freq` is always readable when accessible), so they are unaffected in practice.
Tests. Adds four new unit tests in `ioreport.rs` covering the new idle-state behaviour: all-idle with no parseable states (returns 0), all-idle with known states at zero residency (returns the min parsed frequency), fully-idle GPU with a non-empty freq table (returns the lowest P-state), and fully-idle GPU with an empty freq table (returns 0). The existing `test_calc_freq_from_residencies` and `test_calc_gpu_freq_with_table` tests exercise the active path and are unchanged.
Closes #216
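The residency-scan fallback described above can be illustrated with a minimal sketch. This is not the code from `ioreport.rs`: the function name `freq_from_residencies`, the `(state name, ticks)` input shape, the convention that active state names parse as a decimal MHz value, and the reading of the second tuple element as an active fraction are all assumptions made for illustration.

```rust
/// Assumption: idle states are named exactly IDLE / OFF / DOWN.
fn is_idle_state(name: &str) -> bool {
    matches!(name, "IDLE" | "OFF" | "DOWN")
}

/// Assumption: an active state's name is its clock in MHz, e.g. "396".
/// The real state-name format is not shown in this PR.
fn parse_state_mhz(name: &str) -> Option<u32> {
    name.trim().parse().ok()
}

/// Hypothetical stand-in for `calc_freq_from_residencies`.
/// Returns (frequency in MHz, active fraction of the sampling window).
fn freq_from_residencies(samples: &[(&str, u64)]) -> (u32, f32) {
    let mut total: u64 = 0;
    let mut active: u64 = 0;
    let mut weighted: f64 = 0.0;
    let mut min_mhz: Option<u32> = None;

    for &(name, ticks) in samples {
        total += ticks;
        if is_idle_state(name) {
            continue;
        }
        let Some(mhz) = parse_state_mhz(name) else { continue };
        // Track the lowest parseable clock even when its residency is zero.
        min_mhz = Some(min_mhz.map_or(mhz, |m| m.min(mhz)));
        active += ticks;
        weighted += mhz as f64 * ticks as f64;
    }

    if total == 0 {
        return (0, 0.0); // cluster absent: genuinely no data
    }
    if active == 0 {
        // Present but fully idle: report the parked clock if one is known.
        return (min_mhz.unwrap_or(0), 0.0);
    }
    let avg = (weighted / active as f64).round() as u32;
    (avg, active as f32 / total as f32)
}
```

Under these assumptions, a window spent entirely in `IDLE` with known active states at zero residency yields the minimum parsed clock, while a window with no parseable states at all still yields `(0, 0.0)`.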
PR Finalization Complete
Tests
All 1,135 lib tests pass (999 fast + 136 doc/integration). The 4 new macOS-gated unit tests in `src/device/macos_native/ioreport.rs` cover all new idle-state branches exhaustively.
No additional tests are needed. macOS gating is correct for macOS-only code paths.
Documentation
Scanned `README.md`, `API.md`, `docs/man/all-smi.1`, and `docs/ARCHITECTURE.md`. No existing doc contained misleading statements about the `Freq:` field disappearing or returning 0 for idle Apple Silicon, so there was nothing to correct. No documentation changes were made.
Lint / Format
PR Body
Added a one-line OpenMetrics side effect note under Bug A's description: the `all_smi_gpu_frequency_mhz` gauge for idle Apple Silicon will now export the parked P-state clock (~350 MHz on M1) instead of 0, eliminating zero-gaps in Prometheus/Grafana dashboards. The rest of the body is unchanged.
Status
Ready to proceed to `status:done`.
Summary
Closes #216. The `Freq:` field in the `view` TUI's GPU List row was flickering on near-idle Apple Silicon Macs, vanishing for a refresh cycle and shifting every field after it (`Pwr:` etc.) left. Two compounding bugs were at play; this PR fixes both.
Bug A — data layer (Apple Silicon)
`src/device/macos_native/ioreport.rs`. `calc_gpu_freq_with_table` and `calc_freq_from_residencies` were returning `(0, 0.0)` whenever the GPU/cluster was present (`total_residency > 0`) but spent the entire 100 ms sampling window in `IDLE`/`OFF`/`DOWN` states, which is the normal condition for an idle Apple GPU. That 0 propagated all the way to `GpuInfo.frequency`. An idle Apple GPU is not running at 0 Hz; it is parked at its lowest P-state (~350 MHz on M1).
Fix: when active residency is zero but the cluster is present, fall back to the parked clock instead of 0.
OpenMetrics side effect: The `all_smi_gpu_frequency_mhz` gauge for idle Apple Silicon will now export the parked P-state clock (e.g. ~350 MHz on M1) instead of 0, eliminating misleading zero-gaps in Prometheus/Grafana dashboards during idle periods.
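For the table-based path, the fallback might look roughly like the sketch below. It is illustrative only: the name `gpu_freq_with_table`, the parameter layout (one idle tick count plus per-P-state tick counts aligned by index with an ascending `freq_table`), and the meaning of the second tuple element as an active fraction are assumptions, not the signature of `calc_gpu_freq_with_table`.

```rust
/// Hypothetical stand-in for the table-based GPU frequency calculation.
/// `active_ticks[i]` is assumed to be the residency of the P-state whose
/// clock is `freq_table[i]`, with `freq_table` sorted ascending in MHz.
fn gpu_freq_with_table(idle_ticks: u64, active_ticks: &[u64], freq_table: &[u32]) -> (u32, f32) {
    let mut active: u64 = 0;
    let mut weighted: f64 = 0.0;
    for (&ticks, &mhz) in active_ticks.iter().zip(freq_table) {
        active += ticks;
        weighted += ticks as f64 * mhz as f64;
    }

    let total = idle_ticks + active;
    if total == 0 {
        return (0, 0.0); // GPU not present in this sample
    }
    if active == 0 {
        // Present but parked: report the lowest P-state, or 0 when the
        // table is empty so the renderer can show N/A.
        return (freq_table.first().copied().unwrap_or(0), 0.0);
    }
    (
        (weighted / active as f64).round() as u32,
        active as f32 / total as f32,
    )
}
```

With an illustrative table of `[350, 700, 1300]` and a fully idle window, this returns 350 MHz rather than 0, which is the value the `all_smi_gpu_frequency_mhz` gauge would then export.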
Bug B — render layer (cross-platform)
`src/ui/renderers/gpu_renderer.rs`. The entire `Freq:` label+value block was gated on `info.frequency > 0`. Any refresh where Bug A produced 0 made the field — and therefore every field after it on the row — vanish.
Fix: mirror the existing `Util:` / `VRAM:` / `Temp:` patterns in the same function. Always render the `Freq:` label, and substitute `N/A` (right-aligned at width 7, matching `Temp:`'s N/A width) when the value is 0.
This change is intentionally cross-platform. Readers that statically report `frequency: 0` (Rebellions, Intel Gaudi, AMD via WMI) will now show ` Freq: N/A` instead of hiding the field, making row layout consistent across all device types. NVIDIA / Jetson rarely hit 0 (NVML reports a stable idle graphics clock; sysfs `cur_freq` is always readable when accessible), so they are unaffected in practice — no spurious N/A.
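The render-side change can be sketched as a small formatting helper. The helper name `freq_field` and the `{:>4}MHz` layout for real values are assumptions (the actual renderer code is not shown here); only the always-present label and the width-7, right-aligned `N/A` come from the description above.

```rust
/// Hypothetical helper: always emit the `Freq:` label so later fields on
/// the row never shift. A value of 0 is treated as "no data".
fn freq_field(frequency_mhz: u32) -> String {
    if frequency_mhz > 0 {
        // Assumed value layout: 4 digits + "MHz" = 7 columns.
        format!(" Freq:{:>4}MHz", frequency_mhz)
    } else {
        // N/A right-aligned at width 7, so both branches are equally wide.
        format!(" Freq:{:>7}", "N/A")
    }
}
```

Because both branches occupy the same number of columns (for clocks below 10,000 MHz in this sketch), a sample that flips between a real value and `N/A` leaves the rest of the row in place.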
Out of scope
Per the issue's "What NOT to change" section, this PR deliberately does NOT change `GpuInfo.frequency` from `u32` to `Option<u32>`. After Fix A, the only way `frequency == 0` reaches the renderer is genuine "no data", which the renderer now handles via N/A. Changing the type would have touched every reader, the JSON API, and the OpenMetrics exporter, which is out of scope here.
Test plan
Files touched