Comparing changes

base repository: lablup/all-smi
base: v0.18.1
head repository: lablup/all-smi
compare: v0.19.0
  • 4 commits
  • 17 files changed
  • 2 contributors

Commits on Apr 8, 2026

  1. fix: cache platform detection results to avoid per-frame system_profiler (#149)
    
    Platform detection functions (has_nvidia, has_gaudi, etc.) were re-evaluated
    on every view refresh from update_notifications, executing system_profiler
    SPPCIDataType once per frame on macOS. The probe takes hundreds of ms and
    spawns processes repeatedly even though hardware presence never changes at
    runtime.
    
    Wrap each detection function in a process-global OnceLock so the underlying
    probe runs at most once per process. Also collapse nested ifs flagged by
    clippy in macOS device readers.
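
The caching pattern described above might look roughly like the following. This is a minimal sketch, not the project's actual code: `probe_nvidia`, `PROBE_RUNS`, and `probe_runs` are hypothetical names introduced for illustration, and the probe body is a stand-in for the real `system_profiler` call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive probe really ran (illustration only).
static PROBE_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an expensive hardware probe, such as
/// spawning `system_profiler SPPCIDataType` and parsing its output.
fn probe_nvidia() -> bool {
    PROBE_RUNS.fetch_add(1, Ordering::SeqCst);
    false // assumption: no NVIDIA device on this machine
}

/// Cached detection: the closure passed to `get_or_init` runs at most
/// once per process, and later calls return the stored result.
pub fn has_nvidia() -> bool {
    static CACHE: OnceLock<bool> = OnceLock::new();
    *CACHE.get_or_init(probe_nvidia)
}

/// Number of times the underlying probe executed so far.
pub fn probe_runs() -> usize {
    PROBE_RUNS.load(Ordering::SeqCst)
}
```

Calling `has_nvidia()` once per frame then costs only an atomic load after the first call, which is appropriate here because hardware presence does not change at runtime.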
    inureyes authored Apr 8, 2026
    615d5e5
  2. fix: decode SMC float sensors as little-endian on Apple Silicon (#150)

    Apple Silicon's SMC stores `flt ` (IEEE 754 single-precision) sensor
    values in little-endian byte order, unlike the legacy SP78/FP* fixed
    point types which remain big-endian. The convert_value() helper was
    using f32::from_be_bytes() for FLT, so every Tg*/Tp*/Te* temperature
    read returned a randomly varying garbage float (e.g. 1e-32, 3e36) and
    the (10..=120) sanity filter rejected almost everything. Symptoms in
    local view mode: GPU temperature showed 0 °C or sporadic 6/7 °C, CPU
    temperature showed 0 °C, dashboard "Avg. Temp" showed thermal pressure
    text only because the numeric path produced nonsense.
    
    Switching the FLT branch to from_le_bytes() restores real die
    temperatures (~50–60 °C idle on M1 Ultra). Verified by dumping the raw
    SMC response buffer: bytes always landed at offset 48 with consistent
    LE-encoded floats matching expected die temps.
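
To illustrate the endianness fix, here is a small sketch of decoding a 4-byte `flt ` payload. The function names `decode_flt` and `plausible_temp` are hypothetical and do not reflect the actual signature of `convert_value()`; only the `from_le_bytes` call and the (10..=120) range come from the commit message.

```rust
/// Decode a 4-byte SMC `flt ` payload as a little-endian IEEE 754 f32.
/// (Hypothetical helper; the real code does this inside convert_value().)
pub fn decode_flt(bytes: [u8; 4]) -> f32 {
    f32::from_le_bytes(bytes)
}

/// Keep only plausible die temperatures, mirroring the (10..=120)
/// sanity filter mentioned above. NaN and out-of-range values yield None.
pub fn plausible_temp(celsius: f32) -> Option<f32> {
    (10.0..=120.0).contains(&celsius).then_some(celsius)
}
```

For example, 55.0 °C is encoded little-endian as the bytes `00 00 5C 42`. Reading those same bytes with `f32::from_be_bytes` yields a subnormal value close to zero, which the sanity filter rejects, matching the symptoms described above.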
    
    While here, several related correctness issues were also addressed:
    
    * Dashboard and aggregator looked up the GPU detail key as
      "Architecture" but apple_silicon_native (and the prometheus exporter)
      write it lowercase as "architecture". The mismatch silently disabled
      the entire is_apple_silicon special-case path. Standardise on the
      lowercase form everywhere, including the NVIDIA and Jetson readers.
    * cpu_macos::get_cpu_temperature was a stub that always returned None,
      so the live "CPU Temp." gauge was permanently 0 °C even though SMC
      was already collecting the value. Wire it through the cached
      NativeMetricsData fetched in get_apple_silicon_cpu_info.
    * apple_silicon_native now falls back to the SMC CPU die temperature
      when the GPU sensor is unavailable. CPU and GPU share the same SoC
      die so the readings track each other closely; this is far more
      meaningful than reporting 0 °C.
    * Dashboard "Avg. Temp" cell now shows numeric °C on every platform.
      On Apple Silicon the second-row "Temp. Stdev" cell becomes a
      "Thermal" cell carrying the qualitative thermal pressure level
      (single-die std dev is meaningless), so both pieces of information
      remain visible.
    * The per-GPU list view used to display thermal pressure text on Apple
      Silicon for the same reason; it now shows the real numeric die temp,
      consistent with every other platform.
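
The key-case mismatch in the first bullet can be sketched as follows. This is an illustrative assumption, not the project's code: the `ARCH_KEY` constant, the `is_apple_silicon` signature, the `HashMap<String, String>` detail type, and the "Apple Silicon" value are all hypothetical. Only the "architecture" versus "Architecture" keys come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical shared constant so writers and readers of the detail
/// map cannot disagree on the key's case.
pub const ARCH_KEY: &str = "architecture";

/// Returns true when the detail map marks the device as Apple Silicon.
/// HashMap lookups are case-sensitive, so a reader asking for
/// "Architecture" never sees a value stored under "architecture".
pub fn is_apple_silicon(detail: &HashMap<String, String>) -> bool {
    detail
        .get(ARCH_KEY)
        .map_or(false, |v| v.as_str() == "Apple Silicon")
}
```

Because a missing key just returns `None`, a mismatch of this kind produces no error and silently disables the special-case path, as the bullet describes.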
    
    Added a regression unit test (test_flt_little_endian_decoding) so that
    the endianness can't silently flip back.
    inureyes authored Apr 8, 2026
    3cd6b76
  3. fix: widen process list TIME+ column to prevent Command drift (#151)

    The TIME+ column was fixed at 8 chars, but format_cpu_time can produce
    values up to 10 chars ("8760:00:00" at the 365-day cap). Values like
    "213:16:04" (9 chars) overflow the column and push the Command column
    right, so rows with 9-char times no longer align with the header or
    with rows that have shorter times.
    
    Widen fixed_widths[11] from 8 to 10 so all possible TIME+ values
    right-align cleanly in the column and Command stays at a consistent
    position. Document the width invariant in format_cpu_time and add a
    regression test that enforces the maximum output width.
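
The width invariant can be sketched like this. It assumes an `H:MM:SS` format capped at 365 days, which is consistent with the "8760:00:00" maximum quoted above. The real `format_cpu_time` may format short durations differently, and `format_cpu_time_sketch`, `time_cell`, and the constants are hypothetical names.

```rust
/// Column width for TIME+ after the fix (was 8).
pub const TIME_COL_WIDTH: usize = 10;

/// Assumed cap: 365 days, i.e. 8760 hours, expressed in seconds.
pub const MAX_SECS: u64 = 365 * 24 * 3600;

/// Hypothetical stand-in for format_cpu_time: renders H:MM:SS, clamped
/// to the 365-day cap so the output never exceeds 10 characters.
pub fn format_cpu_time_sketch(secs: u64) -> String {
    let s = secs.min(MAX_SECS);
    format!("{}:{:02}:{:02}", s / 3600, (s % 3600) / 60, s % 60)
}

/// Right-align the time in a fixed-width cell so the next column
/// (Command) always starts at the same position.
pub fn time_cell(secs: u64) -> String {
    format!("{:>width$}", format_cpu_time_sketch(secs), width = TIME_COL_WIDTH)
}
```

A regression test in this spirit would assert that the formatter's output at the cap is at most `TIME_COL_WIDTH` characters long, so that the formatter and the column width cannot drift apart again.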
    inureyes authored Apr 8, 2026
    2ddd47a
  4. release: v0.19.0

    inureyes committed Apr 8, 2026
    009549f