Skip to content

Handle thermal zones which return ENODATA for temp #293

@candlerb

Description

@candlerb

This is linked to prometheus/node_exporter#1703 - full details to reproduce are there.

I have a system with 5 thermal zones. However one of them returns ENODATA for the temp field:

# cat /sys/class/thermal/thermal_zone*/temp
27800
29800
41000
53000
cat: /sys/class/thermal/thermal_zone4/temp: No data available

# cat /sys/class/thermal/thermal_zone4/type
iwlwifi_1
# cat /sys/class/thermal/thermal_zone4/policy
step_wise
# cat /sys/class/thermal/thermal_zone4/temp
cat: /sys/class/thermal/thermal_zone4/temp: No data available

This error causes node_exporter to return no thermal data at all: in the loop which collects results, if any single stat returns an error, an empty slice is returned.

Temp used to be uint64 but is now int64. This means that returning a zero-value isn't a good way of signalling "no value".

There is existing optional handling of 'Mode' and 'Passive' attributes. In the case of 'Passive' it returns a *uint64 instead of uint64.

So the options I can see are:

  1. Change Temp from int64 to *int64. That obviously has knock-on effects on all consumers of this code. However such changes are not unknown - such as the uint64 to int64 change already mentioned, which was done ~2 weeks ago in de3a686. In fact, maybe now is a good time to do this.

    node_exporter in particular would need the corresponding change, but I don't know what the other main consumers of this library are.

  2. Change the loop which collects thermal zones so that it silently drops a zone which generates ENODATA. No client changes required. Proof-of-concept:

--- a/vendor/github.com/prometheus/procfs/sysfs/class_thermal.go
+++ b/vendor/github.com/prometheus/procfs/sysfs/class_thermal.go
@@ -16,9 +16,11 @@
 package sysfs

 import (
+       "errors"
        "os"
        "path/filepath"
        "strings"
+       "syscall"

        "github.com/prometheus/procfs/internal/util"
 )
@@ -43,16 +45,19 @@ func (fs FS) ClassThermalZoneStats() ([]ClassThermalZoneStats, error) {
        }

        var zoneStats = ClassThermalZoneStats{}
-       stats := make([]ClassThermalZoneStats, len(zones))
-       for i, zone := range zones {
+       stats := make([]ClassThermalZoneStats, 0, len(zones))
+       for _, zone := range zones {
                zoneName := strings.TrimPrefix(filepath.Base(zone), "thermal_zone")

                zoneStats, err = parseClassThermalZone(zone)
                if err != nil {
+                       if errors.Is(err, syscall.ENODATA) {
+                               continue
+                       }
                        return []ClassThermalZoneStats{}, err
                }
                zoneStats.Name = zoneName
-               stats[i] = zoneStats
+               stats = append(stats, zoneStats)
        }
        return stats, nil
 }

For my use case this is good enough, expecially since node_exporter ignores everything apart from Temp in struct ClassThermalZoneStats - here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions