SMART input works in 1.15.3 and fails in 1.16.0 with exact same config #8313

@chrishoage


Relevant telegraf.conf:

[[inputs.smart]]
  use_sudo = true
  devices = [
    "/dev/disk/by-id/ata-Crucial_CT525MX300SSD1_1651150FA577",
    "/dev/disk/by-id/ata-Crucial_CT525MX300SSD1_16431465A85A",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK1334PEJLL6NS",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK1334PEK49SBS",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK1334PEKDNZ0S",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK1334PEKDXVTS",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK2334PEJM9B3T",
    "/dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK2334PEK4AXTT",
    "/dev/disk/by-id/scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJ5DV",
    "/dev/disk/by-id/scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJH1X",
    "/dev/disk/by-id/scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EECRN58H",
    "/dev/disk/by-id/scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EK8ZSK37",
    "/dev/disk/by-id/scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EM0WN624"
  ]

System info:

› uname -a
Linux cortex 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Docker

Steps to reproduce:

  1. Install Telegraf 1.15.3.
  2. Run it with the config above.
  3. Verify the output with telegraf --test.
  4. Upgrade to 1.16.0.
  5. Run telegraf --test again.
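
The check in steps 3 and 5 can be scripted roughly as follows. This is a minimal sketch: `count_smart_metrics` is a hypothetical helper (not part of Telegraf) that counts the `smart_device` lines in `telegraf --test` output, and the config path is the one used elsewhere in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Telegraf): read `telegraf --test` output
# on stdin and print how many smart_device metric lines it contains.
count_smart_metrics() {
  # grep -c prints 0 and exits non-zero when nothing matches;
  # `|| true` keeps the function's exit status at 0 in that case.
  grep -c '^> smart_device' || true
}

# Usage (run once per installed version and compare the counts):
#   sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf --test 2>&1 \
#     | count_smart_metrics
```

A count of 0 corresponds to the 1.16.0 behavior reported here; with 1.15.3 the same config yields one line per configured device.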

Expected behavior:

The same config continues to work after the upgrade.

Actual behavior:

The config fails: after upgrading to 1.16.0, the SMART input produces no output.

Additional info:

I initially saw the error below. After installing nvme-cli the error went away, but the SMART input still did not output anything.

[inputs.smart] nvme not found: verify that nvme is installed and it is in your PATH (or specified in config) to gather vendor specific attributes: provided path does not exist: []
› sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf  --test | grep smart
2020-10-25T21:32:18Z I! Starting Telegraf 1.15.3
> smart_device,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EM0WN624,host=cortex exit_status=2i 1603661539000000000
> smart_device,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EECRN58H,host=cortex exit_status=2i 1603661539000000000
> smart_device,capacity=525112713216,device=ata-Crucial_CT525MX300SSD1_16431465A85A,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=16431465A85A,wwn=500a07511465a85a exit_status=0i,health_ok=true,read_error_rate=2i,temp_c=37i,udma_crc_errors=0i 1603661539000000000
> smart_device,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJH1X,host=cortex exit_status=2i 1603661539000000000
> smart_device,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJ5DV,host=cortex exit_status=2i 1603661539000000000
> smart_device,capacity=525112713216,device=ata-Crucial_CT525MX300SSD1_1651150FA577,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=1651150FA577,wwn=500a0751150fa577 exit_status=0i,health_ok=true,read_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661539000000000
> smart_device,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EK8ZSK37,host=cortex exit_status=2i 1603661539000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEJLL6NS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEJLL6NS,wwn=5000cca250e4a210 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=33i,udma_crc_errors=0i 1603661540000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK2334PEJM9B3T,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEJM9B3T,wwn=5000cca250e4f530 exit_status=0i,health_ok=true,read_error_rate=2i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661540000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEKDXVTS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDXVTS,wwn=5000cca250f02751 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=33i,udma_crc_errors=0i 1603661540000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK2334PEK4AXTT,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEK4AXTT,wwn=5000cca250ec4105 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661540000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEKDNZ0S,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDNZ0S,wwn=5000cca250f009ad exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=37i,udma_crc_errors=0i 1603661540000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEK49SBS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEK49SBS,wwn=5000cca250ec3c9c exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661540000000000
› sudo cat /etc/sudoers.d/telegraf
Cmnd_Alias SMARTCTL = /usr/sbin/smartctl
telegraf ALL=(ALL) NOPASSWD: SMARTCTL
Defaults!SMARTCTL !logfile, !syslog, !pam_session

Cmnd_Alias NVME = /usr/sbin/nvme
telegraf ALL=(ALL) NOPASSWD: NVME
Defaults!NVME !logfile, !syslog, !pam_session
› sudo -u telegraf bash
telegraf@cortex:~$ which smartctl
/usr/sbin/smartctl
telegraf@cortex:~$ which nvme
/usr/sbin/nvme
telegraf@cortex:~$ sudo smartctl --info --attributes --health -n standby --format=brief /dev/disk/by-id/scsi-SATA_HGST_HDN724040AL_PK1334PEJLL6NS
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-52-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     HGST Deskstar NAS
Device Model:     HGST HDN724040ALE640
Serial Number:    PK1334PEJLL6NS
LU WWN Device Id: 5 000cca 250e4a210
Firmware Version: MJAOA5E0
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Oct 25 14:29:35 2020 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Power mode is:    ACTIVE or IDLE

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  P-S---   136   136   054    -    83
  3 Spin_Up_Time            POS---   165   165   024    -    497 (Average 440)
  4 Start_Stop_Count        -O--C-   100   100   000    -    47
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         PO-R--   100   100   067    -    0
  8 Seek_Time_Performance   P-S---   121   121   020    -    34
  9 Power_On_Hours          -O--C-   096   096   000    -    32685
 10 Spin_Retry_Count        PO--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    47
192 Power-Off_Retract_Count -O--CK   100   100   000    -    308
193 Load_Cycle_Count        -O--C-   100   100   000    -    308
194 Temperature_Celsius     -O----   181   181   000    -    33 (Min/Max 23/55)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Here are sample commands showing that downgrading works:

chris at cortex in ~
› sudo apt-get upgrade telegraf
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  telegraf
1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/21.8 MB of archives.
After this operation, 1,598 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
(Reading database ... 113471 files and directories currently installed.)
Preparing to unpack .../telegraf_1.16.0-1_amd64.deb ...
Unpacking telegraf (1.16.0-1) over (1.15.3-1) ...
Setting up telegraf (1.16.0-1) ...
Installing new version of config file /etc/telegraf/telegraf.conf.sample ...
Synchronizing state of telegraf.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable telegraf

chris at cortex in ~
› sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf  --test | grep smart
2020-10-25T21:39:12Z I! Starting Telegraf 1.16.0

chris at cortex in ~
› sudo dpkg -i ~/downloads/telegraf_1.15.3-1_amd64.deb
dpkg: warning: downgrading telegraf from 1.16.0-1 to 1.15.3-1
(Reading database ... 113471 files and directories currently installed.)
Preparing to unpack .../telegraf_1.15.3-1_amd64.deb ...
Unpacking telegraf (1.15.3-1) over (1.16.0-1) ...
Setting up telegraf (1.15.3-1) ...
Installing new version of config file /etc/telegraf/telegraf.conf.sample ...
Synchronizing state of telegraf.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable telegraf

chris at cortex in ~
› sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf  --test | grep smart
2020-10-25T21:39:48Z I! Starting Telegraf 1.15.3
> smart_device,capacity=525112713216,device=ata-Crucial_CT525MX300SSD1_1651150FA577,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=1651150FA577,wwn=500a0751150fa577 exit_status=0i,health_ok=true,read_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=525112713216,device=ata-Crucial_CT525MX300SSD1_16431465A85A,enabled=Enabled,host=cortex,model=Crucial_CT525MX300SSD1,serial_no=16431465A85A,wwn=500a07511465a85a exit_status=0i,health_ok=true,read_error_rate=2i,temp_c=37i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJH1X,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4E4FKJH1X,wwn=50014ee20a70d5a0 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=29i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EM0WN624,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EM0WN624,wwn=50014ee2b51b9d7f exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=30i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EK8ZSK37,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EK8ZSK37,wwn=50014ee2b51c8ebd exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=30i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4EECRN58H,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4EECRN58H,wwn=50014ee20a98bd99 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=33i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_WDC_WD40EFRX-68W_WD-WCC4E4FKJ5DV,enabled=Enabled,host=cortex,model=WDC\ WD40EFRX-68WT0N0,serial_no=WD-WCC4E4FKJ5DV,wwn=50014ee25fc65114 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=29i,udma_crc_errors=0i 1603661989000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEK49SBS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEK49SBS,wwn=5000cca250ec3c9c exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661990000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEJLL6NS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEJLL6NS,wwn=5000cca250e4a210 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=34i,udma_crc_errors=0i 1603661990000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEKDXVTS,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDXVTS,wwn=5000cca250f02751 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=34i,udma_crc_errors=0i 1603661990000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK2334PEJM9B3T,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEJM9B3T,wwn=5000cca250e4f530 exit_status=0i,health_ok=true,read_error_rate=2i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661990000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK2334PEK4AXTT,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK2334PEK4AXTT,wwn=5000cca250ec4105 exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661990000000000
> smart_device,capacity=4000787030016,device=scsi-SATA_HGST_HDN724040AL_PK1334PEKDNZ0S,enabled=Enabled,host=cortex,model=HGST\ HDN724040ALE640,serial_no=PK1334PEKDNZ0S,wwn=5000cca250f009ad exit_status=0i,health_ok=true,read_error_rate=0i,seek_error_rate=0i,temp_c=36i,udma_crc_errors=0i 1603661990000000000
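
When comparing runs like the ones in this report, it can help to list only the devices whose `exit_status` is non-zero. In the first 1.15.3 run the five WDC drives reported `exit_status=2i` and no other fields, while in the run after the downgrade they reported full data. The following is a minimal sketch; `nonzero_exit_devices` is a hypothetical helper, not part of Telegraf, and it assumes the line-protocol layout shown in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Telegraf): read `telegraf --test` output
# on stdin and print the device tag of every smart_device line whose
# exit_status field is not 0.
nonzero_exit_devices() {
  grep '^> smart_device' \
    | grep -v 'exit_status=0i' \
    | sed -n 's/.*,device=\([^, ]*\).*/\1/p'
}

# Usage:
#   sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf --test 2>&1 \
#     | nonzero_exit_devices
```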

Labels: bug (unexpected problem or unintended behavior)
