
New input plugin for RAS (Reliability, Availability and Serviceability) #8086

@p-zak


Feature Request

We have created a Telegraf input plugin that gathers and counts errors based on data provided by RASDaemon.

Proposal:

The plugin provides the following metrics:

  • memory_read_corrected_errors
  • memory_read_uncorrectable_errors
  • memory_write_corrected_errors
  • memory_write_uncorrectable_errors
  • cache_l0_l1_errors
  • tlb_instruction_errors
  • cache_l2_errors
  • upi_errors
  • processor_base_errors
  • processor_bus_errors
  • internal_timer_errors
  • smm_handler_code_access_violation_errors
  • internal_parity_errors
  • frc_errors
  • external_mce_errors
  • microcode_rom_parity_errors
  • unclassified_mce_errors
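
To make the counting idea concrete, here is a minimal sketch in Go (Telegraf's language) of how one counter per metric could be kept. This is not the plugin's actual code: the `ErrorCounter` type, its `Record` and `Count` methods, and the rule that unrecognized event names fall into `unclassified_mce_errors` are all assumptions made for illustration. How events are read from RASDaemon is deliberately left out.

```go
package main

import (
	"fmt"
	"sort"
)

// metricNames lists the counters proposed in this issue.
var metricNames = []string{
	"memory_read_corrected_errors",
	"memory_read_uncorrectable_errors",
	"memory_write_corrected_errors",
	"memory_write_uncorrectable_errors",
	"cache_l0_l1_errors",
	"tlb_instruction_errors",
	"cache_l2_errors",
	"upi_errors",
	"processor_base_errors",
	"processor_bus_errors",
	"internal_timer_errors",
	"smm_handler_code_access_violation_errors",
	"internal_parity_errors",
	"frc_errors",
	"external_mce_errors",
	"microcode_rom_parity_errors",
	"unclassified_mce_errors",
}

// ErrorCounter is a hypothetical helper that keeps one running total per metric.
type ErrorCounter struct {
	counts map[string]int64
}

// NewErrorCounter returns a counter with every proposed metric initialized to zero.
func NewErrorCounter() *ErrorCounter {
	c := &ErrorCounter{counts: make(map[string]int64, len(metricNames))}
	for _, name := range metricNames {
		c.counts[name] = 0
	}
	return c
}

// Record increments the counter for the given metric name.
// Assumption: names that are not in metricNames are counted as
// unclassified_mce_errors; the real plugin may classify differently.
func (c *ErrorCounter) Record(metric string) {
	if _, ok := c.counts[metric]; !ok {
		metric = "unclassified_mce_errors"
	}
	c.counts[metric]++
}

// Count returns the current total for a metric (zero if unknown).
func (c *ErrorCounter) Count(metric string) int64 {
	return c.counts[metric]
}

func main() {
	c := NewErrorCounter()
	// Made-up event names, standing in for already-classified RASDaemon records.
	events := []string{
		"memory_read_corrected_errors",
		"memory_read_corrected_errors",
		"upi_errors",
		"some_unknown_error",
	}
	for _, ev := range events {
		c.Record(ev)
	}

	// Print the counters in a stable, sorted order.
	names := append([]string(nil), metricNames...)
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s=%d\n", name, c.Count(name))
	}
}
```

Initializing every counter to zero means all seventeen fields are always present in the output, even when no error of that kind has been seen.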

I am creating this issue to see if there are any questions or concerns about this before creating a PR containing the RAS input plugin.

Current behavior:

Telegraf currently has no way to gather RAS data.

Desired behavior:

This plugin will allow RAS data to be gathered.

Use case:

Users would be able to gather Platform Reliability, Availability and Serviceability data.

Labels: feature request (requests for new plugins and for new features to existing plugins)