add additional stats from mdstat #380

Merged
SuperQ merged 4 commits into prometheus:master from johnseekins:mdstats on Jul 5, 2021

@johnseekins
Contributor

In the recovery line for mdstat, we can also track percentage complete, estimated time to completion, and current recovery write speed. This MR adds those additional stats.
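As a rough illustration of the idea (not the code in this PR), the three values could be pulled out of a recovery line like this. The `RecoveryStats` type, the `parseRecoveryLine` function, and the regular expression are all hypothetical names chosen for this sketch, and the sample line assumes the usual `recovery = N% (done/total) finish=Nmin speed=NK/sec` layout.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// RecoveryStats holds the three extra values described above.
// The type and its field names are illustrative, not the library's API.
type RecoveryStats struct {
	PercentComplete float64 // 8.5 for "8.5%"
	FinishMinutes   float64 // 17.0 for "finish=17.0min"
	SpeedKBps       float64 // 259783 for "speed=259783K/sec"
}

// num matches a non-negative decimal number such as "8.5" or "259783".
const num = `([0-9]+(?:\.[0-9]+)?)`

// recoveryRE assumes percentage, finish time, and speed all appear, in that order.
var recoveryRE = regexp.MustCompile(`=\s*` + num + `%.*finish=` + num + `min\s+speed=` + num + `K/sec`)

// parseRecoveryLine extracts the stats from one line, or returns an error
// when the line does not look like a recovery progress line.
func parseRecoveryLine(line string) (RecoveryStats, error) {
	m := recoveryRE.FindStringSubmatch(line)
	if m == nil {
		return RecoveryStats{}, fmt.Errorf("no recovery stats in line %q", line)
	}
	var vals [3]float64
	for i := range vals {
		v, err := strconv.ParseFloat(m[i+1], 64)
		if err != nil {
			return RecoveryStats{}, fmt.Errorf("parsing %q: %w", m[i+1], err)
		}
		vals[i] = v
	}
	return RecoveryStats{
		PercentComplete: vals[0],
		FinishMinutes:   vals[1],
		SpeedKBps:       vals[2],
	}, nil
}

func main() {
	line := "      [=>...................]  recovery =  8.5% (16775552/195310144) finish=17.0min speed=259783K/sec"
	stats, err := parseRecoveryLine(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", stats)
}
```

This version rejects any line where one of the three fields is missing, which is the strictest possible choice; a real parser may prefer to be more lenient.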

@discordianfish
Member

We need to make sure this works for all kernels we support. If that's the case, LGTM!

@johnseekins
Contributor Author

This should gracefully handle different kernels...and I believe the mdstat file format hasn't changed in quite some time.

treydock and others added 3 commits April 28, 2021 12:22
Counters added:
* excessive_buffer_overrun_errors
* local_link_integrity_errors

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
Signed-off-by: John Seekins <jseekins@datto.com>
@johnseekins
Contributor Author

@discordianfish Is there a standard way to prove out the "kernels supported"? I know this works on 4.19, for example.

@SuperQ
Member

SuperQ commented May 27, 2021

Typically it involves a lot of browsing the kernel source tree. :-(

To note, we want to keep kernel support all the way back to 2.6.23.

@discordianfish
Member

Unfortunately, I don't think we have a way. We should probably collect fixtures for multiple kernel versions. It's silly, but I'd probably spin up a VM with e.g. 2.6.23 and run the tests there.

@johnseekins
Contributor Author

johnseekins commented Jun 8, 2021

If that's the case...how did y'all validate this collector in the first place? While I appreciate that we should try to validate this as much as possible, it does fail gracefully on missing stats.
I don't have the ability to easily test this across many different kernels myself, but it at least seems to work based on the built-in tests.
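To make "fails gracefully on missing stats" concrete, here is a minimal sketch of one way to do it, not necessarily how this PR does it: each field gets its own pattern, and a field that is absent is reported as zero rather than as an error. The names `optionalFloat` and `parseOptionalStats` are hypothetical, and the second sample line is a deliberately truncated, made-up input rather than output captured from a real kernel.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// One hypothetical pattern per optional field, so each can be absent independently.
var (
	pctRE    = regexp.MustCompile(`=\s*([0-9]+(?:\.[0-9]+)?)%`)
	finishRE = regexp.MustCompile(`finish=([0-9]+(?:\.[0-9]+)?)min`)
	speedRE  = regexp.MustCompile(`speed=([0-9]+(?:\.[0-9]+)?)K/sec`)
)

// optionalFloat returns the first capture group of re in line as a float64.
// When the pattern is absent it returns 0 and a nil error.
func optionalFloat(re *regexp.Regexp, line string) (float64, error) {
	m := re.FindStringSubmatch(line)
	if m == nil {
		return 0, nil
	}
	return strconv.ParseFloat(m[1], 64)
}

// parseOptionalStats reads percentage, finish minutes, and speed from line.
// Missing fields come back as zero; only a malformed number is an error.
func parseOptionalStats(line string) (pct, finish, speed float64, err error) {
	if pct, err = optionalFloat(pctRE, line); err != nil {
		return 0, 0, 0, err
	}
	if finish, err = optionalFloat(finishRE, line); err != nil {
		return 0, 0, 0, err
	}
	if speed, err = optionalFloat(speedRE, line); err != nil {
		return 0, 0, 0, err
	}
	return pct, finish, speed, nil
}

func main() {
	lines := []string{
		// Full recovery line.
		"      [=>...................]  recovery =  8.5% (16775552/195310144) finish=17.0min speed=259783K/sec",
		// Made-up line with finish and speed missing.
		"      [=>...................]  recovery =  8.5% (16775552/195310144)",
		// Ordinary status line with no recovery information at all.
		"      7813735424 blocks super 1.2 [2/2] [UU]",
	}
	for _, l := range lines {
		pct, finish, speed, err := parseOptionalStats(l)
		fmt.Println(pct, finish, speed, err)
	}
}
```

The trade-off is that a zero can mean either "not recovering" or "field not present on this kernel", so callers that care about the difference would need an extra flag.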

@SuperQ
Member

SuperQ commented Jun 23, 2021

Lots of trial and error. The current mdstat fixture is a collection of various examples from bugs reported by users.

Member

@discordianfish left a comment

Fair enough. I think we should extend the fixtures to cover more kernel versions, but until then it's fair to support the procfs files as shown in the fixtures. So LGTM.
@SuperQ wdyt?

Member

@SuperQ left a comment

Yup, adding new features while we maintain backwards compatibility is what I desire.

LGTM

@SuperQ SuperQ merged commit 99f3c37 into prometheus:master Jul 5, 2021
remijouannet pushed a commit to remijouannet/procfs that referenced this pull request Oct 20, 2022
* Add several Infiniband counters

Counters added:
* excessive_buffer_overrun_errors
* local_link_integrity_errors

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
Signed-off-by: John Seekins <jseekins@datto.com>

* add additional stats from mdstat

Signed-off-by: John Seekins <jseekins@datto.com>

* return successful values every time

Signed-off-by: John Seekins <jseekins@datto.com>

* add count of 'downed' disks

Signed-off-by: John Seekins <jseekins@datto.com>

Co-authored-by: Trey Dockendorf <tdockendorf@osc.edu>
jritter pushed a commit to jritter/procfs that referenced this pull request Jul 15, 2024