Per-object Prometheus metrics: avoid duplicate HELP, TYPE metadata lines #15610
michaelklishin merged 3 commits into main
for Raft metrics.
Raft metrics can and do come from different Ra systems, namely Khepri and quorum queues. We need to format them as a "single" metric to avoid duplicate HELP, TYPE metadata lines.
Since quorum queues have dozens of metrics, we select only the subset of Raft-related ones that combine well with the Raft metrics from Khepri.
Closes #15600.
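To illustrate the idea in the description, here is a minimal Python sketch (not the actual Erlang collector code) of merging samples from several sources into one metric family, so that HELP and TYPE are written once per metric name. The function name, the tuple layout, and the metric and label names in the usage note are all assumptions made for illustration.

```python
from collections import OrderedDict


def render_families(sources):
    """Merge samples from several sources into one family per metric name.

    `sources` is a list of iterables of (name, help, type, labels, value)
    tuples. This layout is hypothetical and chosen only for the sketch.
    """
    families = OrderedDict()
    for source in sources:
        for name, help_text, mtype, labels, value in source:
            # The first source to mention a name defines its HELP and TYPE;
            # later sources only contribute samples to the same family.
            fam = families.setdefault(
                name, {"help": help_text, "type": mtype, "samples": []}
            )
            fam["samples"].append((labels, value))

    lines = []
    for name, fam in families.items():
        lines.append(f"# HELP {name} {fam['help']}")
        lines.append(f"# TYPE {name} {fam['type']}")
        for labels, value in fam["samples"]:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            if label_str:
                lines.append(f"{name}{{{label_str}}} {value}")
            else:
                lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Called with one list of samples for Khepri and one for quorum queues that share a metric name (say, a made-up `raft_term_total`), the output contains a single HELP line and a single TYPE line for that name, followed by the samples from both sources.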
There is also https://github.com/rabbitmq/khepri/blob/main/src/khepri_cluster.erl#L326. Will this work with other queue types?
Previously a missing metric was ignored.
mkuratczyk left a comment
I made the test stricter and added the same validation to the aggregated endpoint to avoid similar problems, but the main fix looks good to me.
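For readers who want to check an endpoint's output themselves, the kind of validation mentioned above could look roughly like the following Python sketch. It is not the test from this PR (that one lives in the Erlang test suite); the function name is hypothetical, and it only looks at `# HELP` and `# TYPE` lines in the text exposition format.

```python
def duplicate_metadata(exposition_text):
    """Return sorted (kind, metric_name) pairs whose HELP or TYPE line repeats."""
    seen = set()
    dupes = set()
    for line in exposition_text.splitlines():
        # Metadata lines look like "# HELP <name> <text>" or "# TYPE <name> <type>".
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[0] == "#" and parts[1] in ("HELP", "TYPE"):
            key = (parts[1], parts[2])
            if key in seen:
                dupes.add(key)
            seen.add(key)
    return sorted(dupes)
```

An empty list means every metric name has at most one HELP and one TYPE line. The same check can be pointed at both the per-object and the aggregated endpoint output.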
@elo-magnier-7s can you confirm this solves #15600 for you?
Khepri metrics are present. As part of RabbitMQ, khepri runs in the
Ra-based queue types other than QQs are not accounted for here. I assume they will either have their own collector or we'll make sure this collector handles them correctly as part of work on those queue types. We've played with the idea of splitting Ra metrics per Ra system on the Ra level (currently they are all under that
Streams also? I posted a link to Khepri; from that link it looks like it has its own system. What am I missing?
Good point - I don't think any osiris counters are currently returned. Something to look into for sure, but not directly related to this issue.
You linked to Khepri directly, which is developed as a standalone project. When embedded in RabbitMQ, the correct place to look is https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit/src/rabbit_khepri.erl#L279
I mean, is it OK, visible, and understood that both streams and Khepri use the coordination system? https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit/src/rabbit_stream_coordinator.hrl#L7 Even if only from a metrics PoV.
Streams don't use the Ra system as such, only the stream coordinator does. So metrics of the
Like having a label system="system_name"?
Anyway, there's room for improvement in terms of clarity and hopefully performance as well.
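As a rough illustration of the `system="system_name"` label idea floated in this thread (a proposal only, not something this PR implements), here is a small Python sketch. The function name and the input layout are hypothetical, and the Ra system names in the usage note are assumptions.

```python
def with_system_label(samples_by_system):
    """Flatten {system_name: [(labels, value), ...]} into labelled samples.

    Each returned sample carries an extra `system` label naming the Ra
    system it came from, so series from different systems can share one
    metric family without colliding.
    """
    out = []
    for system, samples in samples_by_system.items():
        for labels, value in samples:
            # Copy the labels so the caller's dictionaries are not mutated.
            out.append(({**labels, "system": system}, value))
    return out
```

For example, passing samples keyed by `coordination` and `quorum_queues` would yield series that differ in their `system` label and can therefore sit under a single HELP/TYPE pair.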
Hi everyone, Before (make run-broker from
@elo-magnier-7s A node started from So branch switching can run into "downgrading-like" scenarios. All regular contributors to RabbitMQ eventually develop this habit of wiping
Per-object Prometheus metrics: avoid duplicate HELP, TYPE metadata lines (backport #15610)
Thanks for the patch & the explanation @michaelklishin, I appreciate it! I figured playing it fast and loose was what was biting me in the butt, but not how or why. As I didn't expect any difference in code, I thought it was a compilation issue and you'd be able to tell in a second. Cheers!
I am not sure if this is the optimal approach, but it is the best place/manner of addressing this that I could find without affecting aggregated metrics.