prometheus/metrics: three new metrics for consensus #4263
- add consensus_validator_power metric so that a node that is a validator can see its own voting power in Prometheus
- add last_signed_height metric so that a node that is a validator can see the most recent height at which it signed
- closes: #3773
- closes: #3083

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
This also needs a changelog_pending entry.
…endermint/tendermint into marko/3773_add_promethues_mets
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
What makes these metrics validator-specific? I do not see any additional labels. How would someone know last_signed_height is coming from validator A?
Add a label with the validator address to the added metrics.
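To illustrate why a validator address label answers the question above, here is a minimal, stdlib-only sketch. `LabeledGauge`, `NewLabeledGauge`, and `Expose` are hypothetical names invented for this example; they are not the metrics types used in the PR, and the output only resembles the Prometheus text format. The point is that each distinct `validator_address` value becomes its own time series, so a reader can tell which validator a sample belongs to.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// LabeledGauge is a hypothetical stand-in for a labelled gauge:
// it stores one value per distinct set of label values.
type LabeledGauge struct {
	name   string
	values map[string]float64
}

// NewLabeledGauge creates an empty gauge with the given metric name.
func NewLabeledGauge(name string) *LabeledGauge {
	return &LabeledGauge{name: name, values: make(map[string]float64)}
}

// Set records v for the series identified by alternating key/value pairs,
// e.g. Set(10, "validator_address", "AAAA").
func (g *LabeledGauge) Set(v float64, labelValues ...string) {
	g.values[seriesKey(labelValues)] = v
}

// seriesKey renders the label pairs as `k="v",...` in sorted order.
func seriesKey(labelValues []string) string {
	pairs := make([]string, 0, len(labelValues)/2)
	for i := 0; i+1 < len(labelValues); i += 2 {
		pairs = append(pairs, fmt.Sprintf("%s=%q", labelValues[i], labelValues[i+1]))
	}
	sort.Strings(pairs)
	return strings.Join(pairs, ",")
}

// Expose returns one line per series, sorted for stable output.
func (g *LabeledGauge) Expose() []string {
	lines := make([]string, 0, len(g.values))
	for k, v := range g.values {
		lines = append(lines, fmt.Sprintf("%s{%s} %g", g.name, k, v))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	power := NewLabeledGauge("consensus_validator_power")
	// The addresses below are placeholders, not real validator addresses.
	power.Set(10, "validator_address", "AAAA")
	power.Set(25, "validator_address", "BBBB")
	for _, line := range power.Expose() {
		fmt.Println(line)
	}
}
```

Without the label, both `Set` calls would overwrite the same series and the origin of the value would be lost.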
Thanks @marbar3778 for your work on this.

@leoluk has previously suggested having a "total missed blocks" counter per validator. I think that could be a valuable addition for detecting missed blocks that occur sporadically.

Also worth considering is providing these metrics for all validators in the set, not just the local one found in priv_validator.

Con: Consumes more resources to store/query 125 samples instead of 1.
Pro: Sentries can collect samples, so there is a higher degree of certainty of getting accurate data, even if a network partition or similar restricts Prometheus from scraping the validator.

Perhaps a config option is the most sensible approach to this, but personally I'd be comfortable with collecting this data for the current 125 validators.
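The trade-off described in this comment can be sketched as follows. This is only an illustration under stated assumptions: `CommitSig`, `MissedCounter`, and the `watch` filter are hypothetical, simplified types made up for the example, and the `watch` map stands in for the suggested config option. With an empty filter, every validator in the set gets a series (125 samples on the network discussed here); with a filter, only the listed addresses are tracked.

```go
package main

import "fmt"

// CommitSig is a simplified, hypothetical stand-in for one validator's slot
// in a block commit: who the slot belongs to and whether they signed.
type CommitSig struct {
	ValidatorAddress string
	Signed           bool
}

// MissedCounter accumulates a total of missed blocks per validator address.
type MissedCounter map[string]int

// Observe updates the totals from the signatures of one committed block.
// watch restricts tracking to the listed addresses; an empty or nil watch
// tracks every validator in the set.
func (m MissedCounter) Observe(sigs []CommitSig, watch map[string]bool) {
	for _, s := range sigs {
		if len(watch) > 0 && !watch[s.ValidatorAddress] {
			continue
		}
		if _, seen := m[s.ValidatorAddress]; !seen {
			m[s.ValidatorAddress] = 0 // create the series even if nothing was missed
		}
		if !s.Signed {
			m[s.ValidatorAddress]++
		}
	}
}

func main() {
	// Two consecutive blocks; validator "B" misses the second one.
	blocks := [][]CommitSig{
		{{"A", true}, {"B", true}, {"C", false}},
		{{"A", true}, {"B", false}, {"C", false}},
	}

	all := MissedCounter{}
	localOnly := MissedCounter{}
	for _, sigs := range blocks {
		all.Observe(sigs, nil)                             // whole validator set
		localOnly.Observe(sigs, map[string]bool{"B": true}) // only the local validator
	}

	fmt.Println(len(all), all["B"], all["C"])
	fmt.Println(len(localOnly), localOnly["B"])
}
```

Because the counter is cumulative, a single sporadic miss remains visible in the total even if a scrape happens long after the block in question.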
I can add a missed-blocks metric for individual validators to this PR, but to keep the scope smaller, it would be better to handle the second part, adding these metrics for the entire set, in a follow-up PR.

edit: added here: #1791 (comment)
        "validator_address", privValAddress.String(),
    }
    cs.metrics.ValidatorPower.With(label...).Set(float64(val.VotingPower))
    if !commitSig.Absent() {
This checks for BlockIDFlagAbsent (no vote). Wouldn't BlockIDFlagNil (nil vote) also count as a miss?
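The distinction raised here can be made concrete with a small sketch. The `BlockIDFlag` type and constants below are declared locally for illustration and only reuse the names mentioned in the comment; their numeric values are an assumption, and `countsAsMissed` is a hypothetical helper, not a function from the codebase. It treats anything other than a vote for the committed block as a miss, which covers both the absent and the nil case.

```go
package main

import "fmt"

// BlockIDFlag is a local illustration of the flag discussed in the review.
// The numeric values are an assumption made for this example.
type BlockIDFlag byte

const (
	BlockIDFlagAbsent BlockIDFlag = iota + 1 // no vote was received
	BlockIDFlagCommit                        // voted for the committed block
	BlockIDFlagNil                           // voted for nil
)

// absentOnly mirrors a check that only looks for a missing vote.
func absentOnly(flag BlockIDFlag) bool {
	return flag == BlockIDFlagAbsent
}

// countsAsMissed is a hypothetical helper: a validator missed the block
// unless it voted for the block that was actually committed.
func countsAsMissed(flag BlockIDFlag) bool {
	return flag != BlockIDFlagCommit
}

func main() {
	flags := []struct {
		name string
		flag BlockIDFlag
	}{
		{"absent", BlockIDFlagAbsent},
		{"commit", BlockIDFlagCommit},
		{"nil", BlockIDFlagNil},
	}
	for _, f := range flags {
		fmt.Printf("%s: absentOnly=%t countsAsMissed=%t\n",
			f.name, absentOnly(f.flag), countsAsMissed(f.flag))
	}
}
```

The two predicates differ only for the nil vote, which is exactly the case the comment asks about.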
- follow up to #4263
- when a commit is nil, then it should be counted as a missed commit

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* prometheus/metrics: two new metrics for consensus
  - add consensus_validator_power metric so if a node is a validator it can see its own power in prometheus
  - add last_signed_height metric so if a node is a validator it can be aware of at which height the most recent time it signed.
  - closes: #3773
  - closes: #3083

  Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* check if signature is present
* minor change and check if sig is not absent
* add changelog entry
* Update consensus/state.go

  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update CHANGELOG_PENDING.md

  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update CHANGELOG_PENDING.md
* add label with validator address
* change address to vali_address
* add metric missed blocks
* add changelog entry for missed blocks metric
* address naming & add docs
- add consensus_validator_power metric so if a node is a validator it can see its own power in prometheus
- add last_signed_height metric so if a node is a validator it can be aware of which was the most recent height it signed.
- closes: Add Prometheus observability of last signed height per validator #3773
- closes: Add Prometheus metric tendermint_consensus_validator_power #3083

@mdyring

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>