Skip to content

prometheus/metrics: three new metrics for consensus#4263

Merged
tac0turtle merged 17 commits intomasterfrom
marko/3773_add_promethues_mets
Dec 19, 2019
Merged

prometheus/metrics: three new metrics for consensus#4263
tac0turtle merged 17 commits intomasterfrom
marko/3773_add_promethues_mets

Conversation

@tac0turtle
Copy link
Contributor

@tac0turtle tac0turtle commented Dec 18, 2019

@mdyring

Signed-off-by: Marko Baricevic marbar3778@yahoo.com

  • Referenced an issue explaining the need for the change
  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

- add consensus_validator_power metric so if a node is a validator it can see its own power in prometheus
- add last_signed_height metric so if a node is a validator it can be aware of at which height the most recent time it signed.

- closes: #3773
- closes: #3083

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
@tac0turtle tac0turtle added C:consensus Component: Consensus T:validator Type: Validator related T:observability Type: Observability labels Dec 18, 2019
@tac0turtle tac0turtle self-assigned this Dec 18, 2019
@tac0turtle tac0turtle marked this pull request as ready for review December 19, 2019 08:58
@melekes
Copy link
Contributor

melekes commented Dec 19, 2019

This also needs a changelog_pending entry.

tac0turtle and others added 3 commits December 19, 2019 10:20
Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
@melekes
Copy link
Contributor

melekes commented Dec 19, 2019

What makes these metrics validator specific? I do not see any additional labels. How would someone know last_signed_height is coming from validator A?

@melekes melekes self-requested a review December 19, 2019 09:45
@tac0turtle
Copy link
Contributor Author

add a label with the validator address to the added metrics

@mdyring
Copy link
Contributor

mdyring commented Dec 19, 2019

Thanks @marbar3778 for your work on this.

@leoluk has previously suggested having a "total missed block" counter per validator, I think that could be a valuable addition to detect missed blocks that occurs sporadically.

Also worth considering is providing these metrics for all validators in the set, not just the local one found in priv_validator.

Con: Consumes more resources to store/query 125 samples instead of 1.

Pro: Sentries can collect samples so there is a higher degree of certainty of getting accurate data, even if there is a network partition or similar restricting Prometheus from scraping the validator.

Perhaps a config option is the most sensible approach to this, but personally I'd be comfortable with collecting this data for the current 125 validators.

@tac0turtle
Copy link
Contributor Author

tac0turtle commented Dec 19, 2019

I can add the missing of blocks to this pr for individual validators but to keep the scope smaller, it would be good to add the second part, adding these metrics for the entire set, to a followup PR.

edit: added here: #1791 (comment)

@tac0turtle tac0turtle mentioned this pull request Dec 19, 2019
@tac0turtle tac0turtle changed the title prometheus/metrics: two new metrics for consensus prometheus/metrics: three new metrics for consensus Dec 19, 2019
@tac0turtle tac0turtle merged commit 7368ba3 into master Dec 19, 2019
@tac0turtle tac0turtle deleted the marko/3773_add_promethues_mets branch December 19, 2019 17:04
"validator_address", privValAddress.String(),
}
cs.metrics.ValidatorPower.With(label...).Set(float64(val.VotingPower))
if !commitSig.Absent() {
Copy link
Contributor

@leoluk leoluk Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks for BlockIDFlagAbsent (no vote). Wouldn't BlockIDFlagNil (nil vote) also count as a miss?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#4283 did the follow up here

tac0turtle added a commit that referenced this pull request Jan 6, 2020
- follow up to #4263
- when a commit is nil, then it should be counted as a missed commit

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
melekes pushed a commit that referenced this pull request Jan 7, 2020
- follow up to #4263
- when a commit is nil, then it should be counted as a missed commit

Signed-off-by: Marko Baricevic marbar3778@yahoo.com
tac0turtle added a commit that referenced this pull request Jan 8, 2020
* prometheus/metrics: two new metrics for consensus

- add consensus_validator_power metric so if a node is a validator it can see its own power in prometheus
- add last_signed_height metric so if a node is a validator it can be aware of at which height the most recent time it signed.

- closes: #3773
- closes: #3083

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>

* check if signature is present

* minor change and check if sig is not absent

* add changelog entry

* Update consensus/state.go

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>

* Update CHANGELOG_PENDING.md

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>

* Update CHANGELOG_PENDING.md

* add label with validator address

* change address to vali_address

* add metric missed blocks

* add changelog entry for missed blocks metric

* address naming & add docs
tac0turtle added a commit that referenced this pull request Jan 8, 2020
- follow up to #4263
- when a commit is nil, then it should be counted as a missed commit

Signed-off-by: Marko Baricevic marbar3778@yahoo.com
@tac0turtle tac0turtle mentioned this pull request Jan 9, 2020
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C:consensus Component: Consensus T:observability Type: Observability T:validator Type: Validator related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Prometheus observability of last signed height per validator Add Prometheus metric tendermint_consensus_validator_power

5 participants