prometheus: improve UX, add grafana, node_exporter, custom dashboards #82656
Merged
craig[bot] merged 1 commit into cockroachdb:master on Jun 9, 2022
erikgrinaker
approved these changes
Jun 9, 2022
Contributor
erikgrinaker
left a comment
Definitely agree that this belongs in roachprod, but seems useful enough as an intermediate step. Only skimmed the code itself, given the ad hoc nature.
We already had the ability to deploy a Prometheus instance to a node in
the cluster. However, to run experiments or long investigations[^1] we
often need a Grafana instance with the dashboards du jour. This commit
dramatically cuts down on the manual steps needed to get this set up.
All it takes is adding setup like this to the roachtest:
```
// Every node except the last is a cluster node; the last node runs the
// workload and also hosts Prometheus.
clusNodes := c.Range(1, c.Spec().NodeCount-1)
workloadNode := c.Node(c.Spec().NodeCount)
promNode := workloadNode

cfg := (&prometheus.Config{}).
	WithCluster(clusNodes).
	WithPrometheusNode(promNode).
	WithGrafanaDashboard("https://gist.githubusercontent.com/tbg/f238d578269143187e71a1046562225f/raw").
	WithNodeExporter(clusNodes).
	// Two workload endpoints on the workload node, on ports 2112 and 2113.
	WithWorkload(workloadNode, 2112).
	WithWorkload(workloadNode, 2113)

p, saveSnap, err := prometheus.Init(
	ctx,
	*cfg,
	c,
	t.L(),
	repeatRunner{C: c, T: t}.repeatRunE,
)
require.NoError(t, err)
// Save into the test's artifacts directory when the test finishes.
defer saveSnap(ctx, t.ArtifactsDir())
```
There has been talk[^2] of adding some of this tooling to `roachprod`.
That is probably a good idea, but we could pour an infinite amount of work
into this. For now I think this is a good stepping stone, and it satisfies my
immediate needs.
[^1]: cockroachdb#82109
[^2]: [internal slack](https://cockroachlabs.slack.com/archives/CAC6K3SLU/p1654267035695569?thread_ts=1654153265.215669&cid=CAC6K3SLU)
Release note: None
Member
Author
bors r=erikgrinaker
Contributor
Build succeeded
nicktrav
reviewed
Jun 9, 2022
Collaborator
nicktrav
left a comment
Reviewed after the fact, but LGTM. Thank you!
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained