Skip to content

Allow non-default registry to be used for metrics of SD components#13023

Merged
beorn7 merged 6 commits intoprometheus:mainfrom
ptodev:prefer-to-not-register-metrics-globally-sd
Dec 12, 2023
Merged

Allow non-default registry to be used for metrics of SD components#13023
beorn7 merged 6 commits intoprometheus:mainfrom
ptodev:prefer-to-not-register-metrics-globally-sd

Conversation

@ptodev
Copy link
Copy Markdown
Contributor

@ptodev ptodev commented Oct 23, 2023

This PR makes it possible to register metrics of SD components using a registry other than the global one. This is going to be very useful for applications such as Grafana Agent which import Prometheus SDs from the Prometheus repo and instantiate more than one SD per process.

This PR is a follow-up from #12565. This PR does for SD mechanisms what #12958 does for the "scrape" package.

Note that the Kubernetes SD mechanism registers metrics from the Go k8s client. Those metrics are per process and not per SD instance, so I moved them outside of the SD mechanism. They are now registered when the SD Manager metrics are registered, once per process.

Also note that since we register the metrics during the Run() function, we are not able to use the metrics during the SD creation because the Run() function hasn't ran yet. I believe this will not be a problem.

I successfully tested the change locally:

  • I used file_sd_configs and kubernetes_sd_configs. I see all the k8s metrics and file SD metrics whenever those SDs are enabled in the config.
  • I tested Prometheus with both the --enable-feature=new-service-discovery-manager cmd option and without. In both cases I get the same metrics.
  • I tried reloading the Prometheus config using a pkill -SIGHUP -i prometheus command. This worked ok, Prometheus reloaded the config and kept running, and the same metrics were visible.

I did not test promtool.

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Oct 24, 2023

I'll have a look ASAP. Could you fix the DCO and the lint issues in the meantime?

@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch from f2c25df to bfeac67 Compare October 24, 2023 19:03
@ptodev
Copy link
Copy Markdown
Contributor Author

ptodev commented Oct 24, 2023

I'll have a look ASAP. Could you fix the DCO and the lint issues in the meantime?

Thank you! The CI is green now.

Copy link
Copy Markdown
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epic work. Thank you very much. I think you can avoid having the separate registration of "shared" metrics by using ConstLabels or WrapRegistererWith. Feel free to ask if you would like to discuss this approach more. Let's try this out before I do the detailed review.

@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch 2 times, most recently from 0c04f6a to 8d8894b Compare November 15, 2023 17:14
@ptodev ptodev requested a review from beorn7 November 15, 2023 17:17
@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch 2 times, most recently from f801b57 to f50b965 Compare November 23, 2023 10:04
@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Nov 29, 2023

Thank you very much. I hope I'll get to this tomorrow.

@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch from f50b965 to 7f71a10 Compare November 29, 2023 23:29
@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Dec 5, 2023

I'm still trying very hard to get to this. In the meantime, could you address the lint issues? They seem easy to fix.

@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch 3 times, most recently from 2e5c9f8 to 1d48f6a Compare December 6, 2023 09:25
@ptodev
Copy link
Copy Markdown
Contributor Author

ptodev commented Dec 6, 2023

I'm still trying very hard to get to this. In the meantime, could you address the lint issues? They seem easy to fix.

Yes, thank you, this is done now.

Copy link
Copy Markdown
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. I have just one concern about the "late WithLabelValues calls inside the anonymous functions in the K8s SD. (I think I sent the first comment directly and not part of this review.) If you have done this on purpose and already know that the performance impact doesn't matter, then fine. Otherwise, I would recommend to do the WithLabelValues call once "outside the loop".

And then there is a merge conflict.

But once those two things are addressed, we can finally merge. Thank you very much.

Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
This was flagged by the linter.

Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch from 1d48f6a to 5be085c Compare December 11, 2023 13:19
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch from 5be085c to 27bb57a Compare December 11, 2023 13:39
Signed-off-by: Paulin Todev <paulin.todev@gmail.com>
@ptodev ptodev force-pushed the prefer-to-not-register-metrics-globally-sd branch from 3d2e1fc to d2e9970 Compare December 11, 2023 14:28
@ptodev ptodev requested a review from beorn7 December 12, 2023 10:39
@ptodev
Copy link
Copy Markdown
Contributor Author

ptodev commented Dec 12, 2023

@beorn7 Thank you for the review! I addressed the issues which you mentioned. Also, the branch is rebased.


svcAddCount := eventCount.WithLabelValues(RoleService.String(), MetricLabelRoleAdd)
svcUpdateCount := eventCount.WithLabelValues(RoleService.String(), MetricLabelRoleUpdate)
svcDeleteCount := eventCount.WithLabelValues(RoleService.String(), MetricLabelRoleDelete)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beorn7 I thought it'd be nice to define string constants such as "endpoints", "endpointslice", "add", "update", and "delete" only once and use them as constants instead of rewriting the string. I hope you don't mind.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are different camps in Prometheus land. Some people argue, if you see a label value in your metrics, you should be able to find it directly in the code with grep. I don't feel strongly, and I understand that string constants are matching most developers' exceptions much better.

Copy link
Copy Markdown
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much. Epic work.

@beorn7 beorn7 merged commit 9825a80 into prometheus:main Dec 12, 2023
@bwplotka
Copy link
Copy Markdown
Member

NOTE: It seems this PR might have broke something: #13312

marctc added a commit to grafana/prometheus that referenced this pull request Jan 4, 2024
…-register-metrics-globally-sd"

This reverts commit 9825a80, reversing
changes made to 669cad6.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants