Add optional TLS support to /metrics endpoint by peppi-lotta · Pull Request #7255 · coredns/coredns

peppi-lotta · 2025-04-15T09:43:09Z

1. Why is this pull request needed?

It adds optional TLS support to the CoreDNS metrics endpoint, allowing metrics to be served over HTTPS. This improves security by encrypting metrics traffic. TLS is disabled by default, so existing configurations continue to work.

2. Related issues

#7109

3. Documentation changes

Update metrics plugin docs to include TLS settings and examples
Add instructions for providing certificates and configuring Prometheus to scrape over HTTPS
Add troubleshooting notes

4. Backward compatibility

No breaking changes. TLS is optional and off by default.

peppi-lotta · 2025-04-16T11:43:33Z

@chrisohaver @SuperQ Would you have time to review? :)

kashifest · 2025-04-22T08:10:23Z

@chrisohaver @SuperQ can you please take a look at this PR? It would be nice to get it moving forward.

codecov · 2025-04-25T19:02:34Z

Codecov Report

❌ Patch coverage is 60.46512% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.19%. Comparing base (93c57b6) to head (8c5515c).
⚠️ Report is 1706 commits behind head on master.

Files with missing lines     Patch %   Lines
plugin/metrics/setup.go      0.00%     11 Missing and 1 partial ⚠️
plugin/metrics/metrics.go    83.87%    4 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #7255      +/-   ##
==========================================
+ Coverage   55.70%   63.19%   +7.49%     
==========================================
  Files         224      278      +54     
  Lines       10016    15129    +5113     
==========================================
+ Hits         5579     9561    +3982     
- Misses       3978     4879     +901     
- Partials      459      689     +230

kashifest · 2025-05-05T10:27:52Z

It would be really nice to have someone review this and get the discussion rolling.

amshankaran · 2025-05-05T12:11:15Z

It would be helpful if someone could take a look at this.

peppi-lotta · 2025-05-27T06:31:55Z

@jameshartig @miekg @SuperQ @greenpau @Tantalor93 Can I get a review on this? What are the next steps needed to get this merged?

miekg · 2025-06-19T14:32:11Z

I consider TLS on metric endpoints to be an anti feature

JanMkl · 2025-06-19T16:13:53Z

> I consider TLS on metric endpoints to be an anti feature

I understand the desire to minimise complexity—especially on internal metrics endpoints—but in a zero-trust architecture, all network traffic, regardless of source or function, must be both authenticated and encrypted.

Exposed or tampered metrics may seem low-impact, but they can lead to unpredictable or harmful behavior. For example:

Unencrypted metrics can reveal service topology and workload characteristics.
Falsified metrics can trigger alerts or automated remediation.

In multi-tenant or regulated environments where the network is not implicitly trusted, TLS is not an anti-feature—it’s a baseline security control. Even when the content isn’t sensitive, preserving integrity and controlling access are essential.

To be clear, the PR does not propose mandatory TLS—it introduces optional support, as indicated in the title: “Add optional TLS support to /metrics endpoint”. Rejecting TLS outright disregards the realities of deployment models built on zero-trust principles.

peppi-lotta · 2025-10-21T10:07:37Z

I've simplified this TLS implementation quite a bit. It still uses the Prometheus exporter-toolkit, but now in the intended way. It no longer creates any temp files (the concern brought up in this comment: #7255 (comment)).

@jameshartig @miekg @SuperQ @greenpau @Tantalor93 Can I get a review on this? What are the next steps needed to get this merged?

kashifest

LGTM, it looks OK. It would be nice to have someone with review rights take a look at this.

kashifest · 2025-11-17T08:59:00Z

@johnbelamaric would you please take a look at this one?

yongtang · 2025-12-04T01:10:58Z

@SuperQ any additional feedback on the updated PR?

kashifest · 2026-01-12T08:08:58Z

@johnbelamaric @SuperQ Can we please decide how to proceed with this PR?

johnbelamaric · 2026-03-06T19:13:30Z

@peppi-lotta can you rebase?

peppi-lotta · 2026-03-09T08:59:29Z

@johnbelamaric I've rebased now :)

johnbelamaric · 2026-03-09T17:01:16Z

+	select {
+	case err := <-startResult:
+		return err
+	case <-time.After(200 * time.Millisecond):


I don't love this. Is there instead a status or something we can check on the control structure?

terror96 · 2026-03-11

Hi @johnbelamaric, joining the discussion.

I have been discussing this with @peppi-lotta, and I think there is a way to avoid waiting a predefined time for the service to be ready. I also agree that a timer-based approach can be problematic for various reasons. The ServeTLS method in net/http, which web.Serve ultimately calls, unfortunately does not provide a usable hook for checking service status. I don't think building an additional health-check mechanism that sends probe messages is a feasible solution either.

However, we know that ServeTLS essentially follows two paths: it either exits with an error or it calls Accept. ServeTLS creates a new listener but uses the listener argument to accept connections. Therefore, we can wrap the listener passed to web.Serve like this, e.g.:

// startupListener wraps a net.Listener to detect when Accept() is first called.
type startupListener struct {
	readyOnce sync.Once
	ready     chan struct{}
	net.Listener
}

func newStartupListener(l net.Listener) *startupListener {
	return &startupListener{
		Listener: l,
		ready:    make(chan struct{}),
	}
}

func (sl *startupListener) Accept() (net.Conn, error) {
	// Signal ready on the first Accept() call (the server is running).
	sl.readyOnce.Do(func() { close(sl.ready) })
	return sl.Listener.Accept()
}

func (sl *startupListener) Ready() <-chan struct{} {
	return sl.ready
}

...

ln, err := reuseport.Listen("tcp", m.Addr)
sl := newStartupListener(ln)
m.ln = sl

...

// web.Serve blocks, so run it in a goroutine and capture its error.
serverErr := make(chan error, 1)
go func() { serverErr <- web.Serve(sl, server, webConfig, logger) }()

// Wait for the server to be ready (first Accept() called) or to fail.
select {
case <-sl.Ready():
	log.Println("Server is ready and accepting connections")
case err := <-serverErr:
	log.Fatalf("Server failed to start: %v", err)
}

What do you think?

peppi-lotta · 2026-03-11

I pushed a commit implementing the suggested approach. It seems to work well :)
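The readiness idea discussed in this thread can be tried in isolation with only the standard library. The following is a minimal sketch, not the code merged in this PR: it uses net/http directly instead of web.Serve, and the helper names waitForStartup, startPlain and startBrokenTLS are hypothetical, as are the certificate file names. It shows both paths described above: the server reaches Accept (ready), or ServeTLS returns an error before Accept is ever called.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"sync"
)

// startupListener closes ready the first time Accept is called, which
// happens only after the server has finished its setup.
type startupListener struct {
	net.Listener
	readyOnce sync.Once
	ready     chan struct{}
}

func newStartupListener(l net.Listener) *startupListener {
	return &startupListener{Listener: l, ready: make(chan struct{})}
}

func (sl *startupListener) Accept() (net.Conn, error) {
	sl.readyOnce.Do(func() { close(sl.ready) })
	return sl.Listener.Accept()
}

// waitForStartup is a hypothetical helper: it runs serve in a goroutine and
// returns nil once Accept has been called, or the error serve returned first.
func waitForStartup(sl *startupListener, serve func(net.Listener) error) error {
	errCh := make(chan error, 1) // buffered so the goroutine never blocks
	go func() { errCh <- serve(sl) }()
	select {
	case <-sl.ready:
		return nil
	case err := <-errCh:
		return err
	}
}

// startPlain serves plain HTTP on an ephemeral port, so startup succeeds.
func startPlain() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{}
	defer srv.Close() // also closes the listener once startup was observed
	return waitForStartup(newStartupListener(ln), srv.Serve)
}

// startBrokenTLS points ServeTLS at files that are assumed not to exist, so
// it returns an error before Accept is ever called.
func startBrokenTLS() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	srv := &http.Server{}
	return waitForStartup(newStartupListener(ln), func(l net.Listener) error {
		return srv.ServeTLS(l, "missing.crt", "missing.key")
	})
}

func main() {
	fmt.Println("plain HTTP startup error:", startPlain())
	fmt.Println("broken TLS startup error:", startBrokenTLS())
}
```

Because the select waits on a channel closed by Accept rather than on a timer, the caller learns about readiness as soon as it happens and never has to guess how long startup takes.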

johnbelamaric · 2026-03-09T17:02:22Z

+		WebConfigFile:      &m.tlsConfigPath,
+	}
+
+	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))


I assume we need this particular object for the prometheus exporter? How does this interplay with our existing logging?

peppi-lotta · 2026-03-11

Yes, the Prometheus exporter-toolkit requires a slog.Logger to be passed to it. I think the two interplay nicely: there is not much logging when the TLS setup succeeds, as can be seen below.

$ kubectl logs -n kube-system coredns-67ddfdcbd-6lhfq
maxprocs: Leaving GOMAXPROCS=22: CPU quota undefined
time=2026-03-11T06:43:05.148Z level=INFO msg="Listening on" address=[::]:9153
time=2026-03-11T06:43:05.148Z level=INFO msg="TLS is enabled." http2=true address=[::]:9153
.:53
[INFO] plugin/reload: Running configuration SHA512 = 815c08e687bdc067019d64a774daf517a0fd3bb19531a7596741f57e78e3abb3dade4ae3bbe4898839f7bab7622c2bb91b27ce2f2c92d52b53e992dcd26ecbad
CoreDNS-1.14.2
linux/amd64, go1.25.5,
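To see where the key=value lines in that log come from, here is a small sketch of the same kind of logger, slog.New with a slog.NewTextHandler. It is illustrative only: the PR writes to os.Stderr with nil options, whereas this sketch writes to a buffer and drops the time attribute so the output is stable. The render helper is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// render logs one INFO record through a slog text handler and returns the
// formatted line. The time attribute is dropped only to keep output stable.
func render(msg string, args ...any) string {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded by the handler
			}
			return a
		},
	}
	slog.New(slog.NewTextHandler(&buf, opts)).Info(msg, args...)
	return buf.String()
}

func main() {
	// Prints: level=INFO msg="Listening on" address=[::]:9153
	fmt.Print(render("Listening on", "address", "[::]:9153"))
	// Prints: level=INFO msg="TLS is enabled." http2=true address=[::]:9153
	fmt.Print(render("TLS is enabled.", "http2", true, "address", "[::]:9153"))
}
```

The text handler's key=value format differs from the `[INFO] plugin/...` lines CoreDNS emits itself, which is why the two styles appear side by side in the pod log.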

johnbelamaric · 2026-03-12T20:49:09Z

thanks!

* Use exporter-toolkit to enable optional TLS encryption on /metrics endpoint

Signed-off-by: peppi-lotta <peppi-lotta.saari@est.tech>

* Implement startup listener to signal server readiness

Signed-off-by: peppi-lotta <peppi-lotta.saari@est.tech>

---------

Signed-off-by: peppi-lotta <peppi-lotta.saari@est.tech>

peppi-lotta requested review from SuperQ, Tantalor93, chrisohaver, greenpau, jameshartig, johnbelamaric, miekg, stp-ip and yongtang as code owners April 15, 2025 09:43

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch 8 times, most recently from 89c8dbf to 7ac7560 Compare April 15, 2025 10:51

github-advanced-security AI found potential problems Apr 25, 2025

Comment thread plugin/metrics/setup.go Fixed

SuperQ reviewed May 5, 2025

Comment thread plugin/metrics/metrics.go Outdated

nuhakala mentioned this pull request May 26, 2025

Enable optional TLS on nodecache metrics endpoint kubernetes/dns#694

Open

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from 7ac7560 to f12b5dd Compare May 26, 2025 11:40

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from f12b5dd to ce041ac Compare June 2, 2025 09:06

peppi-lotta mentioned this pull request Jun 30, 2025

Add Config to FlagConfig struct prometheus/exporter-toolkit#332

Open

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from ce041ac to 8f81c13 Compare September 22, 2025 09:37

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch 3 times, most recently from 9239ecc to 2b84a94 Compare October 21, 2025 07:23

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from 2b84a94 to 8c5515c Compare October 27, 2025 11:15

kashifest approved these changes Oct 29, 2025

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch 3 times, most recently from 1fbe753 to cd49281 Compare November 11, 2025 08:09

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from cd49281 to 5c93f42 Compare December 8, 2025 12:10

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from 5c93f42 to 839f096 Compare January 13, 2026 14:39

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from 839f096 to ef076fe Compare March 9, 2026 06:24

johnbelamaric reviewed Mar 9, 2026

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from ef076fe to b10bd56 Compare March 11, 2026 06:45

Use exporter-toolkit to enable optional TLS encryption on /metrics endpoint

8ee9ace

Signed-off-by: peppi-lotta <peppi-lotta.saari@est.tech>

peppi-lotta force-pushed the peppi-lotta/metrics-tls-support branch from b10bd56 to 8ee9ace Compare March 11, 2026 06:50

Implement startup listener to signal server readiness

394acb8

Signed-off-by: peppi-lotta <peppi-lotta.saari@est.tech>

johnbelamaric merged commit 7ff001d into coredns:master Mar 12, 2026
11 checks passed

BrewTestBot mentioned this pull request Apr 22, 2026

coredns 1.14.3 Homebrew/homebrew-core#278880

Merged

nuhakala mentioned this pull request Jun 8, 2026

Add optional TLS support to metrics endpoint kubernetes-sigs/node-local-dns#27

Open
