CFP: Add mTLS support for agent and operator metrics servers #38990

@ulkub

Cilium Feature Proposal

Is your proposed feature related to a problem?

Yes. We need to encrypt all of our Cilium metrics traffic. The Cilium agent and operator currently have no mTLS support for their metrics servers, whereas the Hubble metrics server does. We need mTLS support added to the agent and operator to reach parity with Hubble.

Describe the feature you'd like

We'd like the agent and operator metrics servers to support mTLS with automatic certificate rotation.

(Optional) Describe your proposed solution

Hubble metrics server already has mTLS support. Similar work could be done for agent and operator.

In our case, we need strict TLS; we don't want to fail open. The open question is: when TLS is enabled but the metrics server cannot be started with TLS, should the metrics server simply not be started (i.e., metrics are not exposed), or should the process crash?
Whichever approach is chosen would also be applied to the Hubble metrics server.
If not exposing metrics is the sensible choice, we can manage this with an additional flag, EnableStrictTLS. If both 'EnableMetricsServerTLS' and 'EnableStrictTLS' are 'true' and TLS cannot be configured or started, the metrics server will not be started.
The flags needed for this could look like:

// EnableMetricsServerTLS enables TLS for the metrics server.
EnableMetricsServerTLS bool

// EnableStrictTLS makes sure that the metrics server is not started if TLS
// is enabled but could not be configured or started.
EnableStrictTLS bool

// MetricsServerTLSCertFile specifies the path to the public key file for
// the metrics server. The file must contain PEM encoded data.
MetricsServerTLSCertFile string

// MetricsServerTLSKeyFile specifies the path to the private key file for
// the metrics server. The file must contain PEM encoded data.
MetricsServerTLSKeyFile string

// MetricsServerTLSClientCAFiles specifies the path to one or more client
// CA certificates to use for TLS with mutual authentication (mTLS) on the
// metrics server. The files must contain PEM encoded data.
MetricsServerTLSClientCAFiles []string

The certificates we use are valid for 6 months and then need renewal. We will therefore also include automatic certificate rotation. This is required for Hubble metrics as well.

We have a working strict TLS support solution with certificate rotation for Cilium 1.13. A work-in-progress for the current head is here

Metadata

Labels

area/metrics: Impacts statistics / metrics gathering, e.g. via Prometheus.
kind/cfp: Cilium Feature Proposal
kind/feature: This introduces new functionality.
stale: The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
