Kubelet serving CSR never created #130001

### What happened?

We occasionally see the kubelet never obtain its server certificate, even though `serverTLSBootstrap: true` is set.
We run an auto-approver for the server certificates and detect the problem while waiting for the certificate to appear in `/var/lib/kubelet/pki/kubelet-server-current.pem`. When we hit it, not even the CSR is created.
It started happening after we upgraded to v1.32, so we suspect a change introduced in v1.32.
Restarting the kubelet does not help. It happily continues without the server certificate, even though it is configured to bootstrap it.

In healthy clusters, we see this:

```console
$ kubectl get csr
NAME        AGE   SIGNERNAME                                    REQUESTOR                           REQUESTEDDURATION   CONDITION
csr-8dql4   29s   kubernetes.io/kubelet-serving                 system:node:lennart-kubelet-debug   <none>              Pending
csr-r8ztr   17s   kubernetes.io/kubelet-serving                 system:node:lennart-kubelet-debug   <none>              Pending
csr-xtwpf   30s   kubernetes.io/kube-apiserver-client-kubelet   system:node:lennart-kubelet-debug   <none>              Approved,Issued
```

In faulty clusters, there is just the client CSR.
~~We are not sure why two CSRs are created for the server either, but it works when we approve the newer of them.~~

Edit: The two server CSRs appear because kubeadm restarts the kubelet as part of initialization, which is how the kubelet picks up the client certificate. After the restart, the kubelet does not "remember" its old server CSR, so it creates a new one. This is not an issue.

Here is the kubelet config from an affected node: [kubelet-config.txt](https://github.com/user-attachments/files/18688742/kubelet-config.txt)

### What did you expect to happen?

When configured, the kubelet should always create the server CSR.

### How can we reproduce it (as minimally and precisely as possible)?

Unfortunately we do not have any guaranteed reproduction steps.
In general, this is what is needed to trigger the issue:

1. Kubelet is configured to rotate certificates (default) and to bootstrap a server certificate (`serverTLSBootstrap: true`).
2. Node status is stable, i.e. there are no changes of any kind that would cause the kubelet to update the status.
3. Kubelet is restarted and the server certificate does not exist (e.g. because it was removed or it wasn't issued yet).

In this specific case you should be able to see that the CSR is not created:

```console
$ kubectl get csr
NAME        AGE   SIGNERNAME                                    REQUESTOR                           REQUESTEDDURATION   CONDITION
csr-2mq8h   22s   kubernetes.io/kube-apiserver-client-kubelet   system:node:lennart-kubelet-debug   <none>              Approved,Issued
$ kubectl get nodes
NAME                    STATUS   ROLES           AGE   VERSION
lennart-kubelet-debug   Ready    control-plane   28s   v1.32.1
```
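
To check for this state without reading the table by eye, the CSR list can be filtered by signer. The following is a rough sketch that assumes the input is the output of `kubectl get csr -o json`; `servingCSRs` and the `csrList` type are hypothetical names introduced here, and only the `spec.signerName` and `spec.username` fields of the CSR objects are inspected.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

const servingSigner = "kubernetes.io/kubelet-serving"

// csrList mirrors only the fields of the CSR list that this sketch needs.
type csrList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			SignerName string `json:"signerName"`
			Username   string `json:"username"`
		} `json:"spec"`
	} `json:"items"`
}

// servingCSRs returns the names of all kubelet-serving CSRs that were
// requested by the given node.
func servingCSRs(data []byte, nodeName string) ([]string, error) {
	var list csrList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	var names []string
	for _, item := range list.Items {
		if item.Spec.SignerName == servingSigner &&
			item.Spec.Username == "system:node:"+nodeName {
			names = append(names, item.Metadata.Name)
		}
	}
	return names, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: kubectl get csr -o json | <program> NODE_NAME")
		os.Exit(2)
	}
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
		os.Exit(1)
	}
	names, err := servingCSRs(data, os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "parsing CSR list:", err)
		os.Exit(1)
	}
	if len(names) == 0 {
		fmt.Println("no kubelet-serving CSR found for node", os.Args[1])
		os.Exit(1)
	}
	fmt.Println("kubelet-serving CSRs:", names)
}
```

On a faulty node such as the one above, the filter returns no names, since only the `kubernetes.io/kube-apiserver-client-kubelet` CSR exists.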

### Anything else we need to know?

I believe the culprit is https://github.com/kubernetes/kubernetes/pull/128640: not the randomization of the interval, but the change in how the [zero case for the lastStatusReportTime](https://github.com/kubernetes/kubernetes/pull/128640/files#diff-f39bb531745e5658a65143c9054c3fa92aa3ac9f41e9e627ac61b7ff08808153R607) is handled.
A zero `lastStatusReportTime` used to be treated as expired and triggered an update. After the PR, it is treated as not expired, so no update is triggered.
As a result, when nothing else changes the node status, `patchNodeStatus` is never called and consequently neither is `setLastObservedNodeAddresses`.
The latter initializes the kubelet's internal node address state (not the node status).
The server certificate manager relies on this internal state to create the CSR. As long as it is `nil`, i.e. as long as the node status hasn't been patched, it cannot progress.

### Kubernetes version

<details>

```console
$ kubectl version
Client Version: v1.32.1
Kustomize Version: v5.5.0
Server Version: v1.32.1
```

</details>


### Cloud provider

<details>
Happens both with and without an external cloud provider.
</details>

