
I fail to see why Kubernetes needs a pod selector in a Deployment definition that can only contain one pod template. Feel free to educate me on why the Kubernetes engineers introduced a selector statement inside the Deployment definition instead of automatically selecting the pods from the template.

---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service

spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment

spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0

  selector:
    matchLabels:
      app: grpc-test

  template:
    metadata:
      labels:
        app: grpc-test

    spec:
      containers:
      ...

Why not simply define something like this?

---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service

spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment

spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0

  template:
    metadata:
      labels:
        app: grpc-test

    spec:
      containers:
      ...

5 Answers


Ah! Funny enough, I once tried to wrap my head around the concept of label selectors as well. So, here it goes...

First of all, what the hell are these labels used for? Labels within Kubernetes are the core means of identifying objects. A controller controls pods based on their labels rather than their names. In this particular case they are meant to identify the pods belonging to the deployment’s ReplicaSet.

You actually didn’t have to explicitly define .spec.selector when using extensions/v1beta1; in that case it would default from .spec.template.labels. However, if you don’t, you can run into problems with kubectl apply once one or more of the labels used for selecting change, because kubectl apply compares changes against kubectl.kubernetes.io/last-applied-configuration, and that annotation only contains the user’s input at the time the resource was created and none of the defaulted fields. You’ll get an error like the following because it cannot calculate the diff:

spec.template.metadata.labels: Invalid value: {"app":"nginx"}: `selector` does not match template `labels`

As you can see, this is a pretty big shortcoming, since it means you cannot change any of the labels that are being used as selector labels without completely breaking your deployment flow. It was “fixed” in apps/v1beta2 by requiring selectors to be explicitly defined and disallowing mutation of those fields.

So in your example, you actually don’t have to define them! The creation will work and will use your .spec.template.labels by default. But yeah, in the near future when you have to use v1beta2, the field will be mandatory. I hope this kind of answers your question and I didn’t make it any more confusing ;)
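For reference, here is a minimal sketch of roughly what the same Deployment looks like once the selector is mandatory (written against apps/v1, which later superseded apps/v1beta2; the container name and image are placeholders, not taken from the original question):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  # Required in apps/v1: must match .spec.template.metadata.labels
  # and cannot be changed after the Deployment is created.
  selector:
    matchLabels:
      app: grpc-test
  template:
    metadata:
      labels:
        app: grpc-test
    spec:
      containers:
      - name: grpc-test                        # placeholder container name
        image: example.com/grpc-test:latest    # placeholder image
        ports:
        - containerPort: 8080

Because the selector is immutable in apps/v1, changing the labels you select on means creating a new Deployment rather than mutating the existing one.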


9 Comments

Came here via Google, and I didn't find this answer very satisfying. I can totally see the motivation for using labels with services, node selectors etc. However, Kubernetes prescribes that the pod template itself must match the selector, so the selector can only be looser than the assigned labels. That liberty, however, is of no value, since Kubernetes also prescribes that deployment selectors in a namespace must not overlap. So the actual question is: why even have the liberty of specifying a selector, and not force an autogenerated pod label & selector such as owning-deployment=<deployment-uid>?
The question was what the purpose of a selector label is and why it is allegedly required in the deployment spec. I believe I answered that question. Regarding your question, label selectors were not meant to provide uniqueness; that's what the UID is for. They only provide a means of grouping objects, and those objects don't necessarily have to be unique within their kind. But yeah, in this case there is indeed not much sense in having it configurable and open to human error, as you don't want your deployment to control more pods than the ones from your spec.
@ToonLamberigts that was not the question. The OP is also saying that the selector is mandatory and has only one possible value, and therefore it should be generated automatically by Kubernetes instead of asking the user to duplicate information. So the OP's question is why a selector must be defined explicitly when the only possible definition of the selector is the labels in the template. Your answer definitely does not answer that.
@user2896438, A deployment doesn't actually directly manage its pods; that might have been confusing from my explanation. A deployment creates a ReplicaSet whose purpose is to maintain a "set" of "replicas" of said deployment. Every pod owned by a ReplicaSet gets the unique "metadata.ownerReferences" field with the ID of that ReplicaSet. The interesting thing here is that when there is a pod that matches the selector labels of the RS without having a valid ownerReferences field, the RS will adopt that pod as part of its set. So in theory you could work with that ;)
I'm with @misberner, I don't think this really answers the core of the question. This is like asking the DMV "why do I need to fill out ten different forms to get a driver's license?" and the DMV answering "well if you don't fill out form X then department Y won't know that you ....". An answer like that is fundamentally more of a description of the current state of affairs, but the question is really "why can't things be organized such that a simple interface like ... would work?".

However, if you don’t, you can run into problems with kubectl apply once one or more of the labels used for selecting change, because kubectl apply compares changes against kubectl.kubernetes.io/last-applied-configuration, and that annotation only contains the user’s input at the time the resource was created and none of the defaulted fields.

Quoting from Toon's answer.

My interpretation is that it's not logically necessary at all. It's only due to a limitation of the current Kubernetes implementation: it has some weird "behavior" in that the functionality it uses to "compare" two deployments/objects does not take "default values" into account.
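To make that concrete, here is a rough sketch of what the relevant annotation could look like on a Deployment created with kubectl apply (the JSON is normally stored as a single line; it is wrapped and abbreviated here for readability, and the point is simply that it records only the fields the user submitted, so a defaulted selector never appears in it):

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
  annotations:
    # Written by kubectl apply; reflects the submitted manifest only,
    # not server-side defaults such as an inferred selector.
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Deployment",
       "metadata":{"name":"grpc-deployment","annotations":{}},
       "spec":{"replicas":1,
               "template":{"metadata":{"labels":{"app":"grpc-test"}},
                           "spec":{"containers":[...]}}}}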

Comments


It is a method to decouple the ReplicaSet type from the Pod type. There are many similar answers here, but the crux of it is that a deployment/replicaset may be changed at a future point in time, and it won't know what the previous selector was for the last revision. It would have to look at the last revision's template.metadata.labels and then recursively apply those pod labels as the current revision's selector. But wait! What if the template.metadata.labels in the current revision changes? Now how do you reconcile two template.metadata.labels label sets if the new spec doesn't include the same label(s) as the prior revision from which the matchLabels was inferred?

Consider inferred matchLabels:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: grpc-test
    spec:
      containers:
      ...

Now if I were to go and revise this deployment, the client side has no awareness of the inferred matchLabels, so my changes would need to account for existing pods. The server side could do some magic to assume the context in a diff, but what if I changed my template.metadata.labels:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: grpc-test-new
    spec:
      containers:
      ...

Now my deployment would need to both infer the new template.metadata.labels and merge them with the existing server-side state, or else you end up orphaning a bunch of pods.

I hope this helps illustrate a scenario where explicitly defining the selector allows you to be more flexible in your template updates while still retaining the revision history of previous selectors.
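As a rough sketch of that flexibility (written against apps/v1; the extra version label is purely illustrative and not from the original manifests), the selector can stay pinned to a stable key while the template labels evolve between revisions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-test          # stays fixed across revisions
  template:
    metadata:
      labels:
        app: grpc-test        # must still satisfy the selector
        version: v2           # illustrative label that may change per revision
    spec:
      containers:
      ...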

Comments


As far as I know, the selector in the deployment is an optional property.

The template is the only required field of spec.

So you don't need to use the label selector in the deployment, and in your example I don't see why you couldn't use the latter variant.

2 Comments

That's true of extensions/v1beta1 Deployments, but not apps/v1 Deployments. Apparently it's also required in apps/v1beta2, per the OP's question.
If template.metadata has 2 labels defined, then should we define all labels in the selector? spec: replicas: 1 template: metadata: labels: app: foo-bar elastic-instance: foo

Deployments are dynamic objects; for example, your system may need to scale up and add more Pods. The template section only defines the Pods that this Deployment creates when you do kubectl apply, while the selector section ensures that the Pods newly created by scaling up are still managed by the already existing Deployment.

Generally speaking, the Deployment continuously watches all the Pods and checks, via the selector section, whether there are any Pods it should control.
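As a sketch of what that grouping looks like from the Pod side (all names and the UID below are placeholders), a Pod created through this Deployment's ReplicaSet carries the selector label plus an ownerReferences entry pointing back at the ReplicaSet:

apiVersion: v1
kind: Pod
metadata:
  name: grpc-deployment-5d9c7b8f4-abcde       # placeholder generated name
  labels:
    app: grpc-test                            # what the selector matches on
    pod-template-hash: 5d9c7b8f4              # added automatically by the Deployment controller
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: grpc-deployment-5d9c7b8f4           # placeholder ReplicaSet name
    uid: 00000000-0000-0000-0000-000000000000 # placeholder UID
    controller: true
spec:
  containers:
  ...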

1 Comment

That sounds a bit odd; if that were the case then the deployment could just look for pods with the specific deployment ID in their name (or add the ID as a label by itself). So it wouldn't need the developer to create a label and mention that exact label just a few lines further down.
