Implement dedicated nodes using taints and tolerations #17190

@davidopp

This is really a meta-feature; it can be built from other features that we already have or plan to have.

The requirements are

  • nodes are partitioned into groups
  • administrator can specify a policy that forces pods meeting particular criteria to only run on machines in one of these groups
  • optionally, the policy can say that pods that do not meet the criteria cannot run on those machines

One possible implementation that meets these requirements is

  • each node optionally has a label with key 'dedicated' and some value denoting a group name
  • an admission controller has a table mapping namespace name to label value, and adds the corresponding <"dedicated", value> node selector to a pod if its namespace is in the table (ideally this is done in a way that the end-user cannot later modify, but I don't think we have that ability yet)
  • not sure yet how to implement the third bullet from the requirements; #14573 ("Restrict/prefer what pods schedule on particular nodes") has some discussion.
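To make the admission-controller idea above concrete, here is a minimal Go sketch. It is not the real Kubernetes admission plugin interface: the `Pod` struct, the `admitDedicated` function, and the `team-a`/`group-a` table entries are all hypothetical stand-ins, and only the `dedicated` key comes from the proposal.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for the real Kubernetes Pod type.
type Pod struct {
	Name         string
	Namespace    string
	NodeSelector map[string]string
}

// dedicatedKey is the node label key proposed in this issue.
const dedicatedKey = "dedicated"

// admitDedicated adds the <"dedicated", group> node selector to the pod when
// its namespace appears in the policy table (namespace name -> label value).
// It reports whether the pod was mutated.
func admitDedicated(policy map[string]string, pod *Pod) bool {
	group, ok := policy[pod.Namespace]
	if !ok {
		return false // namespace not in the table: leave the pod alone
	}
	if pod.NodeSelector == nil {
		pod.NodeSelector = map[string]string{}
	}
	// Overwrite any user-supplied value so the policy always wins at admission.
	pod.NodeSelector[dedicatedKey] = group
	return true
}

func main() {
	policy := map[string]string{"team-a": "group-a"} // hypothetical table
	pod := &Pod{Name: "web", Namespace: "team-a"}
	admitDedicated(policy, pod)
	fmt.Println(pod.NodeSelector[dedicatedKey])
}
```

This sketch only covers the first two requirements. It does not stop the end-user from editing the selector later, and it does nothing to keep other pods off the dedicated nodes.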

(I guess this could be done with annotations instead of labels.)

The user who requested this feature also requested the following: kube-proxy on a node in the dedicated machine group belonging to namespace A should not know about any services outside namespace A (except system services, of course). This only makes sense if the policy in the admission controller assigns pods to dedicated machine groups based on the pod's namespace.
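The kube-proxy request amounts to a per-node filter over the service list. A rough Go sketch of that filter follows; the `Service` struct, the `visibleServices` function, and the choice of `kube-system` as the "system" namespace are assumptions for illustration, not existing kube-proxy behavior.

```go
package main

import "fmt"

// Service is a simplified, hypothetical stand-in for a Kubernetes Service.
type Service struct {
	Name      string
	Namespace string
}

// visibleServices returns the services that a kube-proxy on a node dedicated
// to ownerNamespace would be told about: services in ownerNamespace plus
// services in any namespace marked as a system namespace.
func visibleServices(all []Service, ownerNamespace string, systemNamespaces map[string]bool) []Service {
	var out []Service
	for _, s := range all {
		if s.Namespace == ownerNamespace || systemNamespaces[s.Namespace] {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	all := []Service{
		{Name: "dns", Namespace: "kube-system"},
		{Name: "api", Namespace: "a"},
		{Name: "billing", Namespace: "b"},
	}
	system := map[string]bool{"kube-system": true} // assumption: what counts as "system"
	for _, s := range visibleServices(all, "a", system) {
		fmt.Println(s.Namespace + "/" + s.Name)
	}
}
```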

This is closely related to the discussion in #14573, but here I'm trying to capture the exact feature that a user requested in person recently.

There is of course a "preferred" variant of this that acts as a preference rather than a hard constraint.
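The difference between the two variants can be sketched in Go as filtering versus scoring. The `Node` struct and both functions are hypothetical and do not correspond to actual scheduler code: the hard variant removes non-matching nodes from consideration, while the preferred variant only ranks matching nodes higher.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a Kubernetes Node.
type Node struct {
	Name   string
	Labels map[string]string
}

// filterHard implements the hard constraint: only nodes labeled
// dedicated=<group> remain candidates.
func filterHard(nodes []Node, group string) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Labels["dedicated"] == group {
			out = append(out, n)
		}
	}
	return out
}

// scorePreferred implements the preference: every node stays a candidate,
// but nodes in the group score 1 and all others score 0.
func scorePreferred(n Node, group string) int {
	if n.Labels["dedicated"] == group {
		return 1
	}
	return 0
}

func main() {
	nodes := []Node{
		{Name: "n1", Labels: map[string]string{"dedicated": "group-a"}},
		{Name: "n2", Labels: map[string]string{}},
	}
	fmt.Println(len(filterHard(nodes, "group-a")))
	for _, n := range nodes {
		fmt.Println(n.Name, scorePreferred(n, "group-a"))
	}
}
```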


Labels: area/admin, kind/feature, lifecycle/frozen, priority/backlog, sig/scheduling
