
Kubelet to understand pods, and to be able to pull from apiserver #2483

@erictune


Several inter-related goals:

  1. Kubelet continues to be usable in isolation, as well as in a Kubernetes cluster. (composability)
  2. Config defined in terms of Pods should work both with an apiserver and directly with Kubelets (bootstrapping, debugging, composability)
  3. Kubelets can see only the pods that they need to see (security)
  4. Don't create more apiserver API objects than necessary (ease of understanding)
  5. Node etcd need not be connected to master etcd. (security, isolation, scalability)
  6. Not have clients (be they kubectl, or kubelets or whatever) depend on the storage layer of the apiserver. (abstraction, allowing multiple implementations of the API)

Current state:

  1. Kubelet can read a container manifest from a local file or from a URL. Used by Google ContainerVM.
  2. Kubelets can read boundPods from their local etcd, which is connected to the master machine's etcd.
  3. Kubelets can talk to the apiserver, and do write to /events. They don't read /pods or /boundPods.

Suggested changes:

  1. Kubelet learns the definition of the Pod type.
  2. Kubelet can read JSON pod definitions from local files or URLs.
    • It also still supports containerManifests for the foreseeable future, with some way for it to determine which type to expect from a source.
  3. In a "typical" Kubernetes cluster, a Kubelet watches /api/v1beta3/pods for what pods it should run.
  4. Get rid of BoundPod and BoundPods since nothing reads them anymore.
  5. Add a Host field to the PodStatus (but not PodSpec).
  6. When the scheduler writes a Binding, PodStatus.Host is set and the resourceVersion is updated.
  7. A cluster is bootstrapped by first starting one or more VMs with,
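Change 3 above could look roughly like the sketch below: a long-poll watch on the pods resource, filtered to the node's own host. The URL shape and the "fields" selector syntax are assumptions about the eventual v1beta3 API, not a spec:

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"net/url"
)

// watchPods opens a watch on pods bound to this host and handles each
// streamed event line. Endpoint and selector syntax are hypothetical.
func watchPods(apiserver, hostname string) error {
	q := url.Values{}
	q.Set("watch", "true")
	q.Set("fields", "Status.Host="+hostname) // assumed field-selector form
	resp, err := http.Get(apiserver + "/api/v1beta3/pods?" + q.Encode())
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		// Each line would be a JSON watch event; decode it and reconcile
		// local container state against the desired PodSpec here.
		fmt.Println("event:", scanner.Text())
	}
	return scanner.Err()
}

func main() {
	// Hypothetical apiserver address; this fails unless one is running.
	fmt.Println(watchPods("http://localhost:8080", "node-1"))
}
```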

Concerns that may be raised and responses:

Q1: Are the changes the scheduler makes to the set of bound pods atomic or eventually consistent?

A1: It could work either way. If we want atomic behavior, we could implement that in the apiserver more readily than we could when we directly expose our storage via etcd.

Q2: Should the kubelet be allowed to see CurrentState (now PodStatus in v1beta3)? It generates (some of) the status, so why let it see that?

A2: We could implement this if it is important. Kubelet would watch pods with a selector that matches only pods with PodStatus.Host equal to the kubelet's hostname, and could use a field selector so that only the PodSpec, and not the PodStatus, is returned.
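Server-side, the filtering A2 describes could look like the sketch below, with simplified stand-in types rather than the real API objects:

```go
package main

import "fmt"

// Simplified stand-ins for the real API types.
type PodSpec struct{ Containers []string }
type PodStatus struct{ Host string }
type Pod struct {
	Name   string
	Spec   PodSpec
	Status PodStatus
}

// podsForHost selects pods whose Status.Host matches the caller's host and
// clears Status, mimicking a field selector that projects away PodStatus.
func podsForHost(all []Pod, host string) []Pod {
	var out []Pod
	for _, p := range all {
		if p.Status.Host == host {
			p.Status = PodStatus{} // strip status before returning
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []Pod{
		{Name: "a", Status: PodStatus{Host: "node-1"}},
		{Name: "b", Status: PodStatus{Host: "node-2"}},
	}
	fmt.Println(podsForHost(pods, "node-1"))
}
```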

Q3: How do we prevent Kubelets from seeing other nodes' pods?
A3: There are a couple of ways I can think to do this with small changes to our current authorization policy.

  1. One way is to have a distinguished "kubelet" user and special case its authorization.
  2. Another is to create a new policy as each "kubelet" is added which matches on the SourceIP of the request, and requires a selector to be part of the request which selects on "PodStatus.Host=$SourceIP". This may make some assumptions about the network security of the cluster, but seems like it could work.
  3. A variation on the previous is to only have one line of policy for all kubelets, but have a "condition" field in the policy that checks that PodStatus.Host matches SourceIP.
  4. Another variation is to have a different token for each kubelet and a separate policy for each.
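Option 3's "condition" check could be sketched as below. The function name and selector format are illustrative, not an existing authorizer API:

```go
package main

import (
	"fmt"
	"strings"
)

// allowKubeletRead permits a pod read/watch only when the request carries a
// selector pinning PodStatus.Host to the caller's own source IP.
func allowKubeletRead(selector, sourceIP string) bool {
	return strings.Contains(selector, "PodStatus.Host="+sourceIP)
}

func main() {
	fmt.Println(allowKubeletRead("PodStatus.Host=10.0.0.5", "10.0.0.5")) // allowed
	fmt.Println(allowKubeletRead("PodStatus.Host=10.0.0.9", "10.0.0.5")) // denied
}
```

As the text notes, tying authorization to SourceIP assumes the cluster network prevents IP spoofing; the per-kubelet token variation in option 4 avoids that assumption.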
