adding docs for node allocatable #2649
Signed-off-by: Vishnu kannan <vishnuk@google.com>
| ### Kube Reserved | ||
| **Kubelet Flag**: `--kube-reserved=[cpu=100m][,][memory=100Mi]` |
| **Kubelet Flag**: `--kube-reserved-cgroup=/runtime.slice` |
is this the name you are using in your images?
we should make clear that /runtime.slice is not the kubelet default value.
Good point on the defaults. I hope I have made the defaults clear this time around. PTAL
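For readers unfamiliar with the reservation syntax in the flags above, here is a minimal sketch of how a value such as `cpu=100m,memory=100Mi` can be read: CPU in millicores, memory in bytes. The helper names are hypothetical and this is not the kubelet's own parser; it handles only the suffixes listed in the code.

```python
# Hypothetical helpers (not kubelet code): interpret a reservation string of the
# form passed to --kube-reserved / --system-reserved, e.g. "cpu=100m,memory=100Mi".
# Assumption: only the binary memory suffixes below are supported in this sketch.
MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu_millicores(value: str) -> int:
    """Convert '100m' (millicores) or '2' (whole cores) to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(float(value) * 1000)


def parse_memory_bytes(value: str) -> int:
    """Convert '100Mi' style quantities, or a plain byte count, to bytes."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def parse_reservation(spec: str) -> dict:
    """Parse 'cpu=100m,memory=100Mi' into {'cpu': millicores, 'memory': bytes}."""
    parsers = {"cpu": parse_cpu_millicores, "memory": parse_memory_bytes}
    result = {}
    for item in filter(None, spec.split(",")):
        name, _, quantity = item.partition("=")
        if name not in parsers:
            raise ValueError(f"unsupported resource: {name!r}")
        result[name] = parsers[name](quantity)
    return result


if __name__ == "__main__":
    print(parse_reservation("cpu=100m,memory=100Mi"))
```

Note that `100m` means 100 millicores (a tenth of a core), while `100Mi` means 100 mebibytes; the two suffixes are easy to confuse.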
| [This performance dashboard](http://node-perf-dash.k8s.io/#/builds) exposes `cpu` and `memory` usage profiles of `kubelet` and `docker engine` at multiple levels of pod density. | ||
| [This blog post](http://blog.kubernetes.io/2016/11/visualize-kubelet-performance-with-node-dashboard.html) explains how the dashboard can be interpreted to come up with a suitable `kube-reserved` reservation. | ||
| It is recommended that the kubernetes system daemons are placed under a top level control group (`system.slice` on systemd machines for example). |
i think this text should be in system reserved section.
you should have text specific to kube daemons here..
| ### System Reserved | ||
| **Kubelet Flag**: `--system-reserved=[cpu=100m][,][memory=100Mi]` | ||
| **Kubelet Flag**: `--system-reserved-cgroup=/system.slice` |
make clear this flag has no default.
we need to make clear that the kubelet doesn't create either of these two cgroups.
| Evictions are supported for `memory` and `storage` only. | ||
| By reserving some memory via the `--eviction-hard` flag, the `kubelet` attempts to `evict` pods whenever memory availability on the node drops below the reserved value. | ||
| Hypothetically, even if system daemons did not exist on a node, pods could not use more than `capacity - eviction-hard`. | ||
| For this reason, resources reserved for evictions will not be available for pods. |
Scheduling is meant to be implicit since pods can be placed directly on nodes bypassing the scheduler
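To make the eviction hunk above concrete, here is a minimal sketch of the arithmetic. It assumes `Allocatable` is node capacity minus `kube-reserved`, `system-reserved`, and the `--eviction-hard` threshold, which is how the hunks in this PR read together. The function name and all numbers are hypothetical, chosen only for illustration.

```python
# Hypothetical illustration of the reservation arithmetic, in MiB of memory.
# Assumption: Allocatable = capacity - kube-reserved - system-reserved - eviction-hard.


def node_allocatable_mib(capacity, kube_reserved=0, system_reserved=0, eviction_hard=0):
    """Return the memory (MiB) left for pods after all reservations."""
    allocatable = capacity - kube_reserved - system_reserved - eviction_hard
    if allocatable < 0:
        raise ValueError("reservations exceed node capacity")
    return allocatable


if __name__ == "__main__":
    # With no daemons reserved, pods are still capped at capacity - eviction-hard.
    print(node_allocatable_mib(8192, eviction_hard=100))  # 8092
    # With all three reservations in place.
    print(node_allocatable_mib(8192, 1024, 512, 100))  # 6556
```

The first call mirrors the "hypothetically, no system daemons" case from the hunk: the eviction threshold alone already reduces what pods can use.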
| **Kubelet Flag**: `--enforce-node-allocatable=[pods][,][system-reserved][,][kube-reserved]` | ||
| The scheduler will treat `Allocatable` as the available `capacity` for pods. |
i would remove the use of will style phrasing in the document as we are describing the present in this doc.
The scheduler treats 'Allocatable'...
| `kubelet` will enforce `Allocatable` across pods by default. | ||
| This enforcement is controlled by specifying the `pods` value for the kubelet flag `--enforce-node-allocatable`. |
note that this is the default value.
| However, Kubelet cannot burst and use up all available Node resources if `kube-reserved` is enforced. | ||
| Be extra careful while enforcing `system-reserved` reservation since it can lead to critical system services being CPU starved or OOM killed on the node. | ||
| The recommendation is to enforce `system-reserved` only if a user has profiled their nodes exhaustively to come up with precise estimates. |
and is confident in their ability to recover if any item in that group is oom_killed.
| * To begin with, enforce `Allocatable` on `pods`. | ||
| * Once adequate monitoring and alerting are in place to track kube system daemons, attempt to enforce `kube-reserved` based on usage heuristics. | ||
| * If absolutely necessary, enforce `system-reserved` over time. |
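The staged rollout in the list above can be sketched as a mapping from stage to the value of `--enforce-node-allocatable`. The stage names and the helper are hypothetical; only the flag name and its `pods`, `kube-reserved`, and `system-reserved` values come from this PR.

```python
# Hypothetical rollout stages mirroring the three bullets above.
# Each stage adds one more enforcement target to the previous one.
ROLLOUT_STAGES = {
    "pods-only": ["pods"],
    "kube-reserved": ["pods", "kube-reserved"],
    "system-reserved": ["pods", "kube-reserved", "system-reserved"],
}


def enforce_flag(stage: str) -> str:
    """Build the --enforce-node-allocatable flag for a given rollout stage."""
    if stage not in ROLLOUT_STAGES:
        raise ValueError(f"unknown rollout stage: {stage!r}")
    return "--enforce-node-allocatable=" + ",".join(ROLLOUT_STAGES[stage])


if __name__ == "__main__":
    for stage in ROLLOUT_STAGES:
        print(stage, "->", enforce_flag(stage))
```

Moving to a later stage only widens the enforcement list, so a node can be stepped back to `pods-only` if enforcing a reservation starves a daemon.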
vishh left a comment
@derekwaynecarr PTAL
94c3ef3 to a617a42
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@kubernetes/sig-docs-maintainers this PR is meant for v1.6. Can I get a docs review?
derekwaynecarr left a comment
one typo, and maybe some more clarifying text.
while not needed on this pr, i feel like this should be linked to from somewhere centrally on how to administer a kubernetes node. maybe a doc team member can assist there.
| Memory pressure at the node level leads to System OOMs, which affect the entire node and all pods running on it. | ||
| Nodes can go offline temporarily until memory has been reclaimed. | ||
| To avoid (or reduce the probability of) system OOMs, the kubelet provides [`Out of Resource`](./out-of-resource.md) management. |
| The scheduler treats `Allocatable` as the available `capacity` for pods. | ||
| `kubelet` enforces `Allocatable` across pods by default. | ||
| This enforcement is controlled by specifying the `pods` value for the kubelet flag `--enforce-node-allocatable`. |
maybe explain what enforcement means? for example, by enforcing at this level, we ensure pods cannot consume more memory and cpu time than allocated?
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@derekwaynecarr PTAL
/lgtm
User facing documentation for https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md#node-allocatable-resources
cc @derekwaynecarr @dashpole @dchen1107