adding docs for node allocatable #2649

Merged
chenopis merged 3 commits into kubernetes:release-1.6 from vishh:node-allocatable
Mar 15, 2017

vishh commented Mar 1, 2017

Contributor

Signed-off-by: Vishnu kannan <vishnuk@google.com>
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 1, 2017
@vishh vishh added this to the 1.6 milestone Mar 1, 2017
@chenopis chenopis changed the base branch from master to release-1.6 March 2, 2017 17:50
@chenopis chenopis requested a review from derekwaynecarr March 3, 2017 08:23
Comment thread docs/admin/node-allocatable.md Outdated

### Kube Reserved

**Kubelet Flag**: `--kube-reserved=[cpu=100mi][,][memory=100Mi]`

Member

100m,100Mi
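
To make the correction above concrete: CPU is written in millicores (`100m`) and memory with a binary suffix (`100Mi`), so `cpu=100mi` is not a valid quantity. The following is a minimal sketch of a parser for a reservation string such as `cpu=100m,memory=100Mi`. The helper name `parse_reserved` and the set of supported suffixes are assumptions made for illustration; this is not the kubelet's own parsing code.

```python
import re

# Hypothetical helper, not kubelet code. It handles only the quantity
# forms that appear in this thread (millicores and Ki/Mi/Gi suffixes).
_MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_reserved(value):
    """Parse 'cpu=100m,memory=100Mi' into millicores and bytes."""
    result = {}
    for pair in value.split(","):
        name, _, quantity = pair.partition("=")
        name = name.strip()
        quantity = quantity.strip()
        if name == "cpu":
            # "100m" means 100 millicores; a bare number means whole cores.
            if quantity.endswith("m"):
                result["cpu_millicores"] = int(quantity[:-1])
            else:
                # float() raises ValueError for input such as "100mi".
                result["cpu_millicores"] = int(float(quantity) * 1000)
        elif name == "memory":
            match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)?", quantity)
            if match is None:
                raise ValueError(f"unsupported memory quantity: {quantity!r}")
            number, unit = match.groups()
            # A missing suffix is treated as plain bytes.
            result["memory_bytes"] = int(number) * _MEMORY_UNITS.get(unit, 1)
        else:
            raise ValueError(f"unknown resource: {name!r}")
    return result
```

With this sketch, `parse_reserved("cpu=100m,memory=100Mi")` yields 100 millicores and 100 MiB expressed in bytes, while the draft's `cpu=100mi` is rejected with a `ValueError`.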

Comment thread docs/admin/node-allocatable.md Outdated
### Kube Reserved

**Kubelet Flag**: `--kube-reserved=[cpu=100mi][,][memory=100Mi]`
**Kubelet Flag**: `--kube-reserved-cgroup=`/runtime.slice`

Member

is this the name you are using in your images?

Member

we should make clear that /runtime.slice is not the kubelet default value.

Contributor Author

Good point on the defaults. I hope I have made the defaults clear this time around. PTAL

Comment thread docs/admin/node-allocatable.md Outdated
[This performance dashboard](http://node-perf-dash.k8s.io/#/builds) exposes `cpu` and `memory` usage profiles of `kubelet` and `docker engine` at multiple levels of pod density.
[This blog post](http://blog.kubernetes.io/2016/11/visualize-kubelet-performance-with-node-dashboard.html) explains how the dashboard can be interpreted to come up with a suitable `kube-reserved` reservation.

It is recommended that the kubernetes system daemons are placed under a top level control group (`system.slice` on systemd machines for example).

Member

i think this text should be in the system reserved section.

you should have text specific to kube daemons here.

Comment thread docs/admin/node-allocatable.md Outdated
### System Reserved

**Kubelet Flag**: `--system-reserved=[cpu=100mi][,][memory=100Mi]`
**Kubelet Flag**: `--system-reserved-cgroup=`/system.slice`

Member

make clear this flag has no default.

Member

we need to make clear that the kubelet doesn't create either of these two cgroups.

Comment thread docs/admin/node-allocatable.md Outdated
Evictions are supported for `memory` and `storage` only.
By reserving some memory via `--eviction-hard` flag, the `kubelet` attempts to `evict` pods whenever memory availability on the node drops below the reserved value.
Hypothetically, if system daemons did not exist on a node, pods cannot use more than `capacity - eviction-hard`.
For this reason, resources reserved for evictions will not be available for pods.

Member

to schedule against?

Contributor Author

Scheduling is meant to be implicit, since pods can be placed directly on nodes, bypassing the scheduler.
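
The arithmetic behind this thread can be sketched as follows. The draft states that, with no system daemons, pods cannot use more than `capacity - eviction-hard`; subtracting the two reservations as well gives the amount left for pods. The function name `allocatable`, the sample numbers, and the clamp at zero are assumptions for illustration, not values or code taken from the PR.

```python
def allocatable(capacity, kube_reserved=0, system_reserved=0, eviction_hard=0):
    """Return the amount of one resource left for pods.

    All arguments must use the same unit (for example MiB of memory).
    """
    remaining = capacity - kube_reserved - system_reserved - eviction_hard
    # Assumption of this sketch: never report a negative amount.
    return max(remaining, 0)


# Hypothetical node: 32768 MiB of memory, 100 MiB held back for evictions.
no_daemons = allocatable(32768, eviction_hard=100)
with_reservations = allocatable(
    32768, kube_reserved=1024, system_reserved=512, eviction_hard=100
)
```

In the hypothetical no-daemon case the result is simply capacity minus the eviction threshold, which matches the sentence quoted from the draft. The memory held back for evictions is unavailable to pods whether they arrive through the scheduler or are placed on the node directly.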

Comment thread docs/admin/node-allocatable.md Outdated

**Kubelet Flag**: `--enforce-node-allocatable=[pods][,][system-reserved][,][kube-reserved]`

The scheduler will treat `Allocatable` as the available `capacity` for pods.

Member

i would remove the "will"-style phrasing from the document, since we are describing the present in this doc.

The scheduler treats 'Allocatable'...

Contributor Author

ack

@derekwaynecarr derekwaynecarr left a comment

Member

comments throughout.

The scheduler will treat `Allocatable` as the available `capacity` for pods.

`kubelet` will enforce `Allocatable` across pods by default.
This enforcement is controlled by specifying `pods` value to the kubelet flag `--enforce-node-allocatable`.

Member

note that this is the default value.

Contributor Author

Ack
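
The flag value discussed in this thread is a comma-separated list. Below is a minimal sketch of how such a value could be validated, assuming only the three targets named in the draft (`pods`, `system-reserved`, `kube-reserved`) and `pods` as the default. `parse_enforce` and `KNOWN_TARGETS` are hypothetical names, not kubelet code.

```python
# Targets listed in the draft's --enforce-node-allocatable description.
KNOWN_TARGETS = ("pods", "system-reserved", "kube-reserved")


def parse_enforce(value="pods"):
    """Split a comma-separated enforcement list and reject unknown targets."""
    targets = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [item for item in targets if item not in KNOWN_TARGETS]
    if unknown:
        raise ValueError(f"unknown enforcement target(s): {unknown}")
    return targets
```

Calling `parse_enforce()` with no argument returns only `pods`, mirroring the default the reviewer asked to have noted.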

Comment thread docs/admin/node-allocatable.md Outdated
However, Kubelet cannot burst and use up all available Node resources if `kube-reserved` is enforced.

Be extra careful while enforcing `system-reserved` reservation since it can lead to critical system services being CPU starved or OOM killed on the node.
The recommendation is to enforce `system-reserved` only if a user has profiled their nodes exhaustively to come up with precise estimates.

Member

and is confident in their ability to recover if any item in that group is oom_killed.

Comment thread docs/admin/node-allocatable.md Outdated

* To begin with enforce `Allocatable` on `pods`.
* Once adequate monitoring and alerting is in place to track kube system daemons, attempt to enforce `kube-reserved` based on usage heuristics.
* If aboslutely necessary, enforce `system-reserved` over time.

Member

typo on absolutely

@vishh vishh left a comment

Contributor Author

@vishh vishh force-pushed the node-allocatable branch 2 times, most recently from 94c3ef3 to a617a42 Compare March 9, 2017 19:44
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@vishh vishh force-pushed the node-allocatable branch from a617a42 to bcd5e12 Compare March 9, 2017 19:45

vishh commented Mar 9, 2017

Contributor Author

@kubernetes/sig-docs-maintainers this PR is meant for v1.6. Can I get a docs review?

@derekwaynecarr derekwaynecarr left a comment

Member

one typo, and maybe some more clarifying text.

while not needed on this pr, i feel like this should be linked to from somewhere centrally on how to administer a kubernetes node. maybe a doc team member can assist there.

Comment thread docs/admin/node-allocatable.md Outdated

Memory pressure at the node level leads to System OOMs which affects the entire node and all pods running on it.
Nodes can go offline temporarily until memory has been reclaimed.
To avoid (or reduce the probabilty) system OOMs kubelet provides [`Out of Resource`](./out-of-resource.md) management.

Member

typo: probability

The scheduler treats `Allocatable` as the available `capacity` for pods.

`kubelet` enforce `Allocatable` across pods by default.
This enforcement is controlled by specifying `pods` value to the kubelet flag `--enforce-node-allocatable`.

Member

maybe explain what enforcement means? for example, by enforcing at this level, we ensure pods cannot consume more memory and cpu time than allocated?

Signed-off-by: Vishnu kannan <vishnuk@google.com>

vishh commented Mar 14, 2017

Contributor Author

@derekwaynecarr PTAL

@derekwaynecarr

Member

/lgtm

@chenopis chenopis merged commit d4383a4 into kubernetes:release-1.6 Mar 15, 2017
