Modify Eviction Strategy to take Priority into account #946
derekwaynecarr merged 2 commits into kubernetes:master from dashpole:priority_eviction
It will target pods whose usage of the starved resource exceeds its requests.
Of those pods, it will rank by a function of priority, and usage - requests.
If system daemons are exceeding their allocation (see [Strategy Caveat](strategy-caveat) below),
Other than system daemons, there are critical system pods whose eviction may break the functionality of the node or even the whole cluster. In the future (next couple of months) we will set a priority class for critical system pods. There are two priority classes for system pods: system-cluster-critical and system-node-critical. The former should be set for critical system pods that should be present in every cluster but don't need to run on every node. The latter should be set for critical system pods that should be present on every node, e.g., kube-proxy. Pods with the system-node-critical priority class should be treated like system daemons and, as far as possible, should not be evicted.
LGTM. FYI @sjenning
Previously, I proposed several possible methods for introducing priority in #846.
We came to agreement on using the strategy:
Only evict pods where usage > requests. Then sort by Function(priority, usage - requests)
This solution provides users with a clear path to avoiding evictions: keep usage at or below requests. It prevents abuse by high-priority pods by not allowing them to disrupt other pods that are below or near their requests. Using a function provides a more nuanced approach to allowing pods to consume "unused" (not requested) memory on the node. Power users or cluster administrators can determine how unused memory is allocated to pods by choosing priority levels: levels that are closer together give more equal sharing of extra memory, while levels that are further apart give better availability to higher-priority pods. For pods that have equal priority, the function is equivalent to usage - requests, so clusters that do not have priority enabled keep behavior that is similar to (though not exactly the same as) today's behavior.
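To make the ordering concrete, here is a minimal Go sketch of the two steps: filter to pods whose usage exceeds requests, then sort by a function of priority and usage - requests. The `Pod` struct, `score`, and `evictionOrder` are hypothetical names for illustration, not kubelet code, and the scoring formula (overage divided by priority + 1, with priority assumed non-negative) is only an assumed example of such a function; this PR does not specify the actual one.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod holds only the fields this sketch needs; it is NOT the Kubernetes API type.
type Pod struct {
	Name     string
	Priority int64 // assumed non-negative for this illustration
	Usage    int64 // usage of the starved resource
	Request  int64 // request for the starved resource
}

// score is a HYPOTHETICAL stand-in for Function(priority, usage - requests).
// A higher score means the pod is evicted sooner. For equal priorities the
// ordering reduces to ordering by usage - requests.
func score(p Pod) float64 {
	return float64(p.Usage-p.Request) / float64(p.Priority+1)
}

// evictionOrder keeps only pods whose usage exceeds their requests and
// returns them sorted with the most evictable pod first.
func evictionOrder(pods []Pod) []Pod {
	var candidates []Pod
	for _, p := range pods {
		if p.Usage > p.Request {
			candidates = append(candidates, p)
		}
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return score(candidates[i]) > score(candidates[j])
	})
	return candidates
}

func main() {
	pods := []Pod{
		{Name: "within-requests", Priority: 0, Usage: 100, Request: 200},
		{Name: "low-priority", Priority: 0, Usage: 300, Request: 200},
		{Name: "high-priority", Priority: 9, Usage: 600, Request: 200},
	}
	for _, p := range evictionOrder(pods) {
		fmt.Println(p.Name)
	}
}
```

In this sketch the pod below its requests is never a candidate, and the high-priority pod is ranked after the low-priority one even though its overage is larger. Moving the two priority values closer together would let the raw overage dominate the ordering instead.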
This PR changes the eviction documentation to include this change in the eviction strategy, and removes outdated information on inode eviction.
@kubernetes/sig-node-proposals
@derekwaynecarr @vishh @dchen1107 @sjenning
@davidopp @bsalamat