Summary
We have to handle resources both at the scheduler level and at the runtime level.
Swarm provided a single set of resources (CPU, Memory) which was used both for scheduling and constraining.
Several users were unhappy with this, since specifying a resource limit also resulted in a reservation. For instance, if the user wanted to limit a container to 1GB of RAM (-m 1g), the scheduler assumed that gigabyte was reserved and would "take it away" from the machine.
In SwarmKit, we should let users specify the two independently:
- Reservation: Amount of resources reserved for this task on the node. A node should not go beyond its total capacity, as in, the scheduler should only choose a node if:
  `sumOfAllTasksReservations(node) + task.Resources.Reserved < node.Resources`
- Limits: Limit the amount (quota) of resources the task can use. This is effectively the `-m` and `-c` flags of the engine.
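
To make the reservation rule concrete, here is a minimal Go sketch of the check the scheduler could run per node. The `Resources` and `Node` types, their field names, and the units are assumptions made up for illustration; they are not SwarmKit's actual API. The comparison keeps the strict `<` written above.

```go
package main

// Resources is a hypothetical resource vector. Field names and units are
// assumptions for this sketch, not SwarmKit types.
type Resources struct {
	NanoCPUs    int64 // CPU in billionths of a core
	MemoryBytes int64 // memory in bytes
}

// Node is a hypothetical scheduler-side view of a node.
type Node struct {
	Capacity Resources   // total resources of the node
	Reserved []Resources // reservations of tasks already placed on the node
}

// sumOfAllTasksReservations adds up the reservations already on the node.
func sumOfAllTasksReservations(n Node) Resources {
	var total Resources
	for _, r := range n.Reserved {
		total.NanoCPUs += r.NanoCPUs
		total.MemoryBytes += r.MemoryBytes
	}
	return total
}

// fits reports whether the node can take a task with the given reservation.
// Only reservations are counted; limits play no part in this check.
func fits(n Node, reserved Resources) bool {
	used := sumOfAllTasksReservations(n)
	return used.NanoCPUs+reserved.NanoCPUs < n.Capacity.NanoCPUs &&
		used.MemoryBytes+reserved.MemoryBytes < n.Capacity.MemoryBytes
}
```

With this split, a task that sets a 1GB limit but no reservation adds nothing to the sum, so the scheduler no longer "takes away" that gigabyte from the node.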
CPU:
Swarm was using --cpu-shares, which is very approximate since shares are only relative weights. Docker later introduced --cpu-period and --cpu-quota, which use CFS (Completely Fair Scheduler). We should leverage that.
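
As a rough sketch of how a CPU limit could map onto those flags, the Go snippet below turns a limit expressed in billionths of a core into a CFS quota for a given period. The `cfsQuota` helper and the nano-CPU unit are assumptions for illustration; the 100000 microsecond period is the usual CFS default.

```go
package main

// defaultCFSPeriodUs is the usual CFS period of 100ms, in microseconds.
const defaultCFSPeriodUs int64 = 100000

// cfsQuota is a hypothetical helper: it converts a CPU limit given in
// billionths of a core into the quota (in microseconds) that the task may
// run during each period. The result is rounded down.
func cfsQuota(nanoCPUs, periodUs int64) int64 {
	return nanoCPUs * periodUs / 1000000000
}
```

For example, a limit of 1.5 cores (1500000000 nano-CPUs) with the default period gives a quota of 150000, which would correspond to --cpu-period=100000 --cpu-quota=150000.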
Memory:
Reservation is actually trickier than just scheduling.
Tasks can have higher limits than their reservation. There might also be tasks on the same node that have neither reservations nor limits. We might want to let tasks "burst" beyond their limits if no one else is using the resources.
Network:
@mrjana Is there anything we can do with network resources?
Disk:
Not sure if this is tied to volumes rather than containers.
/cc @aaronlehmann @LK4D4 @tonistiigi @icecrime
This is a combination of scheduling and low-level system work, so we're going to need the help of engine to design it.