Description
(This is a task description for the Centaurus Summer Camp 2021 event)
Task:
Improve the Arktos scheduler with any optimizations you can think of, whether through new algorithms or by refactoring the existing implementation. The target is to improve scheduler throughput, measured in pods/second.
Background/Motivation:
Arktos evolved from Kubernetes, with core design changes in areas such as VM orchestration, scalability, and multi-tenancy. So far, however, we have not made many algorithmic improvements to scheduler throughput.
Scheduler throughput is a key metric in a large-scale cluster management system. It determines how fast users can deploy or scale workloads. During our 30,000-node scalability test, a single scheduler instance achieved a scheduling throughput of 100 pods/second (50th percentile) and 200 pods/second (99th percentile). We also observed that scheduler throughput decreases as the cluster grows.
We'd like to optimize scheduling throughput while maintaining the same latency level and scheduling quality.
Deliverable:
- An optimized scheduler component, with the necessary documentation of your changes and the commits on your private repo/branch.
Requirements:
- Preferably, limit the changes to the scheduler component only, so that we can easily integrate and test them.
- The scheduling latency should be the same or lower, as measured by SchedulingLatency.
- The scheduling quality should be the same or better, as measured by the number of scored nodes.
Advisor(s): @XiaoningDing (Xiaoning Ding)
Resource Links:
- Scheduler process entrance: https://github.com/CentaurusInfra/arktos/tree/master/cmd/kube-scheduler
- Scheduler main logic: https://github.com/CentaurusInfra/arktos/tree/master/pkg/scheduler
- Some potential ideas for your reference:
  - Change serial pod scheduling to concurrent scheduling? You would need to handle the affinity/anti-affinity logic correctly, as well as scheduling conflicts.
  - Tune the number of workers used for filtering and scoring based on the number of CPU cores?
  - Optimize the scheduler cache and algorithms to remove operations that are O(N) in cluster size?
  - Use finer-grained event handling to reduce the movement of pods between scheduling queues.
  - There are further ideas, such as graph-based scheduling, but I'm afraid it would take a lot of work to re-implement all current scheduler features, and it would require changing the basic architecture of the current scheduler.