Conversation
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy.
cc @hurf
Force-pushed from 51f3a1b to 3146d54.
We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm.
Force-pushed from 3146d54 to 83a20fa.
Force-pushed from 83a20fa to 05f9ca9.
CLAs look good, thanks!
Sorry for the delay in responding. I think you are addressing a good problem here.

Our thought for how to address this is something more like what Borg does, where each pod has a priority and the scheduler processes pods in priority order. In Borg, the scheduler queue priority is the same priority used to determine which pods can preempt (evict) other pods in order to get scheduled. This matters because if you don't process pods from highest to lowest preemption priority, you may schedule a low-preemption-priority pod and then immediately preempt it when the next pod you process from the scheduler queue has a higher preemption priority. Within a single priority, you can do round-robin (or any other kind of fairness approach) among the pending pods with that priority.

We can get the effect of something like your deadline policy by guaranteeing that some minimum fraction of the scheduler's time will be spent on each priority level (when there are pending pods at that priority level). This represents a priority inversion relative to the policy described above (where we process pending pods strictly in priority order), but it shouldn't be a problem as long as it isn't needed very often.

The benefit of the approach I've described above is that it doesn't require any new knobs on the scheduler or pods (other than priority, which we need for preemption anyway).
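The Borg-style queue described above (strict priority order, FIFO among pods of equal priority) can be sketched in a few lines of Python. This is an illustrative sketch only; the class and names are hypothetical, not the actual kube-scheduler implementation:

```python
import heapq
import itertools

class SchedulingQueue:
    """Pending-pod queue: highest priority first, FIFO within a priority.

    Hypothetical sketch of the policy under discussion, not real
    kube-scheduler code.
    """
    def __init__(self):
        self._heap = []                 # entries are (-priority, seq, pod)
        self._seq = itertools.count()   # monotonic tie-breaker => FIFO order

    def push(self, pod, priority):
        # heapq is a min-heap, so negate priority to pop highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), pod))

    def pop(self):
        _, _, pod = heapq.heappop(self._heap)
        return pod

    def __len__(self):
        return len(self._heap)

q = SchedulingQueue()
q.push("batch-1", priority=0)
q.push("web-1", priority=10)
q.push("batch-2", priority=0)
q.push("web-2", priority=10)
order = [q.pop() for _ in range(len(q))]
# Highest priority drains first; equal priorities keep arrival order.
print(order)  # ['web-1', 'web-2', 'batch-1', 'batch-2']
```

Note the sequence counter: without it, two pods of equal priority would be compared directly by the heap, and FIFO order within a priority level would not be guaranteed.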
IIUC, with priority, the scheduler will maintain multiple queues, and only when a higher-priority queue is empty will pods in a lower-priority queue get scheduled? Setting a fixed amount of time that the scheduler spends on each priority level is intended to keep lower-priority pods from waiting too long. But 'too long' is different for each pod, and a fixed time set on the scheduler makes it the same for all pods, so I'd prefer to let the pod itself describe this requirement.
Yes, you can think of it that way (though an actual implementation would probably have a single queue sorted by priority). Though as I mentioned, you can add a rule that says you will occasionally allow lower-priority pods to jump the queue, to avoid starvation (and get something similar to the Deadline described in this proposal).
In practice, anything you allow pods to request has to be protected by a quota, otherwise people will just request the "best" of everything. The way Borg handles this is to have resource quota per priority level. This prevents users from setting the highest priority (or equivalently, shortest scheduling deadline) on all of their pods.
Once you implement things like equivalence classes and caching (#17390), it's very hard to calculate the amount of time spent trying to schedule a particular pod. Also, I think you're over-estimating people's ability to make reasonable decisions about how long they're willing to wait for their pod to get considered by the scheduler. On the other hand, I think the per-pod deadline concept is useful in the context of a deadline scheduler for handling batch jobs, where users say "I need this job to run within the next 12 hours, and it will take two hours to run" and then let the system figure out when to run it based on how many resources will be available at different times. But this is very different from having pods specify how often they should be examined by the scheduler when they are pending.
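The per-priority quota mentioned above (Borg's guard against everyone simply requesting the highest priority) amounts to an admission check against an aggregate budget per priority level. A minimal Python sketch with hypothetical names, not a real Kubernetes admission plugin:

```python
class PriorityQuota:
    """Per-priority CPU quota: admission fails once a priority level's
    aggregate requests would exceed its budget. Illustrative sketch only.
    """
    def __init__(self, quotas):
        self.quotas = dict(quotas)              # priority -> total CPU allowed
        self.usage = {p: 0 for p in self.quotas}

    def admit(self, priority, cpu_request):
        if priority not in self.quotas:
            return False    # no quota granted at this priority level
        if self.usage[priority] + cpu_request > self.quotas[priority]:
            return False    # would exceed this level's budget
        self.usage[priority] += cpu_request
        return True

# Scarce high-priority quota, generous low-priority quota.
quota = PriorityQuota({10: 4, 0: 100})
assert quota.admit(10, 3)          # fits within the high-priority budget
assert not quota.admit(10, 2)      # 3 + 2 > 4: rejected
assert quota.admit(0, 50)          # plenty of low-priority room
```

The effect is that high priority (or, equivalently, a short scheduling deadline) stays meaningful: users can only mark a bounded share of their workload as urgent.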
So, adding FIFO and RR features may be very useful in some usage scenarios, for example:
And since we will support multiple schedulers in the near future, people could deploy multiple schedulers with different configurations to meet their different scheduling requirements. So, personally, I think the approach in this proposal is much more flexible:
And we can add a
I'm wondering how deadline scheduling works; can you explain some more? On the other hand, is scheduling already the bottleneck at this level? If so, I believe we should put more resources into making it concurrent instead :-) By the way, I love the concept of a priority scheduler 👍 though it needs more effort on prioritized quota.
We don't need to measure the time the scheduler spends scheduling a pod, but rather the time a pod waits in the queue to get scheduled. Anyway, that's not the key point. I think the main divergence is whether to give the user more control or to let the scheduler decide. I'm positive about deadline scheduling. But I don't think FIFO is removed: when all pods have the same priority, it is FIFO.
Just my thoughts for discussion.
Just as @hurf put it:
Yes, I agree. I think the kinds of policies that are being discussed here (e.g. FIFO vs. RR vs. HPF) should be scheduler parameters, not pod parameters. We do need "priority" in each pod in order to implement preemption anyway, and we can use that priority as the signal to the scheduler for how to prioritize the pod in the scheduling queue if the scheduler is configured for HPF.
I still disagree about setting Deadline per-pod. I don't think the user will know how to set it, and the effect is not very visible (if a user's pod is pending, how can they tell whether the scheduler is re-evaluating the pod every second or every minute?). I think it makes more sense to make it something the scheduler decides. For example, the scheduler can use HPF but occasionally check lower-priority pods to avoid starvation. And like I said before, I think a "deadline scheduler" in the batch-scheduler sense (like http://research.microsoft.com/apps/pubs/default.aspx?id=192091 ) would be very useful. It's also easier to solve the "what prevents every user from asking for the soonest deadline" problem there, because you can connect it to a billing mechanism that charges more money for sooner deadlines. But it's very different from the kind of deadline we're talking about here, which only controls how often the scheduler will evaluate the pod, not whether the pod will actually be able to start.
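The "HPF with occasional starvation checks" idea can be sketched as a queue that, on every k-th pop, serves the longest-waiting pod regardless of priority. A hypothetical Python sketch, not scheduler code; the value of `k` and all names are made up for illustration:

```python
import heapq
import itertools

class HPFQueue:
    """Highest-priority-first queue that serves the longest-waiting pod
    on every k-th pop, so low-priority pods are never starved forever.
    Illustrative sketch only.
    """
    def __init__(self, k=4):
        self.k = k
        self.pops = 0
        self.seq = itertools.count()
        self.heap = []                  # entries are (-priority, seq, pod)

    def push(self, pod, priority):
        heapq.heappush(self.heap, (-priority, next(self.seq), pod))

    def pop(self):
        self.pops += 1
        if self.pops % self.k == 0:
            # Starvation check: take the oldest arrival (smallest sequence
            # number), ignoring priority entirely.
            oldest = min(self.heap, key=lambda e: e[1])
            self.heap.remove(oldest)
            heapq.heapify(self.heap)
            return oldest[2]
        return heapq.heappop(self.heap)[2]

q = HPFQueue(k=3)
q.push("old-low", 0)                # arrives first, lowest priority
for i in range(5):
    q.push(f"hi-{i}", 10)
popped = [q.pop() for _ in range(6)]
print(popped)  # ['hi-0', 'hi-1', 'old-low', 'hi-2', 'hi-3', 'hi-4']
```

This is the priority inversion mentioned earlier in the thread: bounded (one pop in k), so it shouldn't matter as long as it isn't needed very often.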
Deadline in this PR doesn't control how often the scheduler will evaluate the pod; it is the deadline by which the pod should actually be scheduled. Once a pod's deadline expires, it gets the highest priority and will be scheduled soon.
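That escalation rule (an expired deadline lifts a pod to the front of the queue, ahead of any priority) can be expressed as a sort key. The field names below are hypothetical, for illustration only, not the actual fields from this PR:

```python
def queue_key(pod, now):
    """Sort key: expired-deadline pods first, then descending priority,
    then FIFO arrival order. `pod` is a hypothetical
    (name, priority, deadline, seq) tuple."""
    name, priority, deadline, seq = pod
    expired = 0 if now >= deadline else 1   # 0 sorts before 1
    return (expired, -priority, seq)

now = 100
pods = [
    ("web", 10, 500, 0),    # high priority, deadline far in the future
    ("batch", 0, 90, 1),    # low priority, deadline already passed
    ("cron", 5, 300, 2),
]
order = [p[0] for p in sorted(pods, key=lambda p: queue_key(p, now))]
# The expired 'batch' pod jumps the queue despite its low priority.
print(order)  # ['batch', 'web', 'cron']
```

Among non-expired pods the ordering degenerates to highest-priority-first with FIFO tie-breaking, so the deadline only changes behavior when it is actually missed.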
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. Otherwise, if this message is too spammy, please complain to ixdy.
This PR hasn't been active in 153 days. Feel free to reopen. You can add the 'keep-open' label to prevent this from happening again.
@davidopp @kevin-wangzefeng @HaiyangDING @mqliang