MetadataPolicy and its use in choosing the scheduler in a multi-scheduler Kubernetes system #18262
Conversation
Labelling this PR as size/L
GCE e2e build/test failed for commit 9cf651d. |
docs/proposals/choosing-scheduler.md
Thanks @davidopp. The PR looks great. Just have a few questions if you don't mind.
GCE e2e test build/test passed for commit 692ce705c4557a6512e8faffff52506cc36a4edc.
docs/proposals/choosing-scheduler.md
Can you show somewhere what the YAML would look like for setting a named scheduler as default scheduler? And suggest where this "intentionally generic" object would be extended to, for example, include network policy.
What values can this take?
My intent is that it can take any string. Interpreting the string is the purview of the component that is using a PodPolicy. For example, for the puts-a-scheduler-name-annotation-on-a-pod admission controller, this string is the scheduler name to apply. In another component, the string may effectively be an enum value, which triggers arbitrary behavior based on the value.
Can you show somewhere what the YAML would look like for setting a named scheduler as default scheduler?
Empty PodSelector matches all pods (in the namespace), so you could have the last PodPolicyRule in the list have a PolicyPredicate with an empty PodSelector; since rules are evaluated in order, this would have the effect of using the corresponding Policy for any pods that don't match any of the "real" rules.
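To make the catch-all idea concrete, here is a sketch (using the field names discussed in this proposal; the exact schema was still in flux) of a rule list whose final rule has an empty podSelector and therefore acts as the default:

```yaml
kind: PodSchedulerPolicy
spec:
  policy:
    rules:
    # Rules are evaluated in order: pods labeled foo=bar
    # go to my-custom-scheduler.
    - policyPredicate:
        podSelector:
          foo: bar
      policy: my-custom-scheduler
    # An empty podSelector matches every pod in the namespace,
    # so this last rule supplies the default for anything that
    # did not match a "real" rule above.
    - policyPredicate:
        podSelector: {}
      policy: default-scheduler
```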
And suggest where this "intentionally generic" object would be extended to, for example, include network policy.
Sorry, my claims of "intentionally generic" are perhaps a bit overblown. My thought was simply that the "Policy" string could be used however the consumer wants; in the scheduler-chooser it's the name of the desired scheduler, while in another consumer it could be effectively an enum value. I didn't give any real thought to use cases outside of scheduler-chooser, I just kinda assumed a string would be good enough. I'm quite willing to believe that assumption is wrong. (BTW @bgrant0607 is the one who suggested to do something generic that could be reused by other components that need policies.)
I think the YAML ends up as

```yaml
kind: PodSchedulerPolicy
spec:
  policy:
    rules:
    - policyPredicate:
        podSelector:
          foo: bar
      policy: my-custom-scheduler
```

That's a WHOLE LOT of indent for a string. It looks like you could kill one level of nesting with no loss of info or flexibility?
```yaml
kind: PodSchedulerPolicy
spec:
  policy:
  - predicate:
      podSelector:
        foo: bar
    value: my-custom-scheduler
```
I pushed a new commit to address the comments @thockin had on the confusing wording in the description. I didn't actually change the design yet, though I do like his suggestion for an alternative design. Interested to see what others have to say.
GCE e2e test build/test passed for commit 2d76dfb45a65d5adc0c9c82eaa1f37d26f387eca.
I've revamped the proposal based on the feedback from @bgrant0607 and @thockin. PTAL.
GCE e2e build/test failed for commit ab53955d05464ab2d595f66d1a0674ed15916ca7.
GCE e2e build/test failed for commit 3dd572a50dccdb11a9dfa8fbd487214acfce0e3f.
GCE e2e build/test failed for commit 26b2c0abcf539622a98368d7c60d98a1695f61ec.
It occurred to me that, at least for the scheduler-picking case, it could be useful to allow multiple of the same type of action per PodPolicyRule. The semantics would be "pick one of these randomly." So for the scheduler use case, you could give several different annotations (all for the scheduler name key) and the consumer of the PodPolicy would interpret that as "pick one of these at random." This would be useful if you are running multiple replicas of the same scheduler for performance reasons.

Another approach would be to make the PolicyPredicate a bit more expressive, e.g. so that you could have one PolicyPredicate for "hash of the Pod is less than [midpoint of your hash range]" and another for "hash of the Pod is [greater than or equal to the midpoint of your hash range]." Then you could assign a different scheduler for each PolicyPredicate and sort of get the same random behavior (assuming the pods are different so you get different hashes).
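Both variants might look roughly like the following. This is purely a hypothetical sketch: neither the multi-value action nor the hash-range predicate syntax was ever specified in the proposal, and the field names here (notably podHashRange) are invented for illustration:

```yaml
# Hypothetical only; none of this syntax was finalized.
rules:
# Option A: multiple values for one action;
# the consumer picks one at random.
- policyPredicate:
    podSelector:
      app: web
  policy: [scheduler-replica-1, scheduler-replica-2]
# Option B: hash-range predicates splitting the pod
# hash space between two scheduler replicas.
- policyPredicate:
    podHashRange: {min: 0x00000000, max: 0x7fffffff}
  policy: scheduler-replica-1
- policyPredicate:
    podHashRange: {min: 0x80000000, max: 0xffffffff}
  policy: scheduler-replica-2
```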
Cool, I will start the implementation once the PR gets merged. I need some time to catch up.
(1) and (2) sound like a conceptual difference. From an engineering point of view, both are going to need a central controller or registry to guard against concurrency issues.
I want to point out that resharding logic should be in a separate layer instead of going into the scheduler. I also want to point out that the goal has shifted since the beginning. At the start, it was about enabling flexibility in scheduling policies. Now we are going much further and talking about fault tolerance and load balancing. To be honest, it would be better if we could separate the issues and deal with them one by one.
docs/proposals/metadata-policy.md
Incorporated reviewer comments, PTAL.
GCE e2e test build/test passed for commit a1df7444e59a1073adcc60f188756b1556549734.
docs/proposals/metadata-policy.md
Should this be called "PodMetadataPolicy"? I anticipate similar things may be needed for PersistentVolumeClaim in the future.
@derekwaynecarr I wasn't thinking of this as specific to pods.
LGTM. Please rebase, run hack/update-generated-docs.sh, squash, and apply the lgtm label.
The comments in Godoc made this seem specific to just pods.
@derekwaynecarr has a good point -- I say "pod" all over the place and never generalized it from the initial version, which was going to be just for pods. I'll fix it (haven't merged yet).
Automatic merge from submit-queue
Auto commit by PR queue bot
GCE e2e build/test failed for commit 05dcf74.
Now that this has been merged, I will send the relevant PR asap.
Now that the proposal has been accepted, has there been any discussion around actually implementing MetadataPolicy?
TBH I have not seen any demand for MetadataPolicy. I suspect that people who are using multiple schedulers just write their own admission controller that hard-codes the policy for setting schedulerName (or reads a policy from some custom configuration mechanism they set up themselves).
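For context, a pod can opt into a non-default scheduler via the spec.schedulerName field, which is what such an admission controller would set; a scheduler only picks up pods whose schedulerName matches its own. A minimal example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  # An admission controller could inject this field according
  # to whatever policy it implements; without it, the pod is
  # handled by the default scheduler.
  schedulerName: my-custom-scheduler
  containers:
  - name: app
    image: nginx
```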
I'd like to see discussion on PodSchedulerPolicy or similar that resolves:
1. Control over tolerations (RBAC or otherwise)
2. Control over node selector
3. Whether to support small-M "logical configs" for placement that are named and usable for human-focused UIs
4. Unifying the namespace placement policy defaulters with a real resource
5. Potentially controlling (or at least discussing the intersection with) which schedulers are available to select
ref/ #11793
ref/ #17097
ref/ #17324
@thockin @HaiyangDING @bgrant0607 @cameronbrunner @timothysc @hongchaodeng @mali11 @mqliang @derekwaynecarr