MetadataPolicy and its use in choosing the scheduler in multi-scheduler Kubernetes system#18262

Merged
k8s-github-robot merged 1 commit into kubernetes:master from davidopp:choose-scheduler
Feb 10, 2016

Conversation

@davidopp
Contributor

@davidopp davidopp commented Dec 6, 2015

@k8s-github-robot

Labelling this PR as size/L

@k8s-github-robot k8s-github-robot added kind/design Categorizes issue or PR as related to design. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 6, 2015
@k8s-bot

k8s-bot commented Dec 6, 2015

GCE e2e build/test failed for commit 9cf651d.

/s/sheduling/scheduling

Contributor Author

Thanks, fixed.

@hongchaodeng
Contributor

Thanks @davidopp . The PR looks great. Just have a few questions if you don't mind.

@davidopp davidopp changed the title PodPolicy and PodSchedulerPolicy admission controller proposal initial draft PodPolicy and PodSchedulerPolicy admission controller proposal Dec 7, 2015
@k8s-bot

k8s-bot commented Dec 7, 2015

GCE e2e test build/test passed for commit 692ce705c4557a6512e8faffff52506cc36a4edc.

Member

What values can this take?

Member

Can you show somewhere what the YAML would look like for setting a named scheduler as default scheduler? And suggest where this "intentionally generic" object would be extended to, for example, include network policy.

Contributor Author

What values can this take?

My intent is that it can take any string. Interpreting the string is the purview of the component that is using a PodPolicy. For example, for the puts-a-scheduler-name-annotation-on-a-pod admission controller, this string is the scheduler name to apply. In another component, the string may effectively be an enum value, triggering arbitrary behavior based on its value.

Can you show somewhere what the YAML would look like for setting a named scheduler as default scheduler?

Empty PodSelector matches all pods (in the namespace), so you could have the last PodPolicyRule in the list have a PolicyPredicate with an empty PodSelector; since rules are evaluated in order, this would have the effect of using the corresponding Policy for any pods that don't match any of the "real" rules.
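For illustration, using the field names from the current draft (which may still change), a catch-all default could look like the following; the scheduler names here are made up:

```yaml
kind: PodSchedulerPolicy
spec:
  policy:
    rules:
      # Rules are evaluated in order, so specific rules come first.
      - policyPredicate:
          podSelector:
            scheduler: custom
        policy: my-custom-scheduler
      # An empty podSelector matches every pod in the namespace, so
      # this last rule acts as the default for anything unmatched above.
      - policyPredicate:
          podSelector: {}
        policy: default-scheduler
```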

And suggest where this "intentionally generic" object would be extended to, for example, include network policy.

Sorry, my claims of "intentionally generic" are perhaps a bit overblown. My thought was simply that the "Policy" string could be used however the consumer wants; in the scheduler-chooser it's the name of the desired scheduler, while in another consumer it could be effectively an enum value. I didn't give any real thought to use cases outside of scheduler-chooser, I just kinda assumed a string would be good enough. I'm quite willing to believe that assumption is wrong. (BTW @bgrant0607 is the one who suggested to do something generic that could be reused by other components that need policies.)

Member

I think the YAML ends up as

kind: PodSchedulerPolicy
spec:
    policy:
        rules:
            - policyPredicate:
                  podSelector:
                      foo: bar
              policy: my-custom-scheduler

That's a WHOLE LOT of indent for a string. It looks like you could kill one level of nesting with no loss of info or flexibility?

kind: PodSchedulerPolicy
spec:
    policy:
        - predicate:
              podSelector:
                  foo: bar
          value: my-custom-scheduler

@davidopp
Contributor Author

davidopp commented Dec 7, 2015

I pushed a new commit to address the comments @thockin had on the confusing wording in the description. I didn't actually change the design yet, though I do like his suggestion for an alternative design. Interested to see what others have to say.

@k8s-bot

k8s-bot commented Dec 7, 2015

GCE e2e test build/test passed for commit 2d76dfb45a65d5adc0c9c82eaa1f37d26f387eca.

@bgrant0607
Member

cc @pmorie @pweil-

@davidopp
Contributor Author

davidopp commented Dec 8, 2015

I've revamped the proposal based on the feedback from @bgrant0607 and @thockin. PTAL.

@davidopp davidopp changed the title PodPolicy and PodSchedulerPolicy admission controller proposal PodPolicy and its use for setting scheduler in multi-scheduler Kubernetes system Dec 8, 2015
@davidopp davidopp changed the title PodPolicy and its use for setting scheduler in multi-scheduler Kubernetes system PodPolicy and its use for setting scheduler name in multi-scheduler Kubernetes system Dec 8, 2015
@davidopp davidopp changed the title PodPolicy and its use for setting scheduler name in multi-scheduler Kubernetes system PodPolicy and its use in choosing the scheduler in multi-scheduler Kubernetes system Dec 8, 2015
@k8s-bot

k8s-bot commented Dec 8, 2015

GCE e2e build/test failed for commit ab53955d05464ab2d595f66d1a0674ed15916ca7.

@k8s-bot

k8s-bot commented Dec 8, 2015

GCE e2e build/test failed for commit 3dd572a50dccdb11a9dfa8fbd487214acfce0e3f.

@k8s-bot

k8s-bot commented Dec 8, 2015

GCE e2e build/test failed for commit 26b2c0abcf539622a98368d7c60d98a1695f61ec.

@davidopp
Contributor Author

davidopp commented Dec 9, 2015

It occurred to me that, at least for the scheduler-picking case, it could be useful to allow multiple actions of the same type per PodPolicyRule. The semantics would be "pick one of these randomly." So for the scheduler use case, you could give several different annotations (all for the scheduler-name key), and the consumer of the PodPolicy would interpret that as "pick one of these at random." This would be useful if you are running multiple replicas of the same scheduler for performance reasons. Another approach would be to make the PolicyPredicate a bit more expressive, e.g. one PolicyPredicate for "hash of the pod is less than the midpoint of your hash range" and another for "hash of the pod is greater than or equal to the midpoint." You could then assign a different scheduler to each PolicyPredicate and get roughly the same random behavior (assuming the pods differ, so you get different hashes).
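A rough sketch of the hash-partitioning idea (the function name, hash choice, and scheduler names are illustrative, not part of the proposal):

```python
import hashlib


def choose_scheduler(pod_name, schedulers):
    """Pick a scheduler by hashing the pod name into equal-width buckets.

    This emulates the "hash of the pod is below/above the midpoint"
    PolicyPredicate idea: pods spread roughly evenly across scheduler
    replicas, and the choice is deterministic for a given pod name.
    """
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(pod_name.encode("utf-8")).digest()
    bucket = digest[0] * len(schedulers) // 256
    return schedulers[bucket]


schedulers = ["scheduler-a", "scheduler-b"]
# Deterministic: the same pod always maps to the same scheduler replica.
assert choose_scheduler("web-1", schedulers) == choose_scheduler("web-1", schedulers)
```

Unlike the pick-one-at-random semantics, this variant keeps the assignment stable across re-evaluations, which matters if the policy is ever re-applied to the same pod.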

@HaiyangDING

Cool, I will start the implementation once the PR gets merged. I need some time to catch up.

@hongchaodeng
Contributor

(1) and (2) sound like a conceptual difference. From an engineering point of view, both are going to need a central controller or registry to guard against concurrency issues.

If a scheduler fails, the schedulers would have to detect this and repartition the space.

I want to point out resharding logic should be in a separate layer instead of going into scheduler.

I also want to point out that the goal has drifted since the beginning. Initially, this was about enabling flexibility in scheduling policies. Now we are going much further and talking about fault tolerance and load balancing. To be honest, it would be better to separate these issues and deal with them one by one.

Member

Please remove this comment

@davidopp
Contributor Author

davidopp commented Feb 8, 2016

Incorporated reviewer comments, PTAL.

@k8s-bot

k8s-bot commented Feb 8, 2016

GCE e2e test build/test passed for commit a1df7444e59a1073adcc60f188756b1556549734.

Member

typo evaluation is

@derekwaynecarr
Member

Should this be called "PodMetadataPolicy"? I anticipate similar things may be needed for PersistentVolumeClaim in the future.

@bgrant0607
Member

@derekwaynecarr I wasn't thinking of this as specific to pods.

@bgrant0607
Member

LGTM. Please rebase, run hack/update-generated-docs.sh, squash, and apply the lgtm label.

@derekwaynecarr
Member

The comments in Godoc made this seem explicit to just pods.

@davidopp
Contributor Author

@derekwaynecarr has a good point -- I say "pod" all over the place, never generalized it from the initial version which was going to be just for pods. I'll fix it (haven't merged yet).

@davidopp davidopp added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 10, 2016
@k8s-github-robot k8s-github-robot removed the kind/design Categorizes issue or PR as related to design. label Feb 10, 2016
@k8s-github-robot

Automatic merge from submit-queue

k8s-github-robot pushed a commit that referenced this pull request Feb 10, 2016
@k8s-github-robot k8s-github-robot merged commit b3a84d4 into kubernetes:master Feb 10, 2016
@k8s-bot

k8s-bot commented Feb 10, 2016

GCE e2e build/test failed for commit 05dcf74.

@HaiyangDING

Now that this has been merged, I will send relevant PR asap.

@davidopp davidopp deleted the choose-scheduler branch February 26, 2016 22:45
@bgrant0607 bgrant0607 added the sig/service-catalog Categorizes an issue or PR as relevant to SIG Service Catalog. label Nov 2, 2016
@starsdeep starsdeep mentioned this pull request Jan 1, 2017
@jason-riddle
Contributor

Now that the proposal has been accepted, has there been any discussion around actually implementing MetadataPolicy?

@davidopp
Contributor Author

davidopp commented Apr 5, 2017

TBH I have not seen any demand for MetadataPolicy. I suspect that people who are using multiple schedulers just write their own admission controller that hard-codes the policy for setting schedulerName (or reads a policy from some custom configuration mechanism they set up themselves).
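A hard-coded admission step of that kind could be very small. Purely as an illustrative sketch (this is not an actual Kubernetes admission plugin; the label key and scheduler names are made up), operating on a pod manifest as a dict:

```python
def set_scheduler_name(pod):
    """Mutate a pod manifest (as a dict) to pin it to a scheduler.

    A hard-coded stand-in for what a custom admission controller might
    do: pods labeled for the custom scheduler get its name, and
    everything else falls through to the default scheduler.
    """
    labels = pod.get("metadata", {}).get("labels", {})
    spec = pod.setdefault("spec", {})
    if labels.get("scheduler") == "custom":
        spec["schedulerName"] = "my-custom-scheduler"
    else:
        # Only fill in a default if the pod did not already request one.
        spec.setdefault("schedulerName", "default-scheduler")
    return pod


pod = {"metadata": {"labels": {"scheduler": "custom"}}, "spec": {}}
set_scheduler_name(pod)
```

In practice the mapping would come from whatever custom configuration mechanism the cluster operator sets up, rather than being literally hard-coded as here.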

@smarterclayton
Contributor

smarterclayton commented Apr 5, 2017 via email

Labels

lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/service-catalog Categorizes an issue or PR as relevant to SIG Service Catalog. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
