Skip to content

✨ crd/marker: add AtMostOneOf and ExactlyOneOf constraints#1212

Merged
k8s-ci-robot merged 2 commits intokubernetes-sigs:mainfrom
shashankram:oneof
Jun 18, 2025
Merged

✨ crd/marker: add AtMostOneOf and ExactlyOneOf constraints#1212
k8s-ci-robot merged 2 commits intokubernetes-sigs:mainfrom
shashankram:oneof

Conversation

@shashankram
Copy link
Contributor

Adds the following validation markers to enforce oneof fields on Types:

  • AtMostOneOf: allows at most one of the specified fields, thereby allowing exactly 1 or no fields to be set. The marker may be repeated to allow mutually exclusive oneof constraints.

  • ExactlyOneOf: allows exactly one of the specified fields, thereby requiring exactly 1 field to be set. The marker may be repeated to allow mutually exclusive oneof constraints.

Examples:

allow at most oneof foo,bar on a Type to be set
+kubebuilder:validation:AtMostOneOf=foo;bar

allow at most oneof foo,bar and oneof baz,qux to be set
+kubebuilder:validation:AtMostOneOf=foo;ba
+kubebuilder:validation:AtMostOneOf=baz;qux

allow exactly oneof foo,bar on a Type to be set
+kubebuilder:validation:ExactlyOneOf=foo;bar

allow exactly oneof foo,bar and oneof baz,qux to be set
+kubebuilder:validation:ExactlyOneOf=foo;ba
+kubebuilder:validation:ExactlyOneOf=baz;qux

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 11, 2025
@k8s-ci-robot
Copy link
Contributor

Welcome @shashankram!

It looks like this is your first PR to kubernetes-sigs/controller-tools 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/controller-tools has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @shashankram. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 11, 2025
Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can achieve a way to improve the error messages to be more specific and helpful to end users. Presently the errors in the test output aren't fantastic to tell and end user what they did wrong

@shashankram
Copy link
Contributor Author

I wonder if we can achieve a way to improve the error messages to be more specific and helpful to end users. Presently the errors in the test output aren't fantastic to tell and end user what they did wrong

I agree, but I don't know of a way to do this. Given that these markers are optional, I believe they add value in enforcing structural schema for OneOf types. Even documentation on the types regarding the oneof constraints could help users.

@shashankram
Copy link
Contributor Author

I wonder if we can achieve a way to improve the error messages to be more specific and helpful to end users. Presently the errors in the test output aren't fantastic to tell and end user what they did wrong

I agree, but I don't know of a way to do this. Given that these markers are optional, I believe they add value in enforcing structural schema for OneOf types. Even documentation on the types regarding the oneof constraints could help users.

So the error messages from schema validation are coming from kube-openapi. I am checking to see if they can be improved. I don't think there's much we can do in controller-tools though.

@JoelSpeed
Copy link
Contributor

I wonder what other maintainers (cc @alvaroaleman @sbueringer ) would think of leveraging CEL rules to implement these, instead of using the anyOf/allOf/oneOf that we have in the PR at the moment.

If we leveraged these markers to generate CEL on behalf of the end user, we could customise the messages and provide a standard format that would be more helpful, IMO, than the existing error messages

A very simple way to achieve this behaviour would be to filter based on the names of the fields, though I'm not certain how the cost estimation would handle this

+kubebuilder:validation:XValidation:=rule="self.filter(x, x in ['foo', 'bar']).size() <= 1",message="at most one of 'foo', 'bar' should be set"

@alvaroaleman
Copy link
Member

I wonder what other maintainers (cc @alvaroaleman @sbueringer ) would think of leveraging CEL rules to implement these, instead of using the anyOf/allOf/oneOf that we have in the PR at the moment.

In principle that seems like a good idea, but if there aleady is a custom CEL rule at the same layer, how would we merge?

@JoelSpeed
Copy link
Contributor

In principle that seems like a good idea, but if there aleady is a custom CEL rule at the same layer, how would we merge?

CEL rules are a list, you can have as many as you like provided the runtime cost estimates do not exceed 10,000,000 cost units. I'll have to do a little research to determine if the cost analyzer is smart enough to determine the maximum size of the object based on the number of fields within it, if it is, then the likely cost of this rule would be sub 1000 cost units in even a bad case. I tested on the CEL playground and the cost it determined for a three field validation was 46 units for example.

@shashankram
Copy link
Contributor Author

I wonder what other maintainers (cc @alvaroaleman @sbueringer ) would think of leveraging CEL rules to implement these, instead of using the anyOf/allOf/oneOf that we have in the PR at the moment.

If we leveraged these markers to generate CEL on behalf of the end user, we could customise the messages and provide a standard format that would be more helpful, IMO, than the existing error messages

A very simple way to achieve this behaviour would be to filter based on the names of the fields, though I'm not certain how the cost estimation would handle this

+kubebuilder:validation:XValidation:=rule="self.filter(x, x in ['foo', 'bar']).size() <= 1",message="at most one of 'foo', 'bar' should be set"

I like the CEL approach, though my concern is that a rule like this would be within the complexity budget, and may suddenly not as new fields are added. There is also a complexity budget for the entire CRD, which could become a concern with a lot of such fields. If we can make kube-openapi present more user-friendly errors, that would be a big win. I am exploring that option.

@JoelSpeed
Copy link
Contributor

The filter suggestion I made doesn't actually work, because of the way structures are created in CEL within Kubernetes. They become a structured "selfType" and are not compatible with the filter expression.

There is an alternative though, which is easy to calculate the cost of, and definitely low cost!

+kubebuilder:validation:XValidation:=rule="[has(self.foo), has(self.bar), has(self.baz)].filter(x, x == true).size() <= 1",message="at most one of 'foo', 'bar', 'baz' should be set"

Filtering is proportional to the length of the list and we know that has a fixed size since we will control the number of elements within it.

Breaking down the cost elements of this (I've stepped through this with a debugger):

  • _ <= _ is the total of the cost of the lhs + rhs + 1. The rhs has zero cost (it's a constant), so we are at 1 + lhs.
  • .size() has a fixed cost of 1.
  • filter is the cost of the iterator range (the list input), the allocation of the variables, and then the comparison multiplied by the number of items in the input
  • Checking the field presence has(...) has a fixed cost, but there's a cost of 10 units to allocate the list, so [has(self.foo), has(self.bar), has(self.baz)] costs 12 units, and adding new elements would add 1 extra cost unit
  • The variable allocations also allocate a list, so that's another 10 units, this is constant cost
  • The loop cost is 0, this is constant
  • The step cost is 14, this is constant

So for a N item list, this becomes

1 + 1 + (10 + N) +  10 + (14*N) = 22 + 15N

So for a 2 item list, 52, 3 items 67, 10 items 172.

The per rule limit is 10,000,000 and the per CRD limit is 100,000,000.

I think the cost growth of this rule is small enough that we will be safe without worry.

@shashankram
Copy link
Contributor Author

@JoelSpeed I updated the impl to use CEL and added a test to verify the errors when validation fails

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is much cleaner! A few comments but nothing major now

Adds the following validation markers to enforce oneof fields
on Types:

- AtMostOneOf: allows at most one of the specified fields,
  thereby allowing exactly 1 or no fields to be set.
  The marker may be repeated to allow mutually exclusive oneof
  constraints.

- ExactlyOneOf: allows exactly one of the specified fields,
  thereby requiring exactly 1 field to be set.
  The marker may be repeated to allow mutually exclusive oneof
  constraints.

Examples:
```
allow at most oneof foo,bar on a Type to be set
+kubebuilder:validation:AtMostOneOf=foo;bar

allow at most oneof foo,bar and oneof baz,qux to be set
+kubebuilder:validation:AtMostOneOf=foo;ba
+kubebuilder:validation:AtMostOneOf=baz;qux

allow exactly oneof foo,bar on a Type to be set
+kubebuilder:validation:ExactlyOneOf=foo;bar

allow exactly oneof foo,bar and oneof baz,qux to be set
+kubebuilder:validation:ExactlyOneOf=foo;ba
+kubebuilder:validation:ExactlyOneOf=baz;qux
```
@JoelSpeed
Copy link
Contributor

/ok-to-test

/lgtm

@alvaroaleman @sbueringer What do you think of this one? It's not exposing anything a user couldn't do by themselves today, but is a useful syntactic sugar to help make it easier (and means folks don't have to think about what the correct CEL rule would be)

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 17, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

DetailsGit tree hash: 77bed8ff94a7368b674d7dc2b67e25f8cd4b2a8a

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@k8s-ci-robot k8s-ci-robot requested a review from JoelSpeed June 17, 2025 14:47
@alvaroaleman alvaroaleman added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 17, 2025
Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Idea seems fine, but looks like there are still some test failures

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

DetailsGit tree hash: b62b56cea9173cc30e11ba4a74169125a81ba5df

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 17, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 17, 2025
@shashankram
Copy link
Contributor Author

/approve

Idea seems fine, but looks like there are still some test failures

My bad, forgot to include the new test directory.
@alvaroaleman would you mind approving again? Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 18, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

DetailsGit tree hash: 71fba0d409bbd6449b4b2df031b0e4ebe77cef22

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, shashankram

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 7c982e0 into kubernetes-sigs:main Jun 18, 2025
7 checks passed
@shashankram shashankram deleted the oneof branch June 18, 2025 15:10
@shashankram
Copy link
Contributor Author

@JoelSpeed While the CEL based oneofs work fine, they do suffer from the CEL limitation where Arrays are deemed to exceed the complexity budget even with MaxItems constraint, so defining a Oneoff for a type within an array may be infeasible sometimes.

Should we enable an OpenAPI based toggle for these OneOfs that don't suffer from general CEL limitations?

@JoelSpeed
Copy link
Contributor

Can you provide an example that breaks the current implementation?

The CEL rule was deliberately crafted to not have a ballooning cost.

The only array involved in the CEL rule is the one we construct as part of the rule. It doesn't actually perform any array type operations on arrays from within the API.

See #1212 (comment) where I've broken down how the cost of these rules are calculated.

Even with 1000 items in the one of, the rule cost will still be manageable and well, well within the limit. Approx 15000 cost units.

@shashankram
Copy link
Contributor Author

shashankram commented Jun 25, 2025

Can you provide an example that breaks the current implementation?

I am not referring to the CEL rule for the constraint itself, but if the rule is applied on a type that is within an array, then CEL cost estimation could reject this.

@JoelSpeed
Copy link
Contributor

I am not referring to the CEL rule for the constraint itself, but if the rule is applied on a type that is within an array, then CEL cost estimation could reject this.

Ahh, so in that case yes, you're right, this could violate the cost estimation as we would estimate the cost of the rule multiplied by the worst case number of elements in the array. Hmm. On the one hand, preventing these complex validations in unbounded lists has a benefit, the API server still has to do the same complex validations even with the anyOf approach.

But I could see how that could be frustrating.

Perhaps we could add an optional parameter to the marker that uses the anyOf style behaviour instead of CEL, but default to CEL so that our defaults recommend something that is using cost estimation and fair sharing style semantics within the API server.

where Arrays are deemed to exceed the complexity budget even with MaxItems constraint

Thinking this through, the rule cost is the same as in my previous comment, but then the cardinality is set by the max length of the array.

So taking a 5 member AtMostOneOf, this would have a cost of 97. The maximum length of the array then would be 10,000,000 / 97 =~ 103092. Is this really likely to be an issue for people?

We could document that where these markers are used, parent arrays must have sensible maximum lengths. I'd be surprised if anyone was creating an API where they could enter 100000 items into an array, no?

@shashankram
Copy link
Contributor Author

I am not referring to the CEL rule for the constraint itself, but if the rule is applied on a type that is within an array, then CEL cost estimation could reject this.

Ahh, so in that case yes, you're right, this could violate the cost estimation as we would estimate the cost of the rule multiplied by the worst case number of elements in the array. Hmm. On the one hand, preventing these complex validations in unbounded lists has a benefit, the API server still has to do the same complex validations even with the anyOf approach.

But I could see how that could be frustrating.

Perhaps we could add an optional parameter to the marker that uses the anyOf style behaviour instead of CEL, but default to CEL so that our defaults recommend something that is using cost estimation and fair sharing style semantics within the API server.

I think this would be a reasonable alternative to enforce structural schema without needing to drop it due to CEL limitations. E.g., there are bugs in older k8s versions where CEL misbehaves and we are unable to use these oneofs there, or there may be APIs where adding MaxItems is not backward compatible and can thus exceed the complexity budget.

So the OpenAPI based OneOf/AnyOf constraints would be a good alternative despite the unfriendly error message.

where Arrays are deemed to exceed the complexity budget even with MaxItems constraint

Thinking this through, the rule cost is the same as in my previous comment, but then the cardinality is set by the max length of the array.

So taking a 5 member AtMostOneOf, this would have a cost of 97. The maximum length of the array then would be 10,000,000 / 97 =~ 103092. Is this really likely to be an issue for people?

We could document that where these markers are used, parent arrays must have sensible maximum lengths. I'd be surprised if anyone was creating an API where they could enter 100000 items into an array, no?

With deeply nested types containing arrays, I have seen CEL break down. In most cases with newer k8s versions, MaxItems should suffice.

@JoelSpeed
Copy link
Contributor

there are bugs in older k8s versions where CEL misbehaves

That sounds interesting, any links?

So the OpenAPI based OneOf/AnyOf constraints would be a good alternative despite the unfriendly error message.

Yeah so perhaps we take that route of a field on the marker useOpenAPIValidation=true or something like that. @alvaroaleman any opinions?

With deeply nested types containing arrays, I have seen CEL break down. In most cases with newer k8s versions, MaxItems should suffice

Yeah I guess when adding a compatible to change to an existing API, the argument for bounding all parents can be more difficult

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants