Skip to content

apimachinery: add StringArrayOrString with json/yaml/proto marshaling#50625

Closed
sttts wants to merge 2 commits intokubernetes:masterfrom
sttts:sttts-StringArrayOrString
Closed

apimachinery: add StringArrayOrString with json/yaml/proto marshaling#50625
sttts wants to merge 2 commits intokubernetes:masterfrom
sttts:sttts-StringArrayOrString

Conversation

@sttts
Copy link
Copy Markdown
Contributor

@sttts sttts commented Aug 14, 2017

In analogy to IntOrString this allows to have either a string or a string array in JSON/YAML/proto, needed by CRD validation #47263.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 14, 2017
@k8s-github-robot k8s-github-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 14, 2017
@sttts sttts assigned liggitt and unassigned dchen1107 Aug 14, 2017
@sttts sttts added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Aug 14, 2017
@sttts sttts force-pushed the sttts-StringArrayOrString branch 2 times, most recently from ec99c77 to c8566ca Compare August 15, 2017 07:01
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 15, 2017
@sttts sttts added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 15, 2017
@sttts sttts force-pushed the sttts-StringArrayOrString branch from c8566ca to 6116318 Compare August 15, 2017 07:23
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 15, 2017
@sttts sttts force-pushed the sttts-StringArrayOrString branch from 6116318 to 4181cf1 Compare August 15, 2017 09:59
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 15, 2017

/retest

func (arraystr *StringArrayOrString) ArrayValue() []string {
if arraystr.Type == String {
if arraystr.StrVal == "" {
return nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nil instead of a zero length slice?

input string
result StringArrayOrStringHolder
}{
{`{"val1": []}`, StringArrayOrStringHolder{SAOrS: FromString("")}},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This surprised me. Serialized as an array, not a string?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This surprised me. Serialized as an array, not a string?

Oh, unmarshal test

input string
result StringArrayOrStringHolder
}{
{`{"val1": []}`, StringArrayOrStringHolder{SAOrS: FromString("")}},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test unmarshalling an empty string please.

@deads2k
Copy link
Copy Markdown
Contributor

deads2k commented Aug 15, 2017

Request for an additional test. lgtm otherwise

/approve no-issue

@sttts sttts force-pushed the sttts-StringArrayOrString branch from 4181cf1 to 194cd5f Compare August 15, 2017 13:15
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 15, 2017

Test added.

@sttts sttts added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 15, 2017
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 15, 2017

/approve no-issue

@sttts sttts assigned erictune and unassigned erictune Aug 15, 2017
@deads2k
Copy link
Copy Markdown
Contributor

deads2k commented Aug 15, 2017

/lgtm

@smarterclayton you should look.

@enisoc
Copy link
Copy Markdown
Contributor

enisoc commented Aug 16, 2017

Another alternative is to say our "JSON Schema" differs slightly from the official spec, such as having to always use a string array in fields such as this, even if you only have one string (assuming there is no semantic difference between a string and an array with one entry). OpenAPI does this too; their docs explain the minor differences between OpenAPI Schema and the JSON Schema spec.

The ability to exert such control over our own API was after all one of the justifications for importing the structs. If we wanted to match the spec 100%, it may have made more sense to just store the JSON Schema as a string.

@k8s-github-robot
Copy link
Copy Markdown

@sttts PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 16, 2017
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 17, 2017

@enisoc I am still not convinced that something like StringArrayOrString is bad. What are the issues that everybody expects?

Short term in fact we could just fall back to "string only". This is what OpenAPI v3 does and we only support v3 now anyway. But IMO this is a very stupid restriction in the spec and I expect very much hat v3+ will support more than one type. But by that time we can reconsider.

@enisoc
Copy link
Copy Markdown
Contributor

enisoc commented Aug 17, 2017

@sttts I was mostly just trying to give a hypothetical answer to the question, "What's the alternative?" I'm not familiar enough with the pain experienced with IntOrString to have an opinion on whether StringArrayOrString will cause more pain.

@smarterclayton You mentioned pain with OpenAPI. Were you referring to this thread? It seems like that issue will go away eventually when we switch to OpenAPI v3.

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 17, 2017

It seems like that issue will go away eventually when we switch to OpenAPI v3.

We still don't have the type field as array in OpenAPI v3. We have dropped this now from our JSON Schema types. So the unmarshaler will complain if you try an array. If v4 finally adds arrays, we have to reconsider.

@enisoc
Copy link
Copy Markdown
Contributor

enisoc commented Aug 17, 2017

@sttts In the thread, @mbohlool suggested you could work around the specific issue raised there with oneOf. Would that not work?

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 17, 2017

@mbohlool suggested you could work around the specific issue raised there with oneOf. Would that not work?

We are on the other side of the problem: we offer the API, but our users won't be able to use an array for types. Actually, it's not oneOf, but anyOf that can replace the array, it's just ugly. Here we did a comparison of all those variants of JSON Schema: https://docs.google.com/spreadsheets/d/1Mkm9L7CXGvRorV0Cr4Vwfu0DH7XRi24YHPiDK1NZWo4/edit?usp=sharing

@enisoc
Copy link
Copy Markdown
Contributor

enisoc commented Aug 17, 2017

Sorry if I'm creating confusion. I didn't mean to suggest that oneOf/anyOf could fix the problem faced by CRD validation.

I was trying to say, if it's true that (1) the problem in that mailing list thread is the only known pain caused by IntOrString, and (2) the problem in that thread could be resolved by going to OpenAPI V3, then it would imply that there are actually no known, long-term problems with IntOrString. Therefore we would have no reason to suspect adding StringArrayOrString will cause long-term pain.

@mml
Copy link
Copy Markdown
Contributor

mml commented Aug 17, 2017

cc @mbohlool @jpbetz

@mbohlool mbohlool added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Aug 17, 2017
@bgrant0607
Copy link
Copy Markdown
Member

@sttts We have absolutely no obligation to allow all of JSON / JSONSchema. The K8s API uses its own conventions, and even some of those we regret. Has anyone attempted to determine which features are exercised by the current K8s API surface?

Are there any examples of StringArrayOrString in the K8s core v1 API? IIRC, we explicitly decided not to support that for command, for instance.

We do have the problem that not all of the K8s API can be described by the current OpenAPI spec. At least that was true when I last checked.

cc @mbohlool

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 18, 2017

We have absolutely no obligation to allow all of JSON / JSONSchema. The K8s API uses its own conventions

That's a good second view point. We use OpenAPI v3 to decribe kube's API (lower bound), but here we want to define the JSON Schema subset to specify APIs following the kube API conventions. Wondering whether we can derive any upper bound from the conventions we envisions also for user-provided CRDs. Sounds like a nice topic for bachelor thesis (hint!). kubeapi-lint.

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 18, 2017

no known, long-term problems with IntOrString

At least not from the spec point of view: {anyOf: [{type: "string"}, {type: "integer"}]} should be equivalent to {type:["string", "integer"]} in v3, not possible in v2.

@mbohlool
Copy link
Copy Markdown
Contributor

kubeapi-lint.

#44005

@mbohlool
Copy link
Copy Markdown
Contributor

mbohlool commented Aug 18, 2017

Let me see if I understand the problem correctly. User provides a validation schema that we will use to validate CRs. We want them to be able to provide multiple types. Currently we support OpenAPI v3 schema in CRD validation, so they can do things like '{anyOf: [{type: "string"}, {type: "integer"}]}' but they can't do {type:["string", "integer"]}.

If the above is correct, what if we do a post-processing on the provided spec and convert {type:["string", "integer"]} to {anyOf: [{type: "string"}, {type: "integer"}]} so it would be a valid OpenAPI v3 spec? I am not a fan of this solution but if we must compromise, it is the closest one to a good solution. I don't like it because it is just a nice to have feature and we are hacking around to let user do it. Also if we don't do it now, it would be always backward compatible to add it. OpenAPI folks decided not to do this as it make life harder for tools to consume it. The whole discussion can be found in this issue (and the referenced ones).

Regarding StringArrayOrString, if the only problem with IntOrString is OpenAPI v2 spec, we should be able to proceed with it (The number of open and closed issues with IntOrString, however, make me nervous). I think we are going to support v3 soon (not soon enough for 1.8, but soon). In v2, we should just represent it as Object. That means we don't do any validation on it. We are post-processing our spec to do that for some types here.

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 18, 2017

@mbohlool we could do that transformation. But as we restrict ourselves to OpenAPI v3 now in CRD validation, the problem is no problem anymore. I.e. we don't need StringArrayOrString. Our approach of validation was that we wanted to accept raw JSON Schema and then refuse certain constructs if they don't fall into OpenAPI v3. But this is essentially cosmetics and only a matter of another error message. So we switched to plain string for type, away from StringArrayOrString.

@bgrant0607
Copy link
Copy Markdown
Member

If we want common tooling, such as get, describe, schema validation, apply, etc., to work across built-in types and CRDs, then CRDs need to follow K8s API conventions, and the provided schema needs to support the same extensions we're adding for the built-in types.

cc @pwittrock

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Aug 18, 2017

and the provided schema needs to support the same extensions we're adding for the built-in types.

This is on the roadmap, together with a common definition namespace such that you can reference e.g. a PodSpec from a CRD. Would be great to get a proposal done for that early in the 1.9 dev cycle.

@lavalamp
Copy link
Copy Markdown
Contributor

Before looking in detail, let me say that IntOrString is terrible and we should not add more things like it :)

@lavalamp
Copy link
Copy Markdown
Contributor

Why do we need to represent the jsonschema in go types at all? Do we have go code that operates on it? Why not store it in a RawExtension type (i.e., just pass-through the json).

@lavalamp
Copy link
Copy Markdown
Contributor

On IntOrString: when we made it, we thought we were being nice to humans who were going to write yaml/json by hand directly. But in hindsight, I think it was not worth it. It confuses everyone who touches it. No type spec system represents it well, excepting OpenAPI3, which we don't use yet. Automation systems don't deal with it well, either, e.g. Google's Deployment Manager. It's especially confusing that "12" and 12 weren't the same thing for a while, but now I think they are, but "12%" still isn't.

There was really no good reason not to make a struct {name, number} for port identification. An extra level in a yaml file really isn't a big deal.

Basically, it violates the principle of least surprise for all the automation systems that have to deal with the API, without adding much benefit for users who are directly writing API objects.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 23, 2017
k8s-github-robot pushed a commit that referenced this pull request Aug 30, 2017
Automatic merge from submit-queue

apiextensions: validation for customresources

- [x] Add types for validation of CustomResources
- [x] Fix conversion-gen: #49747
- [x] Fix defaulter-gen: kubernetes/gengo#61
- [x] Convert to OpenAPI types
- [x] Validate CR using go-openapi
- [x] Validate CRD Schema
- [x] Add integration tests
- [x] Fix round trip tests: #51204 
- [x] Add custom fuzzer functions
- [x] Add custom conversion functions
- [x] Fix data race while updating CRD: #50098 
- [x] Add feature gate for CustomResourceValidation
- [x] Fix protobuf generation

Proposal: kubernetes/community#708
Additional discussion: #49879, #50625

**Release note**:

```release-note
Add validation for CustomResources via JSON Schema.
```

/cc @sttts @deads2k
@liggitt
Copy link
Copy Markdown
Member

liggitt commented Sep 1, 2017

this is no longer needed, correct?

@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Sep 1, 2017

Correct.

@sttts sttts closed this Sep 1, 2017
sttts pushed a commit to sttts/apiextensions-apiserver that referenced this pull request Sep 22, 2017
Automatic merge from submit-queue

apiextensions: validation for customresources

- [x] Add types for validation of CustomResources
- [x] Fix conversion-gen: #49747
- [x] Fix defaulter-gen: kubernetes/gengo#61
- [x] Convert to OpenAPI types
- [x] Validate CR using go-openapi
- [x] Validate CRD Schema
- [x] Add integration tests
- [x] Fix round trip tests: #51204
- [x] Add custom fuzzer functions
- [x] Add custom conversion functions
- [x] Fix data race while updating CRD: #50098
- [x] Add feature gate for CustomResourceValidation
- [x] Fix protobuf generation

Proposal: kubernetes/community#708
Additional discussion: kubernetes/kubernetes#49879, kubernetes/kubernetes#50625

**Release note**:

```release-note
Add validation for CustomResources via JSON Schema.
```

/cc @sttts @deads2k

Kubernetes-commit: 4457e43e7b789586096bfb564330295cf0438e70
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.