Validate content of folder items#41
Conversation
versions/1/manifest.spec.yml
Outdated
| - Collecting Apache access and error logs | ||
| vars: | ||
| $ref: "./dataset/manifest.spec.yml#streams/items/properties/vars" | ||
| description: Input variables. |
There was a problem hiding this comment.
Aren't we duplicating this spec? That's why I had the $ref originally.
There was a problem hiding this comment.
Yes, but the provided loader doesn't support the statik filesystem. I can try to implement a custom resource loader.
Also, the library enforces a canonical path here with schema, so it would be: file://./dataset/manifest.spec.yml#streams/items/properties/vars
There was a problem hiding this comment.
I see. I think the duplication is fine for now but we should probably make comments in both places pointing to the other as a duplicate that needs to be kept in sync whenever any one of the places is changed. Longer term, let's look into implementing a custom resource loader.
There was a problem hiding this comment.
I will try to focus on this as followup. Got interested with this :)
...alidator/test/packages/good/dataset/foo/_dev/test/pipeline/test-access-raw.log-expected.json
Show resolved
Hide resolved
|
|
||
| func (s *folderItemSpec) validate(fs http.FileSystem, folderSpecPath string, itemPath string) ValidationErrors { | ||
| if s.Ref == "" { | ||
| return nil // no item's schema defined |
There was a problem hiding this comment.
What is the schema is defined inline and not referenced via $ref?
There was a problem hiding this comment.
not sure if I follow you...
this might be bad wording, should be: schema reference not provided. Does it sound better?
There was a problem hiding this comment.
Sorry, I was referring to needing something like this: https://github.com/elastic/package-spec/blob/master/code/go/internal/validator/folder.go#L125-L141. Maybe this is not the right place for it but then it should be added where this method is called from.
There was a problem hiding this comment.
Ah, now I see... I totally missed the case where schema is defined inline, didn't know about this. I will improve the implementation.
There was a problem hiding this comment.
@ycombinator Hm.. I dived into this topic and got a bit confused. Do we really need it? What would a use case for this? This $ref refers to a place in folder schema in which we refer to item's schema. It's not part of the JSON-schema, so I believe we can enforce to keep it in a separate file (to separate concerns and prevent from creating a huge all-in-one files).
tl;dr $ref in folder schema is always external.
WDYT?
There was a problem hiding this comment.
So in the folder schema, there can be two types of items: folder (for describing the structure of a sub-folder within the folder) and file (for describing the structure of a file within the folder).
For type: folder, we already have a couple of places in the spec where use $ref (example) and a couple other places where we inline the definition instead of using $ref (example).
For type: file, I checked, and at the moment we only have places where we use $ref (example). We do not yet have any places where we are inlining the spec (aka schema) for the file itself. However, I can see this being useful for simple file specs (example).
So I think there is a use case for it and we should support it.
Also, if we allowing inline specs for sub-folder items but not for file items, we are introducing an inconsistency. We could, of course, resolve this inconsistency by going the other way: that is, we can say that we only allow $refs for item specs, regardless of whether those are folder items or file items; that we do not accept inline specs for either item type. I think this is slightly inconvenient when the item has a small spec but I'd be okay with this approach too. If you prefer this, then let's make a separate PR to first remove support for inline specs for type: folder items and adjust the spec files accordingly.
| return nil, errors.Wrapf(err, "schema unmarshalling failed (path: %s)", itemSchemaPath) | ||
| } | ||
|
|
||
| schemaData, err := json.Marshal(&schema.Spec) |
There was a problem hiding this comment.
Not for this PR but we could probably convert the schema YAML files to JSON as part of the make update step so we don't have to do this yaml.Unmarshal + json.Marshal dance every time at run time.
There was a problem hiding this comment.
Yes, this could be an improvement in the future (convertion is always error prone), although I admin that YAML here is much more human readable and easier to interact.
There was a problem hiding this comment.
Indeed, which is why we'd keep the source spec files in YAML but only convert them to JSON when make update is run.
|
Ok, I fixed the issue with schema referenced inside item specs. The solution is that all "exported" elements must be specified under definitions. Then we can refer to them (out of the box, without JSON pointer magic) via:
(for input vars) |
Hmm, according to https://json-schema.org/understanding-json-schema/structuring.html it's not required to put the "exported" elements under
Maybe this is something required by the library we are using for validating JSON schema. Anyway, good find and it's actually quite nice to have a well-defined location for "exported" vars so 👍 all around! |
ycombinator
left a comment
There was a problem hiding this comment.
LGTM. Thanks for implementing this feature — it's going to fill a major gap in our validation.
I'm opening this as draft for visibility of the progress and changes.
Fixes: #38
This PR adds support for validating file content (JSON or YAML) against JSON-schema.
TODO:
Notes:
... are not extended to:
If there is a quick and nice fix, I can apply it. At the moment I enforced the upper version.