Add support for top level configuration#79
Conversation
ruflin
left a comment
There was a problem hiding this comment.
Overall LGTM. How will the files inside look exactly? Can there be more than 1 file? What is the content? Would be good to also directly specify this here. If not in the spec, in the PR description.
versions/1/spec.yml
Outdated
| type: folder | ||
| name: input | ||
| required: true | ||
| additionalContents: false |
There was a problem hiding this comment.
Related to @ruflin's comment, it would be good to flesh out contents here with a spec for the file(s) that could be be contained in this folder.
|
Thanks both, good idea. Added now, let me know what do you think |
versions/1/agent/spec.yml
Outdated
| required: true | ||
| additionalContents: false | ||
| contents: | ||
| - description: Package-level template file |
There was a problem hiding this comment.
Probably a better description needed here.
The part I was also looking for is to define, if the content of the file itself is an array or not. Also, are there some required like type? Let me give you 3 examples:
inputs:
- type: foo
value: bar
- type: foo
value: bar
type: foo
value: bar
Which option is it?
@ycombinator Do we have examples already where we also validate the content of certain assets?
There was a problem hiding this comment.
What I am trying spec out is the optional existence of /<package>/<version>/agent/input/template.yml.hbs. I was expecting to adhere to the same constraints than data streams templates: why a type or any other field must be defined inside? Is not up to the integration to make sense of whatever is in there?
There is nothing defined for data_stream/<name>/agent/stream/ contents that I can see either (even thou the default file name is stream.yml.hbs)
There was a problem hiding this comment.
On the stream.yml.hbs files the rules is, it can't be an array and alway only contain a single input. Is this the same for input templates? Is the format the last option from above?
There was a problem hiding this comment.
Ah, I wasn't aware of that. Yes, it would be the same then, last option.
There was a problem hiding this comment.
@jen-huang How will this work on the Kibana side. How will Kibana know which file to use? Convention? Or do we need to reference the file in the input part in the manifest.yml? As inputs is an array there, could there be multiple files in this directory?
There was a problem hiding this comment.
I don't have a preference about the actual term, but this is the concern:
manifest.yml
policy_templates:
- name: apm-server
inputs:
- type: apm
vars:
- name: name
default: my-default-name
template_path: ./agent/input/template.yml.hbsagent/input/template.yml.hbs
name: {{name}}elastic-agent.yml
inputs:
- id: 0e682c50-183a-11eb-916d-71d55143d422
name: ??????
revision: 1
type: apmWhat will be the value of the ?????? field? my-default-name, as per the template variable, or whatever the user set in UI when creating the integration?
Instead, forcing a (eg.) config key:
elastic-agent-2.yml
inputs:
- id: 0e682c50-183a-11eb-916d-71d55143d422
name: apm-1 # user defined in UI
revision: 1
type: apm
config:
# everything from the template
name: my-default-nameMakes sense?
There was a problem hiding this comment.
@jalvz I think the problem you describe is something that could be run into today with streams & stream templates. We generate streams in elastic-agent.yml like this:
streams:
- id: logfile-system.auth
data_stream:
dataset: system.auth
type: logs
paths:
- /var/log/auth.log*
- /var/log/secure*
...If the package author has id or data_stream.* fields in their stream template, we would run into the conflict problem you described.
For inputs, we have more of these kind of "reserved" generated field names:
inputs:
- id: b8bc5300-edfd-11ea-905a-819b5c00fe02
name: system-1
revision: 1
type: logfile
use_output: default
meta:
package:
name: system
version: 0.5.3
data_stream:
namespace: default
streams:
...I'm not sure if adding a config field at the input level is the right approach though, given same nature of the problem on the stream level. I guess for streams, we've relied on package authors not using id and data_stream.* in their templates. I wonder if this is something we can enforce via a blocklist during package validation? I recall we added package validation to enforce template syntax correctness, maybe a blocklist can be added there?
There was a problem hiding this comment.
I see @jen-huang, thanks.
In the spirit of developer-friendliness, I think we should prevent clashes from happening - for someone outside the ingest management team this problem is not obvious at all (in fact, I didn't realize it can happen with streams already) and consequences are pretty much unknown?
The issue with blocklists is to remember to update them when new fields are added, and more importantly how to make them work backwards: what if a new field foo is added to the spec but I already have it in a template?
I think this deserves a discussion before moving the spec to GA, maybe considering a breaking change. I agree with @ycombinator that the more strict the better, and currently there is no definition at all for the stream templates AFAICS.
Is there any other problem with requiring a single top level key in the templates (named config or whatever) other than it is not done for streams? Alternatively, Kibana could "inject" such key behind the scenes, based eg. on the template file name or something like that. But it wouldn't be so explicit.
WDYT?
There was a problem hiding this comment.
We definitively need the validation that these keys are not used and if a package uses them, it should be reject. We have a spec versioning, so if we add new fields as "reserved" we will increase the package spec and the new package will follow the new spec. I doubt that at the moment we have anything around that could not be fixed without a breaking change, but we should check. If we validate, we will find it.
Assuming we don't have any conflicts, what is the best format we should use?
There was a problem hiding this comment.
Sure, it will be detected but it will likely force the package maintainer to change the template and deal with 2 variants of the same configuration. To that end the best preventive action would be to come up with keys that will never ever crash (assuming you know a clash in a future version can happen), so why don't be explicit and require it upfront?
OTOH, json-schema is already meant for validation, so adding more validation alongisde means that the json-schema spec becomes less reliable (a developer might be easily confused after carefully following the spec and then find out that their package doesn't work).
Anyways, since that is a separate problem I filed #85 and removed the config bit from this PR, let me know how it looks now.
|
@jen-huang @ycombinator @ruflin does this look right now? something else missing? |
|
What we miss is the definition of the base format of the template file, see conversation above with @ycombinator . But maybe it is best to move forward even without it, so we can still change it (see comment below). @ycombinator We also have a bit a chicken / egg problem here. We are adding something new to the spec but haven't fully tested it with Kibana and an actual package yet, so it might still change. What is our best approach here? @jen-huang @ycombinator If you are fine with the changes, lets get them in. |
|
Ideally the spec will be defined first and it's changes rolled out into Also ideally the spec can be as strict as possible so as to catch as many issues as possible early on in a package's development. Given that, I would suggest for this PR:
|
jen-huang
left a comment
There was a problem hiding this comment.
LGTM for Kibana support
|
@jalvz Any chance you could post here the final content / structure of the template file? I think we are aligned but want to make sure we have it also written in YAML here. Seems there is a conflict with the generated file. |
mtojek
left a comment
There was a problem hiding this comment.
Correct me if I'm wrong, but I understand that this change is backward compatible (it's just an extension). We should be safe with adding it.
Please add a new folder with a sample package for testing purposes in:
https://github.com/elastic/package-spec/tree/master/code/go/internal/validator/test/packages
and enable it in this file: https://github.com/elastic/package-spec/blob/master/code/go/pkg/validator/validator_test.go
mtojek
left a comment
There was a problem hiding this comment.
I'm afraid that the CI still fails for this PR.
versions/1/agent/agent.spec.yml
Outdated
| spec: | ||
| # Everything under here follows JSON schema (https://json-schema.org/), written as YAML for readability | ||
| type: object | ||
| additionalProperties: true No newline at end of file |
There was a problem hiding this comment.
I understand that this schema file describes the *.yml.hbs file, which is Handlebars template. I think it's not possible to easily define a JSON schema for this file as the JSON format can be broken by template placeholders.
versions/1/agent/spec.yml
Outdated
| type: file | ||
| pattern: '^.+.yml.hbs$' | ||
| required: true | ||
| $ref: "./agent.spec.yml" No newline at end of file |
There was a problem hiding this comment.
As stated above, the .yml.hbs file is not a strict JSON/YAML file if you plan to use placeholders. You can't use (reference) a schema for this.
| @@ -0,0 +1 @@ | |||
| {} No newline at end of file | |||
There was a problem hiding this comment.
nit: It would be nice to put a real content in at least one file, so this template.yml.hbs won't be so mysterious.
There was a problem hiding this comment.
BTW, you could even put a short APM Example in here to make it more concrete.
There was a problem hiding this comment.
@ruflin I added 7673874, is not that what you were looking for?
I don't think I can use this file as a real hbs example because it is referenced in the test (correct me if Im wrong)
Also didn't want to add an APM-like config because this can be anything, it has exactly the same structure as the stream templates.
There was a problem hiding this comment.
Got it. I missed that you added two template files. The part I stumble over is that you used group. This is just an example I assume. It can all be on the top level like foo: bar?
If we could use apm example here, @mtojek will know best. My preference is always to have a real example if possible.
Follow up to elastic/package-spec#79. Kibana needs `template_path` to ascertain which input template file to read from to build the agent YAML. This PR lets the registry serve that field at the input level, if defined.
|
Hi all, I opened elastic/package-registry#655 to allow the package registry to serve the new |
Closes #70