Adding concept of dataset to package #110
In the future, Metricbeat / Filebeat and the agent will only support inputs, which makes inputs a first-class citizen in our stack. An input is basically an agent configuration plus an ingest pipeline. At the moment, the package content is focused on having a config for the Beat or agent and all pipelines in one place. This complicates two things:

* Knowing which ingest pipeline belongs to a specific input
* Building integrations with multiple inputs: https://github.com/elastic/integrations/pulls

Having the concept of an input in the package could simplify things, as the package builder no longer has to avoid naming conflicts between ingest pipelines by introducing extra-long names. It should also simplify testing, as testing is often focused on inputs. With this change, all assets related to an input live together.

As part of this PR, there is an example of what such an input structure could look like. This should not replace the old location of ingest pipelines. If a user wants to build a package with just an ingest pipeline but no input, this should also be possible in the future.

The changed structure is described in the ASSET.md file.
55698dc to 236b3d9
@@ -0,0 +1,14 @@
# This is not an array on purpose to make sure only 1 single input is specified in this file.
I have been thinking about this a bit more. A dataset is basically a template for an input with all its assets. All inputs with the dataset access.log look exactly the same in the end. An input can exist multiple times, but it is still the same dataset. So I think this fits well here. Will rename.
Just pushed a commit with renaming it. One thing I realised is that the agent now does not have an input anymore, but only streams. So the above was renamed to agent/stream/config.yml. @ph Does this sound correct?
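To make the "dataset as a template" idea above concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the class names `Dataset` and `Input`, the helper `instantiate`, the `paths` setting, and the pipeline file name are all hypothetical and not part of the package format. Only the dataset name access.log, `type: metric`, `release: beta`, and the path agent/stream/config.yml come from this PR.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """Hypothetical model: a dataset bundles all assets that belong to one input."""
    name: str                 # e.g. "access.log"
    type: str                 # e.g. "metric", as in the manifest under review
    release: str              # e.g. "beta"; each dataset has its own status
    stream_config: str        # e.g. "agent/stream/config.yml"
    ingest_pipelines: tuple   # pipeline file names (illustrative only)


@dataclass
class Input:
    """Hypothetical concrete input created from a dataset template."""
    dataset: Dataset
    settings: dict = field(default_factory=dict)


def instantiate(dataset: Dataset, **settings) -> Input:
    """Create one input from the template; many inputs can share one dataset."""
    return Input(dataset=dataset, settings=dict(settings))


access_log = Dataset(
    name="access.log",
    type="metric",
    release="beta",
    stream_config="agent/stream/config.yml",
    ingest_pipelines=("default.json",),  # assumed file name
)

# Two inputs with different user settings, both backed by the same dataset.
first = instantiate(access_log, paths=["/var/log/a/access.log"])
second = instantiate(access_log, paths=["/var/log/b/access.log"])
```

Both inputs point at the same `Dataset` object, which mirrors the point that an input can exist multiple times while the dataset stays the same.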
type: metric

# Each input can be in its own release status
release: beta
@hbharding Some inputs can also be in beta. We probably need some design to indicate this on the create data source page where inputs can be enabled / disabled.
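As a rough sketch of how a UI could consume the per-input release status, the Python snippet below groups manifests by their `release` value. The `name` key, the dataset names, and the `ga` value are assumptions made for illustration; only `type: metric` and `release: beta` appear in the manifest in this PR.

```python
from collections import defaultdict


def group_by_release(manifests):
    """Group dataset names by their release status, keeping input order."""
    groups = defaultdict(list)
    for manifest in manifests:
        groups[manifest["release"]].append(manifest["name"])
    return dict(groups)


# Hypothetical manifests, already parsed into dictionaries.
manifests = [
    {"name": "dataset-a", "type": "metric", "release": "beta"},
    {"name": "dataset-b", "type": "metric", "release": "ga"},  # "ga" is assumed
]

by_release = group_by_release(manifests)
```

A create-data-source page could then use `by_release.get("beta", [])` to decide which inputs need a beta indicator.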
To have an example of a dataset in the repository for further discussion, I will merge this PR. There are still open questions around naming (dataset vs input), but the basic structure should stay the same. Having it in the repository will allow the EPM team to start implementing the structure, and we will get feedback on whether it works as expected. Also, all future changes will be documented.