Adding concept of dataset to package by ruflin · Pull Request #110 · elastic/package-registry

ruflin · 2019-09-19T13:37:38Z

In the future, Metricbeat, Filebeat, and the agent will only support inputs, which makes inputs first-class citizens in our stack. An input is basically an agent configuration plus an ingest pipeline. At the moment, the package content is organized around having the config for the Beat or agent and all pipelines in one place. This complicates two things:

* Knowing which ingest pipeline belongs to a specific input
* Building integrations with multiple inputs: https://github.com/elastic/integrations/pulls

Having the concept of an input in the package could simplify things, as the package builder no longer has to avoid naming conflicts between ingest pipelines by introducing extra-long names. It should also simplify testing, since testing often focuses on a single input. With this structure, all assets related to an input are kept together.

This PR includes an example of what such an input structure could look like. It should not replace the existing location of ingest pipelines: if a user wants to build a package with just an ingest pipeline and no input, that should still be possible in the future.

The changed structure is described in the ASSETS.md file.
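To illustrate the "all assets related to an input are together" idea, here is a minimal sketch that groups package file paths by the directory they live under. It is not the registry's implementation: the function name `group_assets_by_dataset`, the `container` parameter, and the example paths (including the package-level pipeline path) are assumptions made for illustration. The container directory defaults to `dataset`, the name the PR ended up with; the earlier revision used `input`.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_assets_by_dataset(paths, container="dataset"):
    """Map each dataset name to the asset paths stored below it.

    Assumes a hypothetical layout <package>/<container>/<name>/<asset...>.
    Paths outside the container (e.g. package-level pipelines) are skipped.
    """
    grouped = defaultdict(list)
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if container not in parts:
            continue
        idx = parts.index(container)
        # Need a dataset name plus at least one asset path component.
        if len(parts) < idx + 3:
            continue
        name = parts[idx + 1]
        grouped[name].append("/".join(parts[idx + 2:]))
    return dict(grouped)


# Hypothetical file listing for an nginx package.
example_paths = [
    "nginx-1.2.0/dataset/access/agent/stream/config.yml",
    "nginx-1.2.0/dataset/access/fields.yml",
    "nginx-1.2.0/dataset/stubstatus/fields.yml",
    "nginx-1.2.0/elasticsearch/ingest-pipeline/legacy.json",  # assumed package-level pipeline
]
grouped = group_assets_by_dataset(example_paths)
```

With a layout like this, the question "which pipeline belongs to which input" is answered by the directory a file sits in, and package-level pipelines remain possible because they simply live outside the container directory.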

dev/package-examples/nginx-1.2.0/input/stubstatus/fields.yml

ruflin · 2019-10-16T12:05:13Z

dev/package-examples/nginx-1.2.0/input/access/agent/input/input.yml

@@ -0,0 +1,14 @@
+# This is not an array on purpose to make sure only 1 single input is specified in this file.


In a meeting, @skh brought up a good point: "input" has a double meaning here (it shows up twice in the path). We didn't reach a conclusion on what this should be called, but dataset was mentioned as an option.

@exekias @jsoriano @ph Some thoughts on this?

I have been thinking about this a bit more. A dataset is basically a template for an input with all its assets. All inputs with the dataset access.log look exactly the same in the end. An input can exist multiple times and still be the same dataset. So I think this fits well here. Will rename.
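A rough sketch of the "dataset as template" relationship described above, assuming hypothetical class and field names (`Dataset`, `Input`, `settings`) that do not come from the PR: one dataset definition is shared by any number of configured inputs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """Template shared by every input created from it (illustrative only)."""
    name: str
    assets: tuple = ()


@dataclass
class Input:
    """One configured instance of a dataset, e.g. one set of log paths."""
    dataset: Dataset
    settings: dict = field(default_factory=dict)


access = Dataset(name="access", assets=("fields.yml", "agent/stream/config.yml"))

# Two inputs with different (made-up) settings, both built from the same dataset.
inputs = [
    Input(access, {"paths": ["/var/log/nginx/access.log"]}),
    Input(access, {"paths": ["/srv/site-b/access.log"]}),
]
same_template = all(i.dataset is access for i in inputs)
```

The per-input settings differ, but the assets (fields, stream config, pipeline) come from the single dataset definition.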

Just pushed a commit that renames it. One thing I realised is that the agent no longer has an input, only streams. So the above was renamed to agent/stream/config.yml. @ph Does this sound correct?

ruflin · 2019-10-22T09:36:22Z

ASSETS.md

+type: metric
+
+# Each input can be in its own release status
+release: beta


@hbharding Some inputs can also be in beta. We probably need some design to indicate this on the create data source page where inputs can be enabled / disabled.
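As a small sketch of how a UI could surface this, assume each dataset manifest is available as a dict with an optional `release` key, as in the ASSETS.md snippet. The function name `display_label` and the manifest contents below are hypothetical.

```python
def display_label(name, manifest):
    """Append a beta marker when the dataset manifest declares release: beta."""
    return f"{name} (beta)" if manifest.get("release") == "beta" else name


# Hypothetical manifests keyed by dataset name.
manifests = {
    "access": {},  # no release key set in this example
    "stubstatus": {"type": "metric", "release": "beta"},
}
labels = [display_label(name, manifest) for name, manifest in manifests.items()]
```

How datasets without an explicit `release` value should be treated is left open here; the sketch simply shows them without a marker.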

ruflin · 2019-10-22T11:31:09Z

To have an example of a dataset in the repository for further discussion, I will merge this PR. There are still open questions around naming (dataset vs input) but the basic structure should stay the same. Having it in the repository will allow the EPM team to start implementing the structure and we get feedback if it works as expected. Also all future changes will be documented.

ruflin commented Sep 24, 2019

dev/package-examples/nginx-1.2.0/input/stubstatus/fields.yml (outdated)

This was referenced Sep 25, 2019

Add pipeline reference example to coredns #108

Merged

POC/Discuss: Building integrations package elastic/integrations#3

Closed

ruflin force-pushed the input-directory branch from 55698dc to 236b3d9 Compare September 26, 2019 11:34

ruflin commented Oct 16, 2019

add machine learning files

1ccecb9

ruflin mentioned this pull request Oct 21, 2019

Machine learning assets #121

Closed

ruflin added 4 commits October 21, 2019 10:53

remove machine learning to clean up

5218252

remove more machine learning

c69754e

rename input to dataset

6def7eb

dataset in beta stage

46ffaa8

ruflin commented Oct 22, 2019

ruflin added 2 commits October 22, 2019 11:37

add some more notes

c97b2ba

extend assets, remove notes

f8d069f

ruflin marked this pull request as ready for review October 22, 2019 11:27

ruflin added 2 commits October 22, 2019 13:28

adjust dataset naming

6808789

more details

b88e86e

ruflin merged commit e29b896 into elastic:master Oct 22, 2019

ruflin changed the title ~~Adding concept of input to package~~ Adding concept of dataset+ to package Oct 22, 2019

ruflin changed the title ~~Adding concept of dataset+ to package~~ Adding concept of dataset to package Oct 22, 2019

ruflin merged 10 commits into elastic:master from ruflin:input-directory
