[Elastic Agent] Proposal: Change structure of Elastic Agent config #18758

@ruflin

Details around datasource config

The Agent config today consists of three levels: Datasources -> Inputs -> Streams. A data source groups inputs and allows the output or namespace to be configured one level above the inputs. Initially the agent config contained only a list of inputs, but we introduced the data source level for a few reasons:

  • Align with the UI
  • Make importing of configs possible
  • Better error reporting to the UI
  • Higher-level settings such as output, namespace and constraints can be configured once for multiple inputs

In the following, I want to go through each point to see whether it still matters.

Align with the UI

Working with the same concept whether a config is created manually or through the UI is powerful, because there are not two different concepts to learn. At the same time, it seems OK to have a more convenient way to configure groupings in the UI than on the agent side.

This is especially true because the grouping of inputs is tied to packages, which are not available on the agent side. A user who configures nginx manually has to specify a logs input and a metrics input anyway, so the datasource grouping does not help much.

Importing configs

Initially the idea was that data sources would make it possible to import configs and map them to the UI. This works if the imported config matches a package 1-1. But if users specify their own inputs and group them in some other way, it no longer works. On import, our options would be either to put all inputs into 1 data source or to create 1 data source per input. 1 data source per input is the more likely choice, as otherwise the UI gets more complex (many inputs in 1 data source). With this, I think the import argument is not valid anymore.

Better error reporting

This still holds, but I would argue we can solve it differently, through metadata on each input. Each input should support additional metadata where we can add names and ids for better error reporting. When an error is reported to Fleet, we then know which data source, input and stream it belongs to. Even with datasources we would need this, as reporting an error only at the data source level is not enough. Having this generic meta concept also allows better error reporting in the standalone case.
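
As a rough illustration of the metadata idea, the sketch below attaches an input's meta block to an error event before it is reported. The helper name `build_error_event` and the event field names are hypothetical and chosen only for this example; they are not an existing Agent or Fleet API.

```python
from typing import Any, Dict, Optional


def build_error_event(input_cfg: Dict[str, Any], message: str,
                      stream_index: Optional[int] = None) -> Dict[str, Any]:
    """Build an error event that carries the input's opaque meta block.

    Hypothetical helper: the event layout is an assumption for illustration.
    """
    event: Dict[str, Any] = {
        "message": message,
        "input_type": input_cfg.get("type"),
        # meta is copied verbatim; its keys are not interpreted here.
        "meta": dict(input_cfg.get("meta", {})),
    }
    if stream_index is not None:
        stream = input_cfg.get("streams", [])[stream_index]
        event["stream"] = {"index": stream_index, "dataset": stream.get("dataset")}
    return event


example_input = {
    "type": "system/metrics",
    "meta": {"package.name": "bar", "settings.id": "foo"},
    "streams": [{"metricset": "cpu", "dataset": "system.cpu"}],
}
event = build_error_event(example_input, "failed to collect cpu metrics", stream_index=0)
```

Because the meta block travels with the event unchanged, the receiving side can map the error back to whatever grouping it used, without the agent knowing about that grouping.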

Higher Level Configs

It is convenient to configure namespace and output on the data source level. We should still allow this on the UI side, but it is not required at the agent config level. My assumption is that most users getting started use the default namespace and the default output, which means nothing has to be configured.

Users with more complex configs are likely to use some automation to build the configs in which case specifying the output and namespace more often should not be a problem.

Summary

I think on the agent side, the arguments around having the data source object do not hold up anymore.

Proposed new config

Based on the above, I suggest we remove datasources from the agent config but add namespace, output and meta information support on the input level:

inputs:
 - type: system/metrics
   namespace: default
   use_output: default
   meta:
     package.name: bar
     settings.id: foo
     hello: world
   streams:
     - metricset: cpu
       dataset: system.cpu

The part under meta is not understood by the agent but logged/shipped in case of an error.

This new config, combined with the proposed changes to the fields used for the indexing strategy (elastic/package-registry#482), also solves the problem that there was no good way to set a different dataset.type for an input. For example, the log input could also generate metrics. The config below will send data to metrics-foo-prod:

inputs:
 - type: logs
   dataset.type: metrics
   dataset.namespace: prod

   streams:
     - paths: /var/log/foo.log
       dataset.name: foo
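
To make the naming in this example concrete, here is a minimal sketch of how the target could be derived as type-name-namespace. The function `resolve_index`, the default values, and the rule that a stream-level value overrides an input-level one are all assumptions for illustration, not part of the proposal.

```python
from typing import Any, Dict

# Assumed fallback values; the proposal does not define them.
DEFAULTS = {
    "dataset.type": "logs",
    "dataset.name": "generic",
    "dataset.namespace": "default",
}


def resolve_index(input_cfg: Dict[str, Any], stream_cfg: Dict[str, Any]) -> str:
    """Return '<dataset.type>-<dataset.name>-<dataset.namespace>'."""
    def pick(key: str) -> str:
        # Assumed precedence: stream value, then input value, then default.
        return stream_cfg.get(key, input_cfg.get(key, DEFAULTS[key]))

    return "-".join(
        pick(key) for key in ("dataset.type", "dataset.name", "dataset.namespace")
    )


input_cfg = {"type": "logs", "dataset.type": "metrics", "dataset.namespace": "prod"}
stream_cfg = {"paths": "/var/log/foo.log", "dataset.name": "foo"}
index = resolve_index(input_cfg, stream_cfg)  # "metrics-foo-prod"
```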

Removing the datasource part also makes getting started easier for manual configuration. The simplest config now looks as follows:

inputs:
 - type: logs
   streams:
     - paths: /var/log/foo.log

It is expected that the Ingest Manager in Kibana will still offer a grouping of inputs but will flatten it before the config is shipped to the Agent.
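
The flattening step could look roughly like the sketch below. It is written in Python for brevity; the real Ingest Manager code would live in Kibana. The function name `flatten_datasources`, the datasource keys `id`, `name` and `inputs`, and the `datasource.*` meta keys are hypothetical and only serve the example.

```python
from typing import Any, Dict, List


def flatten_datasources(datasources: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Turn grouped datasources into a flat agent config with only inputs."""
    inputs: List[Dict[str, Any]] = []
    for ds in datasources:
        for inp in ds.get("inputs", []):
            flat = dict(inp)  # shallow copy so the grouped config is not mutated
            # Push group-level settings down unless the input sets its own.
            for key in ("namespace", "use_output"):
                if key not in flat and key in ds:
                    flat[key] = ds[key]
            # Record the grouping in meta so errors can be mapped back to it.
            meta = dict(flat.get("meta", {}))
            if "id" in ds:
                meta.setdefault("datasource.id", ds["id"])
            if "name" in ds:
                meta.setdefault("datasource.name", ds["name"])
            if meta:
                flat["meta"] = meta
            inputs.append(flat)
    return {"inputs": inputs}


grouped = [{
    "id": "ds-1",
    "name": "nginx",
    "namespace": "prod",
    "use_output": "default",
    "inputs": [
        {"type": "logs", "streams": [{"paths": "/var/log/nginx/access.log"}]},
        {"type": "nginx/metrics", "namespace": "staging",
         "streams": [{"metricset": "stubstatus"}]},
    ],
}]
flat = flatten_datasources(grouped)
```

In this sketch an input that sets its own namespace keeps it, while the other input inherits the group's value, so the grouping stays a UI convenience and the agent only ever sees inputs.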
