Allow customizing managed data streams at different levels of granularity #97664

@felixbarny

What are we trying to achieve?

On several occasions, we've discussed adding ways for users to customize data streams that are set up via Fleet and via the built-in index templates, without having to copy the index template and take on the burden of maintaining the whole copy going forward. Instead, we'd want to offer dedicated extension points so that users can configure different settings/mappings/lifecycles at different levels of the data stream naming scheme:

  • All data streams (*-*-*)
  • All data streams with a certain type ({type}-*-*)
  • All data streams with a certain type and dataset ({type}-{dataset}-*)
  • All data streams with a certain type, dataset, and namespace ({type}-{dataset}-{namespace})
  • All data streams with a certain type and namespace ({type}-*-{namespace})
  • All data streams with a certain namespace (*-*-{namespace})
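To make the six levels concrete, here is a minimal Python sketch that lists the patterns a given data stream name falls under. The function name `matching_patterns` is hypothetical, and the sketch assumes that neither the dataset nor the namespace contains a hyphen, so the name splits into exactly three parts.

```python
def matching_patterns(data_stream: str) -> list[str]:
    """Return the granularity levels (as wildcard patterns) that apply to a data stream.

    Assumes the {type}-{dataset}-{namespace} naming scheme, with no hyphens
    inside the dataset or the namespace.
    """
    parts = data_stream.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a {{type}}-{{dataset}}-{{namespace}} name: {data_stream!r}")
    type_, dataset, namespace = parts
    return [
        "*-*-*",                              # all data streams
        f"{type_}-*-*",                       # certain type
        f"{type_}-{dataset}-*",               # certain type and dataset
        f"{type_}-{dataset}-{namespace}",     # certain type, dataset, and namespace
        f"{type_}-*-{namespace}",             # certain type and namespace
        f"*-*-{namespace}",                   # certain namespace
    ]
```

For example, `matching_patterns("logs-foo-bar")` lists `*-*-*`, `logs-*-*`, `logs-foo-*`, `logs-foo-bar`, `logs-*-bar`, and `*-*-bar`.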

Some concrete use cases:

  • A user wants to send the observability signals of their tier 1 applications to a separate namespace to keep the data in the hot tier for longer and to have a longer retention
  • Setting the default retention for logs to 30 days and for metrics to 90 days
  • Enabling synthetic _source for the logs-foo-* data streams, which use the logs-*-* index template, without having to create a copy of the index template with a logs-foo-* index pattern.

Why this should be in Elasticsearch

The previous discussions (elastic/kibana#149484, elastic/kibana#121118) have mostly been focussed on Fleet. But I have a strong preference for putting this into Elasticsearch rather than Fleet, so that data streams that are not managed by Fleet (such as the data streams for the built-in index templates logs-*-* and metrics-*-*) can benefit from it as well.

Why is this important

This becomes more important in the context of the reroute processor, as documents can be routed to data streams that aren't managed by or known to Fleet. Also, we're considering moving the APM index templates out of Fleet and into Elasticsearch (see #97546).

A potential solution

I've proposed one potential solution to this here: elastic/kibana#121118 (comment)

Essentially, we'd add a couple of component templates into the index templates that are managed by Fleet and Elasticsearch. For example, the composed_of section of the logs-*-* index template that is built into Elasticsearch would be extended by component templates that have a placeholder in them (exact naming tbd).

  composed_of:
   - logs@custom
   - logs-*-{{data_stream.namespace}}@custom
   - logs-{{data_stream.dataset}}@custom
   - logs-{{data_stream.dataset}}-{{data_stream.namespace}}@custom

Valid placeholders are any constant_keyword fields.

If a user wants to customize a concrete data stream logs-foo-bar, they can create the following component templates:

  • logs@custom
  • logs-*-bar@custom
  • logs-foo@custom
  • logs-foo-bar@custom
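The placeholder expansion above can be sketched in a few lines of Python. This only illustrates the proposed naming, not how Elasticsearch would implement it: the helper `resolve_component_templates` and the `{{field}}` parsing regex are hypothetical, and the field values are assumed to come from the data stream's constant_keyword fields.

```python
import re

# The composed_of entries from the proposal (exact naming tbd).
COMPOSED_OF = [
    "logs@custom",
    "logs-*-{{data_stream.namespace}}@custom",
    "logs-{{data_stream.dataset}}@custom",
    "logs-{{data_stream.dataset}}-{{data_stream.namespace}}@custom",
]

# Matches a {{field.name}} placeholder and captures the field name.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve_component_templates(composed_of, fields):
    """Substitute each {{field}} placeholder with the data stream's value for that field.

    Raises KeyError if a placeholder refers to a field missing from `fields`.
    """
    def substitute(match):
        return fields[match.group(1)]

    return [PLACEHOLDER.sub(substitute, name) for name in composed_of]


# Field values for the concrete data stream logs-foo-bar.
resolved = resolve_component_templates(
    COMPOSED_OF,
    {"data_stream.dataset": "foo", "data_stream.namespace": "bar"},
)
```

Here `resolved` contains the four component template names listed above for logs-foo-bar, in the same order as the composed_of section.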
