Support for ILM in Beats #7935

@ruflin

With elastic/elasticsearch#29823 (comment), index lifecycle management (ILM) is being added to Elasticsearch. ILM is especially useful for Beats because it automates rollover. This ensures that all indices have a similar size, unlike daily indices, whose size varies with the number of Beats and the number of events sent.

This is a description of the possible implementation in Beats. It still might change during implementation.

Technical Implementation

If ILM is enabled in Beats, the basic implementation will look as follows:

The Beat template will contain the ILM policy which should be used:

PUT _template/filebeat-6.5.0
{
  "index_patterns": ["filebeat-6.5.0-*"],
  "settings": {
    "index.lifecycle.name": "filebeat-policy",
    "index.lifecycle.rollover_alias": "filebeat-6.5.0"
  },
  "mappings": {
    "_doc": {
      ...
    }
  }
}

This can already be configured today through the following settings in Beats:

setup.template.settings.index.lifecycle.name: "filebeat-policy"
setup.template.settings.index.lifecycle.rollover_alias: "filebeat-6.5.0"

With ILM enabled these settings will be written automatically.

As soon as the template is loaded, the Beat will check for the existence of the write alias:

HEAD filebeat-6.5.0

If the write alias does not exist yet, the first index is created with the alias attached and marked as the write index:

PUT filebeat-6.5.0-000001
{
  "aliases": {
    "filebeat-6.5.0": {
      "is_write_index": true
    }
  }
}

From here on, all data is sent to the filebeat-6.5.0 alias and everything works as usual.
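The startup sequence described above can be sketched as a small Python helper that only builds the requests, without sending them. This is an illustration, not the Beats implementation: the function name bootstrap_requests and its parameters are hypothetical, the mappings are omitted, and the result of the HEAD check is passed in as alias_exists because the sketch does not talk to a cluster.

```python
import json


def bootstrap_requests(alias, policy, alias_exists, pattern="000001"):
    """Return the (method, path, body) requests for the ILM bootstrap.

    Hypothetical helper: `alias_exists` stands in for the outcome of the
    HEAD check, which a real client would learn from the HTTP status code.
    """
    # Template carrying the ILM settings (mappings left out for brevity).
    template = {
        "index_patterns": [f"{alias}-*"],
        "settings": {
            "index.lifecycle.name": policy,
            "index.lifecycle.rollover_alias": alias,
        },
    }
    requests = [
        ("PUT", f"_template/{alias}", template),
        ("HEAD", alias, None),
    ]
    if not alias_exists:
        # Create the first index and attach the alias as its write index.
        first_index = {"aliases": {alias: {"is_write_index": True}}}
        requests.append(("PUT", f"{alias}-{pattern}", first_index))
    return requests


if __name__ == "__main__":
    steps = bootstrap_requests("filebeat-6.5.0", "filebeat-policy", alias_exists=False)
    for method, path, body in steps:
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

When the alias already exists, only the template and the HEAD request are produced, so repeated startups do not try to recreate the first index.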

Configuration

The configuration of ILM belongs to the Elasticsearch output and could look as follows:

elasticsearch.output.ilm:
  enable: true
  write_alias: filebeat
  index: filebeat # Do we even want the version to be configurable? -> ES only and a lot of people break things with it
  pattern: 000001
  policy: filebeat-policy # What if the policy is also set in the settings?

What is special about the above is that the Beat version is left out. A common issue in Beats is that the version number is sometimes removed from the index name, which can cause problems on migration. To prevent this for ILM, the version is automatically added to the write alias, the index names, and the template. We could add an additional config option such as automatic_version: false to disable this behavior if we want.
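As a rough illustration of the automatic versioning, the following Python sketch derives the effective names from the proposed config. The function resolve_ilm_names and the shape of its return value are assumptions made for this example; only the config keys (write_alias, index, pattern) and the automatic_version option come from the proposal above.

```python
def resolve_ilm_names(ilm_config, beat_version, automatic_version=True):
    """Derive alias, template, and index names from the proposed ILM config.

    Hypothetical helper: appends the Beat version unless automatic_version
    is disabled, mirroring the behavior proposed in this issue.
    """
    suffix = f"-{beat_version}" if automatic_version else ""
    alias = f"{ilm_config['write_alias']}{suffix}"
    index_base = f"{ilm_config['index']}{suffix}"
    return {
        "write_alias": alias,
        "template": index_base,
        "index_pattern": f"{index_base}-*",
        "first_index": f"{index_base}-{ilm_config['pattern']}",
    }


if __name__ == "__main__":
    config = {"write_alias": "filebeat", "index": "filebeat", "pattern": "000001"}
    for key, value in resolve_ilm_names(config, "6.5.0").items():
        print(f"{key}: {value}")
```

With version 6.5.0 this yields the filebeat-6.5.0 alias and the filebeat-6.5.0-000001 index used in the requests above.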

Question:

  • Should the ILM config be its own top-level entry instead of part of the Elasticsearch output?

Example Policy

An example policy could look as follows:

PUT _ilm/filebeat-policy
{
  "policy": {
    "type": "timeseries",
    "phases": {
      "hot": {
        "after": "0s",
        "actions": {
          "rollover": {
            "max_docs": "20"
          }
        }
      }
    }
  }
}

This will roll over to a new index once the current write index holds 20 documents. In a real-world setup, larger numbers and other rollover criteria can be used.
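On rollover, the write alias stays the same and is moved to a newly created index whose numeric suffix is incremented and zero-padded to six digits. A minimal Python sketch of that naming rule follows; the function name next_rollover_index is made up for this example, and the sketch covers only the naming, not the rollover itself.

```python
import re


def next_rollover_index(index_name):
    """Return the index name that follows `index_name` after a rollover.

    Illustrative only: covers names ending in "-<number>", where the
    counter is incremented and zero-padded to six digits.
    """
    match = re.fullmatch(r"(.+)-(\d+)", index_name)
    if match is None:
        raise ValueError(f"index name {index_name!r} does not end in -<number>")
    prefix, counter = match.groups()
    return f"{prefix}-{int(counter) + 1:06d}"


if __name__ == "__main__":
    print(next_rollover_index("filebeat-6.5.0-000001"))
```

This is why the first index is created with the 000001 suffix: every later index name can be derived from it.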

The policy is expected to be loaded with filebeat setup ilm-policy. A policy can be loaded at any time and is not required on template generation or data ingestion.

Questions

  • Do we provide a default policy?
  • If yes, what is our default policy? Do we have different phases? Is it different per Beat?
  • What if Beats is started against an ES instance without ILM while ILM is configured? It should not start and should error out instead.

Notes for testing

ILM policies are triggered every 10m by default. This can be changed as a cluster setting:

PUT /_cluster/settings
{
    "persistent" : {
        "indices.lifecycle.poll_interval": "5s"
    }
}

The above means policies are triggered every 5s.
