With elastic/elasticsearch#29823 (comment), index lifecycle management (ILM) is being added to Elasticsearch. ILM is especially useful for Beats because it automates rollover. This ensures that all indices have a similar size, instead of daily indices whose size varies with the number of Beats and the number of events sent.
This describes a possible implementation in Beats. It might still change during implementation.
Technical Implementation
If ILM is enabled in Beats, the basic implementation will look as follows:
The Beat template will contain the ILM policy to use:
PUT _template/filebeat-6.5.0
{
  "index_patterns": ["filebeat-6.5.0-*"],
  "settings": {
    "index.lifecycle.name": "filebeat-policy",
    "index.lifecycle.rollover_alias": "filebeat-6.5.0"
  },
  "mappings": {
    "_doc": {
      ...
    }
  }
}
This can already be configured today through the following settings in Beats:
setup.template.settings.index.lifecycle.name: "filebeat-policy"
setup.template.settings.index.lifecycle.rollover_alias: "filebeat-6.5.0"
With ILM enabled these settings will be written automatically.
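As a rough illustration of what "written automatically" could mean, the sketch below derives the template settings from the Beat name, the Beat version, and the policy name. The function name `build_ilm_template` and its parameters are hypothetical and not part of Beats; mappings are left out.

```python
def build_ilm_template(beat: str, version: str, policy: str) -> dict:
    """Return a template body with the ILM settings filled in.

    Hypothetical helper for illustration only; mappings are omitted.
    """
    # The rollover alias is the Beat name plus the Beat version.
    alias = f"{beat}-{version}"
    return {
        # Match every index created by rollover, e.g. filebeat-6.5.0-000001.
        "index_patterns": [f"{alias}-*"],
        "settings": {
            "index.lifecycle.name": policy,
            "index.lifecycle.rollover_alias": alias,
        },
    }


template = build_ilm_template("filebeat", "6.5.0", "filebeat-policy")
```

With these inputs the result carries the same `index_patterns` and `settings` values as the template request shown earlier.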
As soon as the template is loaded, the Beat will check for the existence of the write alias.
If the write alias does not exist yet, it will be created:
PUT filebeat-6.5.0-000001
{
  "aliases": {
    "filebeat-6.5.0": {
      "is_write_index": true
    }
  }
}
From here on, all data is sent to the filebeat-6.5.0 alias and things work as usual.
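The check-then-create step could be sketched as follows. To keep the example self-contained, the two Elasticsearch calls are passed in as plain callables; `alias_exists` and `create_index` are hypothetical names, and in a real client they would wrap the alias lookup and the index creation request.

```python
from typing import Callable


def ensure_write_alias(
    alias: str,
    alias_exists: Callable[[str], bool],
    create_index: Callable[[str, dict], None],
    pattern: str = "000001",
) -> bool:
    """Create the first index with a write alias unless the alias exists.

    Returns True if the index was created, False if nothing had to be done.
    Both callables are assumptions standing in for real Elasticsearch requests.
    """
    if alias_exists(alias):
        return False
    # Same body as the request above: the new index becomes the write index.
    body = {"aliases": {alias: {"is_write_index": True}}}
    create_index(f"{alias}-{pattern}", body)
    return True


# In-memory stand-ins so the sketch runs without a cluster.
created = {}
made = ensure_write_alias(
    "filebeat-6.5.0",
    alias_exists=lambda name: False,
    create_index=lambda name, body: created.update({name: body}),
)
```

Injecting the two calls keeps the decision logic testable on its own; whether Beats would structure it this way is open.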
Configuration
The configuration of ILM belongs to the Elasticsearch output and could look as follows:
elasticsearch.output.ilm:
  enable: true
  write_alias: filebeat
  index: filebeat # Do we even want the version to be configurable? -> ES only and a lot of people break things with it
  pattern: 000001
  policy: filebeat-policy # What if the policy is also set in the settings?
The special part in the above is that the Beat version is left out. A common issue in Beats is that the version number is sometimes removed from the index name, which can cause problems on migration. To prevent this for ILM, the version is automatically added to the write alias, the index names and the template. We could add an additional config option like automatic_version: false to disable this feature if we want.
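One way to picture the automatic versioning is a small function that turns the configured values into the final names. This is only a sketch: `resolve_ilm_names` and the keys of the returned dictionary are hypothetical, and `automatic_version` is the option floated above, not an existing setting.

```python
def resolve_ilm_names(cfg: dict, beat_version: str) -> dict:
    """Derive alias, first index and index pattern from the ILM config.

    Hypothetical helper; the version is appended unless
    automatic_version is explicitly set to False.
    """
    suffix = f"-{beat_version}" if cfg.get("automatic_version", True) else ""
    write_alias = f"{cfg['write_alias']}{suffix}"
    index = f"{cfg['index']}{suffix}"
    # The pattern is handled as a string so the zero padding is preserved.
    pattern = str(cfg.get("pattern", "000001"))
    return {
        "write_alias": write_alias,
        "first_index": f"{index}-{pattern}",
        "index_pattern": f"{index}-*",
    }


config = {"write_alias": "filebeat", "index": "filebeat", "pattern": "000001"}
names = resolve_ilm_names(config, "6.5.0")
unversioned = resolve_ilm_names({**config, "automatic_version": False}, "6.5.0")
```

Here `names` holds the versioned values used throughout this description (`filebeat-6.5.0`, `filebeat-6.5.0-000001`), while `unversioned` shows what opting out would produce.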
Question:
- Should the ILM config be its own top-level entry instead of part of the Elasticsearch output?
Example Policy
An example policy could look as follows:
PUT _ilm/filebeat-policy
{
  "policy": {
    "type": "timeseries",
    "phases": {
      "hot": {
        "after": "0s",
        "actions": {
          "rollover": {
            "max_docs": "20"
          }
        }
      }
    }
  }
}
This will roll over to a new index once the current write index holds 20 documents; the write alias then points to the new index. In a real-world setup, larger numbers and other rollover criteria can be used.
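To make the effect of a rollover concrete, the sketch below computes the name of the index that follows a given one by incrementing its numeric suffix. `next_rollover_index` is a hypothetical helper; it keeps the width of the existing counter, which is a simplification of what Elasticsearch does when it names the new index.

```python
import re


def next_rollover_index(index: str) -> str:
    """Return the index name that follows `index` after one rollover.

    Hypothetical helper: increments the trailing counter and keeps its width.
    """
    match = re.fullmatch(r"(.*-)(\d+)", index)
    if match is None:
        raise ValueError(f"index name has no numeric suffix: {index!r}")
    prefix, counter = match.groups()
    # Zero-pad to the same number of digits as the original counter.
    return f"{prefix}{int(counter) + 1:0{len(counter)}d}"


second = next_rollover_index("filebeat-6.5.0-000001")
third = next_rollover_index(second)
```

Starting from `filebeat-6.5.0-000001`, two rollovers give `filebeat-6.5.0-000002` and then `filebeat-6.5.0-000003`, all reachable through the same `filebeat-6.5.0` alias.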
The policy is expected to be loaded with filebeat setup ilm-policy. A policy can be loaded at any time and is not required on template generation or data ingestion.
Questions
- Do we provide a default policy?
- If yes, what is our default policy? Do we have different phases? Is it different per Beat?
- What if Beats is started against an ES instance without ILM while ILM is configured? It should not start and should error out.
Notes for testing
ILM policies are triggered every 10m by default. This can be changed as a cluster setting:
PUT /_cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "5s"
  }
}
The above means policies are triggered every 5s.