Document how to reindex a tsdb data stream #99176

@martijnvg

Description

Reindexing a tsdb data stream is more challenging than reindexing a regular data stream. This is because when a new data stream is created, the start and end time settings of its first backing index are unconditionally set to $now-2h and $now+2h. The time ranges of the existing tsdb data stream's backing indices may overlap with this window only partially, or not at all. Reindexing directly from the old tsdb data stream into the new one therefore only works for documents whose timestamp falls between $now-2h and $now+2h.
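To make the time window problem concrete, here is a minimal sketch in Python. The function name `accepted_by_new_stream` is hypothetical, and the 2h window is taken from the description above. It also assumes the upper bound is exclusive. It only models the check, it does not call Elasticsearch.

```python
from datetime import datetime, timedelta, timezone


def accepted_by_new_stream(doc_ts, created_at, window=timedelta(hours=2)):
    """Return True if doc_ts falls in the first backing index's time range.

    Models the behaviour described above: a freshly created tsdb data
    stream gets start_time = created_at - 2h and end_time = created_at + 2h.
    Assumption: the end of the range is exclusive.
    """
    start_time = created_at - window
    end_time = created_at + window
    return start_time <= doc_ts < end_time


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent_doc = created - timedelta(hours=1)  # inside the window
old_doc = created - timedelta(days=3)      # typical for reindexed data

print(accepted_by_new_stream(recent_doc, created))
print(accepted_by_new_stream(old_doc, created))
```

Documents such as `old_doc`, which make up most of an existing data stream, fall outside the window. That is why the steps below pin the start and end time explicitly.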

Given that reindexing a tsdb data stream is possible, we should document how to do it. Currently there is no documentation on reindexing a tsdb data stream.

The process looks something like this:

  • Create a dedicated index template that matches only the new data stream that will contain the reindexed data. Otherwise other data streams may be affected. This index template should contain the new mappings and index settings that should be applied.
  • Update the template to set explicit index.time_series.start_time and index.time_series.end_time index settings. The start and end time should be based on the lowest and highest @timestamp values in the data stream to be reindexed. This way the first backing index of the new data stream is fixed to a range that covers all data in the data stream being reindexed.
  • Update the template to set the index.number_of_shards index setting to the sum of the primary shards of all backing indices of the data stream to be reindexed.
  • Update the template to set index.number_of_replicas to zero and unset the index.lifecycle.name index setting.
  • Start the reindex operation.
  • After the reindex has completed, remove the index.time_series.start_time and index.time_series.end_time index settings from the template and set index.number_of_replicas, index.number_of_shards and index.lifecycle.name back to their original values.
  • Invoke the rollover API without any conditions set. The data stream should now be ready to accept recent data.
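The steps above can be sketched roughly as follows. This is an illustration only, not a tested procedure. The helper names `temporary_settings` and `reindex_plan` are hypothetical. The template body is deliberately incomplete (a real tsdb template also needs the new mappings and time series settings). Adding one millisecond to the highest @timestamp assumes that index.time_series.end_time is exclusive. The code only builds the request bodies and does not contact a cluster.

```python
from datetime import datetime, timedelta, timezone


def _iso_millis(ts):
    """Format a timezone-aware datetime as UTC ISO 8601 with milliseconds."""
    utc = ts.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def temporary_settings(primary_shards_per_backing_index, min_ts, max_ts):
    """Index settings used in the template only while the reindex runs."""
    return {
        # Pin the first backing index to cover all existing data.
        "index.time_series.start_time": _iso_millis(min_ts),
        # Assumption: end_time is exclusive, so go just past the last document.
        "index.time_series.end_time": _iso_millis(max_ts + timedelta(milliseconds=1)),
        # Sum of the primary shards of all old backing indices.
        "index.number_of_shards": sum(primary_shards_per_backing_index),
        "index.number_of_replicas": 0,
        # index.lifecycle.name is intentionally left unset.
    }


def reindex_plan(source, dest, template_name, temp_settings, original_settings):
    """Return the ordered (method, path, body) requests for the procedure."""
    def template(settings):
        # Incomplete on purpose: mappings and other settings are omitted.
        return {
            "index_patterns": [dest],
            "data_stream": {},
            "template": {"settings": settings},
        }

    return [
        ("PUT", f"/_index_template/{template_name}", template(temp_settings)),
        ("POST", "/_reindex", {
            "source": {"index": source},
            # Writing into a data stream requires op_type create.
            "dest": {"index": dest, "op_type": "create"},
        }),
        # Restore the original settings once the reindex has completed.
        ("PUT", f"/_index_template/{template_name}", template(original_settings)),
        # Rollover without conditions so new backing indices accept recent data.
        ("POST", f"/{dest}/_rollover", None),
    ]
```

The lowest and highest @timestamp values would have to be looked up first, for example with min and max aggregations on the source data stream. The shard counts would come from the settings of its backing indices.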

Metadata

Labels

  • :StorageEngine/TSDB (You know, for Metrics)
  • >docs (General docs changes)
  • Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo))
  • Team:Docs (Meta label for docs team)
