TSDS data streams can route writes to downsampled indices after rollover #99696

@andreidan

Elasticsearch Version

8.10, 8.11

Installed Plugins

No response

Java Version

bundled

OS Version

Darwin

Problem Description

A TSDS backing index is configured with a START_TIME and an END_TIME that bound the timestamps of the data it will host. The END_TIME in particular is derived from the index.look_ahead_time setting.

All writes against a TSDS will be routed based on the document @timestamp to the correct backing index according to each index's START/END time configuration.

We can easily simulate a situation where we roll over the data stream and a document with the current (`now`) timestamp is no longer routed to the write index but to the previous-generation index (the second most recent generation), because that index's END_TIME has not lapsed yet. This would normally not be a problem, but if that index is read-only the write will fail.
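The routing behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not Elasticsearch code: the `BackingIndex` class, the `route` function, the index names, and the concrete START_TIME/END_TIME values are all hypothetical, and the sketch assumes START_TIME is inclusive and END_TIME is exclusive. The two timestamps are the ones used in the reproduction steps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackingIndex:
    """Hypothetical model of a TSDS backing index (not an Elasticsearch type)."""
    name: str
    start_time: int          # epoch millis, assumed inclusive
    end_time: int            # epoch millis, assumed exclusive
    read_only: bool = False  # e.g. True once the index has been downsampled


def route(indices: List[BackingIndex], timestamp: int) -> Optional[BackingIndex]:
    """Return the index whose [start_time, end_time) range contains the timestamp."""
    for index in indices:
        if index.start_time <= timestamp < index.end_time:
            return index
    return None


LOOK_AHEAD_MS = 10 * 60 * 1000   # look_ahead_time of 10m, as in the reproduction

first_write = 1695131067275      # timestamp of the document indexed before rollover
second_write = 1695131118582     # timestamp of the document indexed after rollover

# Illustrative bounds only: generation 1 still accepts timestamps up to
# first_write + look_ahead_time, and generation 2 starts where it ends.
gen1 = BackingIndex("gen-000001", first_write - LOOK_AHEAD_MS,
                    first_write + LOOK_AHEAD_MS)
gen2 = BackingIndex("gen-000002", gen1.end_time, gen1.end_time + LOOK_AHEAD_MS)

# Rollover followed by immediate downsampling leaves generation 1 read-only.
gen1.read_only = True

target = route([gen1, gen2], second_write)
print(target.name, "read_only =", target.read_only)
# prints: gen-000001 read_only = True
```

In this model the second document lands on the read-only previous generation, which corresponds to the rejected write shown in the result further down.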

The index could be read-only because it was downsampled. For downsampling in particular, we should delay downsampling the backing index until its configured END_TIME has lapsed. Note that a similar situation could be encountered if, say, a searchable_snapshot action is used instead of downsampling; however, we should probably treat that separately.

This problem is present both in ILM and data stream lifecycle.

In ILM, the only workaround currently is to increase the min_age of the phase where downsampling is configured, so that the look_ahead_time of the backing index has lapsed by the time the index transitions into that phase.
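As a rough aid for choosing such a min_age, the comparison can be sketched as below. The helper names `to_millis` and `min_age_covers_look_ahead` are hypothetical, and the sketch rests on two unverified assumptions: that the downsampling phase's min_age is counted from the rollover, and that the rolled-over index's END_TIME is roughly the rollover time plus look_ahead_time. The optional `margin` parameter is there because the real END_TIME may sit somewhat later than that estimate.

```python
import re

# Millisecond multipliers for the time-unit suffixes handled by this sketch.
_UNITS_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def to_millis(value: str) -> int:
    """Parse a simple time value such as '10m' or '0ms' into milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNITS_MS[match.group(2)]


def min_age_covers_look_ahead(min_age: str, look_ahead_time: str,
                              margin: str = "0ms") -> bool:
    """True if min_age is at least look_ahead_time plus an optional safety margin."""
    return to_millis(min_age) >= to_millis(look_ahead_time) + to_millis(margin)


# Values from the reproduction: downsampling runs with min_age 0ms while
# look_ahead_time is 10m, so the index can be downsampled too early.
print(min_age_covers_look_ahead("0ms", "10m"))
# A min_age of 15m would cover a 10m look_ahead_time with a 5m margin.
print(min_age_covers_look_ahead("15m", "10m", margin="5m"))
```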

Steps to Reproduce

PUT _cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "5s"
  }
}


PUT _component_template/test-mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "format": "epoch_millis",
          "type": "date"
        },
        "metricKey": {
          "time_series_dimension": true,
          "type": "keyword"
        },
        "value": {
          "time_series_metric": "gauge",
          "type": "float"
        }
      }
    }
  }
}

PUT _component_template/test-settings
{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "test-lifecycle"
        },
        "look_ahead_time": "10m"
      }
    }
  }
}

PUT _index_template/test1-template
{
  "priority": 500,
  "template": {
    "settings": {
      "index": {
        "mode": "time_series"
      }
    }
  },
  "index_patterns": [
    "test1*"
  ],
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "composed_of": [
    "test-settings",
    "test-mappings"
  ]
}

PUT _ilm/policy/test-lifecycle
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "downsample": {
            "fixed_interval": "5m"
          },
          "rollover": {
            "max_primary_shard_size": "25gb",
            "max_age": "1h",
            "max_docs": 2
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}


== Get `now` date
❯ date +"%s%3N"
1695131067275

POST _bulk
{"create": {"_index": "test1ds2"}}
{"metricKey": "thekey1", "@timestamp": "1695131067275", "value": 1, "instance": "server:9100", "job": "node", "env": "hml", "__name__": "up"}

== Roll over the data stream
POST test1ds2/_rollover

== Get `now` date
❯ date +"%s%3N"
1695131118582

POST _bulk
{"create": {"_index": "test1ds2"}}
{"metricKey": "thekey2", "@timestamp": "1695131118582", "value": 1, "instance": "server:9100", "job": "node", "env": "hml", "__name__": "up"}

The result is

{
  "errors": true,
  "took": 0,
  "items": [
    {
      "create": {
        "_index": "test1ds2",
        "_id": null,
        "status": 403,
        "error": {
          "type": "cluster_block_exception",
          "reason": "index [downsample-5m-.ds-test1ds2-2023.09.19-000001] blocked by: [FORBIDDEN/8/index write (api)];"
        }
      }
    }
  ]
}

Logs (if relevant)

No response
