Skip to content

Add frozen_after field to data stream lifecycle#139042

Merged
nielsbauman merged 9 commits intoelastic:mainfrom
nielsbauman:add-frozen-after-field
Dec 19, 2025
Merged

Add frozen_after field to data stream lifecycle#139042
nielsbauman merged 9 commits intoelastic:mainfrom
nielsbauman:add-frozen-after-field

Conversation

@nielsbauman
Copy link
Copy Markdown
Contributor

@nielsbauman nielsbauman commented Dec 4, 2025

We add the new field to DataStreamLifecycle and relevant classes. We support the setting in component templates, composable templates, and the data stream lifecycle API.

Note that all of this is behind a newly added feature flag dlm_searchable_snapshots.

Resolves elastic/elasticsearch-team#2102
Resolves elastic/elasticsearch-team#2104
Resolves elastic/elasticsearch-team#2105
Resolves elastic/elasticsearch-team#2106
Resolves elastic/elasticsearch-team#2107
Resolves elastic/elasticsearch-team#2108

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for a frozen_after field to the data stream lifecycle feature, allowing users to specify when data should be frozen. The implementation includes support across component templates, composable templates, and the data stream lifecycle API, all protected by a new feature flag dlm_searchable_snapshots.

  • Adds frozen_after field to DataStreamLifecycle class and related Template/Builder classes
  • Implements serialization/deserialization with feature flag protection
  • Adds REST API capability support for the new field

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamLifecycle.java Core implementation of frozen_after field with parsing, serialization, and builder support
server/src/main/java/org/elasticsearch/rest/action/admin/indices/RestPutComposableIndexTemplateAction.java Adds capability declaration for frozen_after in composable index templates
server/src/main/java/org/elasticsearch/rest/action/admin/indices/RestPutComponentTemplateAction.java Adds capability declaration for frozen_after in component templates
modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/rest/RestPutDataStreamLifecycleAction.java Adds capability declaration for frozen_after in data stream lifecycle API
server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamLifecycleTests.java Test coverage for frozen_after field mutation and random generation
server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamLifecycleTemplateTests.java Test coverage for template handling of frozen_after field
server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java Test coverage for lifecycle resolution with frozen_after field
rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.put_index_template/10_basic.yml REST API test for index templates with frozen_after
rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/cluster.component_template/10_basic.yml REST API test for component templates with frozen_after
modules/data-streams/src/yamlRestTest/resources/rest-api-spec/test/data_stream/lifecycle/20_basic.yml REST API test for data stream lifecycle with frozen_after
server/src/main/resources/transport/upper_bounds/9.3.csv Updates transport version upper bound for the feature
server/src/main/resources/transport/definitions/referable/searchable_snapshots_dlm.csv Defines transport version for searchable snapshots DLM feature

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nielsbauman nielsbauman added >non-issue :StorageEngine/Data streams Data streams and their lifecycles labels Dec 4, 2025
@nielsbauman nielsbauman marked this pull request as ready for review December 4, 2025 10:03
@elasticsearchmachine elasticsearchmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Dec 4, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Copy link
Copy Markdown
Contributor

@seanzatzdev seanzatzdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Copy Markdown
Contributor

@lukewhiting lukewhiting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One NABD comment but otherwise looks solid :-) 👍🏻

+ ", downsamplingMethod="
+ downsamplingMethod
+ '}';
return Strings.toString(this);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe String.toString(...) invokes the xContent serialization code which is pretty heavy weight. Any reason for this over a standard, IDE generated toString?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for this over a standard, IDE generated toString?

Yep, I have a few reasons :)

  1. I don't like having to add every new field to this method
  2. I don't think we invoke toString of this class often (maybe never) in production - I think this is mostly relevant in tests
  3. We don't serialize anything big in this class. So, while XContent serialization might relatively be more expensive than simple string concatenation, I don't think the serialization of this class is necessarily heavy in absolute terms.

Let me know what you think!

Copy link
Copy Markdown
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, I left a couple of comments. I think we should validate (even internally) that the TimeValue is > 0 to avoid someone passing TimeValue.MINUS_ONE

private static final String SUPPORTS_FROZEN_AFTER = "dlm.frozen_after";
private static final Set<String> CAPABILITIES = Set.of(
SUPPORTS_DOWNSAMPLING_METHOD,
DataStreamLifecycle.DLM_SEARCHABLE_SNAPSHOTS_FEATURE_FLAG.isEnabled() ? SUPPORTS_FROZEN_AFTER : ""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, is there any adverse effect to having an empty string in the set of capabilities?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered the same thing. I ran the test with DataStreamLifecycle.DLM_SEARCHABLE_SNAPSHOTS_FEATURE_FLAG.isEnabled() ? SUPPORTS_FROZEN_AFTER : removed (i.e. just the empty string) and everything seemed to work just fine.

Comment on lines +60 to +63
public static final TransportVersion ADDED_ENABLED_FLAG_VERSION = TransportVersions.V_8_10_X;
private static final TransportVersion INTRODUCE_LIFECYCLE_TEMPLATE = TransportVersion.fromName("introduce_lifecycle_template");
public static final TransportVersion ADD_SAMPLE_METHOD_DOWNSAMPLE_DLM = TransportVersion.fromName("add_sample_method_downsample_dlm");
public static final TransportVersion SEARCHABLE_SNAPSHOTS_DLM_TV = TransportVersion.fromName("searchable_snapshots_dlm");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably make these all either public, or all private, to match each other, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong feelings. I don't see much value in making the private ones public if there's no need for them to be public, but I'm empathetic to consistency :)

if (value == null) {
return null;
} else {
return TimeValue.parseTimeValue(value, FROZEN_AFTER_FIELD.getPreferredName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably validate that the value is > 0, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, good one! I'll add the check to the constructor, that seems like the best place to add the validation.

I remember thinking about whether we should also validate that frozen_after is before retention and after any downsampling (but I forgot to comment that). Can you think of a reason not to add that validation too?

public static final String SUPPORTS_FAILURE_STORE = "data_stream_options.failure_store";
private static final String COMPONENT_TEMPLATE_TRACKING_INFO = "component_template_tracking_info";
static final String SUPPORTS_DOWNSAMPLING_METHOD = "dlm.downsampling_method";
static final String SUPPORTS_FROZEN_AFTER = "dlm.frozen_after";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we re-use the other one from PutDataStreamLifecycle? Perhaps we can put the string into a common place and reference them from each different module?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to use the one from RestPutComponentTemplateAction in RestPutDataStreamLifecycleAction - I didn't push yet.

@nielsbauman nielsbauman added the test-release Trigger CI checks against release build label Dec 9, 2025
@nielsbauman nielsbauman enabled auto-merge (squash) December 18, 2025 09:44
@nielsbauman nielsbauman disabled auto-merge December 19, 2025 14:05
@nielsbauman nielsbauman merged commit 54b35d6 into elastic:main Dec 19, 2025
34 of 37 checks passed
@nielsbauman nielsbauman deleted the add-frozen-after-field branch December 19, 2025 14:05
szybia added a commit to szybia/elasticsearch that referenced this pull request Dec 19, 2025
* upstream/main:
  Update text_similarity_rank_retriever to support inference ID as an argument when reranking on chunks (elastic#137397)
  Mute org.elasticsearch.test.rest.yaml.CssSearchYamlTestSuiteIT test {p0=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification multiple indexes} elastic#139826
  Mute org.elasticsearch.index.mapper.SkipperSettingsTests testTSDBSkipperSettingDefaults elastic#139824
  Unmute test fix elastic#129517 (elastic#139782)
  Add back support for deserializing old refresh token in test (elastic#139811)
  Add documentation for exponential_histogram field type (elastic#139684)
  make ES|QL sample CSV test looser (elastic#139814)
  Add `frozen_after` field to data stream lifecycle (elastic#139042)
  Quieten many `ERROR` logs to `WARN` (elastic#139799)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :StorageEngine/Data streams Data streams and their lifecycles Team:Data Management (obsolete) DO NOT USE. This team no longer exists. test-release Trigger CI checks against release build v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants