Add frozen_after field to data stream lifecycle#139042
Add frozen_after field to data stream lifecycle#139042nielsbauman merged 9 commits intoelastic:mainfrom
frozen_after field to data stream lifecycle#139042Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds support for a frozen_after field to the data stream lifecycle feature, allowing users to specify when data should be frozen. The implementation includes support across component templates, composable templates, and the data stream lifecycle API, all protected by a new feature flag dlm_searchable_snapshots.
- Adds
frozen_afterfield toDataStreamLifecycleclass and related Template/Builder classes - Implements serialization/deserialization with feature flag protection
- Adds REST API capability support for the new field
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamLifecycle.java | Core implementation of frozen_after field with parsing, serialization, and builder support |
| server/src/main/java/org/elasticsearch/rest/action/admin/indices/RestPutComposableIndexTemplateAction.java | Adds capability declaration for frozen_after in composable index templates |
| server/src/main/java/org/elasticsearch/rest/action/admin/indices/RestPutComponentTemplateAction.java | Adds capability declaration for frozen_after in component templates |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/rest/RestPutDataStreamLifecycleAction.java | Adds capability declaration for frozen_after in data stream lifecycle API |
| server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamLifecycleTests.java | Test coverage for frozen_after field mutation and random generation |
| server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamLifecycleTemplateTests.java | Test coverage for template handling of frozen_after field |
| server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java | Test coverage for lifecycle resolution with frozen_after field |
| rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.put_index_template/10_basic.yml | REST API test for index templates with frozen_after |
| rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/cluster.component_template/10_basic.yml | REST API test for component templates with frozen_after |
| modules/data-streams/src/yamlRestTest/resources/rest-api-spec/test/data_stream/lifecycle/20_basic.yml | REST API test for data stream lifecycle with frozen_after |
| server/src/main/resources/transport/upper_bounds/9.3.csv | Updates transport version upper bound for the feature |
| server/src/main/resources/transport/definitions/referable/searchable_snapshots_dlm.csv | Defines transport version for searchable snapshots DLM feature |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java
Show resolved
Hide resolved
|
Pinging @elastic/es-data-management (Team:Data Management) |
lukewhiting
left a comment
There was a problem hiding this comment.
One NABD comment but otherwise looks solid :-) 👍🏻
| + ", downsamplingMethod=" | ||
| + downsamplingMethod | ||
| + '}'; | ||
| return Strings.toString(this); |
There was a problem hiding this comment.
I believe String.toString(...) invokes the xContent serialization code which is pretty heavy weight. Any reason for this over a standard, IDE generated toString?
There was a problem hiding this comment.
Any reason for this over a standard, IDE generated toString?
Yep, I have a few reasons :)
- I don't like having to add every new field to this method
- I don't think we invoke
toStringof this class often (maybe never) in production - I think this is mostly relevant in tests - We don't serialize anything big in this class. So, while XContent serialization might relatively be more expensive than simple string concatenation, I don't think the serialization of this class is necessarily heavy in absolute terms.
Let me know what you think!
dakrone
left a comment
There was a problem hiding this comment.
This LGTM, I left a couple of comments. I think we should validate (even internally) that the TimeValue is > 0 to avoid someone passing TimeValue.MINUS_ONE
| private static final String SUPPORTS_FROZEN_AFTER = "dlm.frozen_after"; | ||
| private static final Set<String> CAPABILITIES = Set.of( | ||
| SUPPORTS_DOWNSAMPLING_METHOD, | ||
| DataStreamLifecycle.DLM_SEARCHABLE_SNAPSHOTS_FEATURE_FLAG.isEnabled() ? SUPPORTS_FROZEN_AFTER : "" |
There was a problem hiding this comment.
Out of curiosity, is there any adverse effect to having an empty string in the set of capabilities?
There was a problem hiding this comment.
I wondered the same thing. I ran the test with DataStreamLifecycle.DLM_SEARCHABLE_SNAPSHOTS_FEATURE_FLAG.isEnabled() ? SUPPORTS_FROZEN_AFTER : removed (i.e. just the empty string) and everything seemed to work just fine.
| public static final TransportVersion ADDED_ENABLED_FLAG_VERSION = TransportVersions.V_8_10_X; | ||
| private static final TransportVersion INTRODUCE_LIFECYCLE_TEMPLATE = TransportVersion.fromName("introduce_lifecycle_template"); | ||
| public static final TransportVersion ADD_SAMPLE_METHOD_DOWNSAMPLE_DLM = TransportVersion.fromName("add_sample_method_downsample_dlm"); | ||
| public static final TransportVersion SEARCHABLE_SNAPSHOTS_DLM_TV = TransportVersion.fromName("searchable_snapshots_dlm"); |
There was a problem hiding this comment.
We should probably make these all either public, or all private, to match each other, right?
There was a problem hiding this comment.
I don't have strong feelings. I don't see much value in making the private ones public if there's no need for them to be public, but I'm empathetic to consistency :)
| if (value == null) { | ||
| return null; | ||
| } else { | ||
| return TimeValue.parseTimeValue(value, FROZEN_AFTER_FIELD.getPreferredName()); |
There was a problem hiding this comment.
We should probably validate that the value is > 0, right?
There was a problem hiding this comment.
Ah yeah, good one! I'll add the check to the constructor, that seems like the best place to add the validation.
I remember thinking about whether we should also validate that frozen_after is before retention and after any downsampling (but I forgot to comment that). Can you think of a reason not to add that validation too?
| public static final String SUPPORTS_FAILURE_STORE = "data_stream_options.failure_store"; | ||
| private static final String COMPONENT_TEMPLATE_TRACKING_INFO = "component_template_tracking_info"; | ||
| static final String SUPPORTS_DOWNSAMPLING_METHOD = "dlm.downsampling_method"; | ||
| static final String SUPPORTS_FROZEN_AFTER = "dlm.frozen_after"; |
There was a problem hiding this comment.
Can we re-use the other one from PutDataStreamLifecycle? Perhaps we can put the string into a common place and reference them from each different module?
There was a problem hiding this comment.
I was able to use the one from RestPutComponentTemplateAction in RestPutDataStreamLifecycleAction - I didn't push yet.
* upstream/main: Update text_similarity_rank_retriever to support inference ID as an argument when reranking on chunks (elastic#137397) Mute org.elasticsearch.test.rest.yaml.CssSearchYamlTestSuiteIT test {p0=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification multiple indexes} elastic#139826 Mute org.elasticsearch.index.mapper.SkipperSettingsTests testTSDBSkipperSettingDefaults elastic#139824 Unmute test fix elastic#129517 (elastic#139782) Add back support for deserializing old refresh token in test (elastic#139811) Add documentation for exponential_histogram field type (elastic#139684) make ES|QL sample CSV test looser (elastic#139814) Add `frozen_after` field to data stream lifecycle (elastic#139042) Quieten many `ERROR` logs to `WARN` (elastic#139799)
We add the new field to
DataStreamLifecycleand relevant classes. We support the setting in component templates, composable templates, and the data stream lifecycle API.Note that all of this is behind a newly added feature flag
dlm_searchable_snapshots.Resolves elastic/elasticsearch-team#2102
Resolves elastic/elasticsearch-team#2104
Resolves elastic/elasticsearch-team#2105
Resolves elastic/elasticsearch-team#2106
Resolves elastic/elasticsearch-team#2107
Resolves elastic/elasticsearch-team#2108