Skip to content

Excluding system data streams from global and factory retention#108038

Merged
masseyke merged 22 commits intoelastic:mainfrom
masseyke:exclude-system-data-streams-from-global-retention
May 3, 2024
Merged

Excluding system data streams from global and factory retention#108038
masseyke merged 22 commits intoelastic:mainfrom
masseyke:exclude-system-data-streams-from-global-retention

Conversation

@masseyke
Copy link
Copy Markdown
Member

We have the notion of global and factory retention for data streams, where we declare the default and maximum retention for all data streams. However, we don't want system data streams to be constrained by these. For example, some system data streams will need to have unlimited retention. This PR makes it so that system data streams are not affected by global or factory data stream retention.

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @masseyke, I've created a changelog YAML for you.

…ithub.com:masseyke/elasticsearch into exclude-system-data-streams-from-global-retention
Copy link
Copy Markdown
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just a draft, but I placed some comments anyway, as I was afraid I'd forget them otherwise 😄. Hopefully, there's not too much overlap with what you were already planning on doing but just hadn't come around to yet.

Comment on lines +249 to +251
for (String label : rolloverConfiguration.resolveRolloverConditions(
lifecycle.getEffectiveDataRetention(globalRetention, randomBoolean())
).getConditions().keySet()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can extract the rollover conditions to a variable to make this a bit more readable? Same goes for the other changes in tests.

…ithub.com:masseyke/elasticsearch into exclude-system-data-streams-from-global-retention
… retention for calcuating rollover for display
Comment on lines +84 to +88
if (in.getTransportVersion().onOrAfter(TransportVersions.NO_LIFECYCLE_FOR_SYSTEM_DATA_STREAMS)) {
this.isSystemDataStream = in.readBoolean();
} else {
this.isSystemDataStream = false;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
if (in.getTransportVersion().onOrAfter(TransportVersions.NO_LIFECYCLE_FOR_SYSTEM_DATA_STREAMS)) {
this.isSystemDataStream = in.readBoolean();
} else {
this.isSystemDataStream = false;
}
this.isSystemDataStream = in.getTransportVersion().onOrAfter(TransportVersions.NO_LIFECYCLE_FOR_SYSTEM_DATA_STREAMS)) && in.readBoolean();

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the simplicity of the if/else here. That's just a matter of personal preference, right? The outcome is the same either way?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that the one-liner is just as simple 😛. But it's definitely just personal preference as I've seen both variations and the outcome is the same. I'm fine with leaving it like this if that's your preference.

@masseyke masseyke marked this pull request as ready for review May 2, 2024 22:05
@masseyke masseyke requested a review from nielsbauman May 2, 2024 22:05
@elasticsearchmachine elasticsearchmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label May 2, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@masseyke masseyke requested a review from dakrone May 2, 2024 22:08
Copy link
Copy Markdown
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some last comments form my side, after those we'll be good to go :)

assertThat(dataStream.newEffectiveRetention(), equalTo(globalRetention.getMaxRetention()));
} else {
assertThat(affectedDataStreams.size(), is(1));
var dataStream = affectedDataStreams.get(1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to give an IndexOutOfBoundsException.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Not sure how I missed that!

public void testSystemExplainLifecycle() throws Exception {
/*
* This test makes sure that for system data streams, we correctly ignore the global retention when calling
* ExplainDataStreamLifecycle. It is very simila to testExplainLifecycle, but only focuses on the retention for a system index.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
* ExplainDataStreamLifecycle. It is very simila to testExplainLifecycle, but only focuses on the retention for a system index.
* ExplainDataStreamLifecycle. It is very similar to testExplainLifecycle, but only focuses on the retention for a system index.

public static final TransportVersion SECURITY_ROLE_MAPPINGS_IN_CLUSTER_STATE = def(8_647_00_0);
public static final TransportVersion ESQL_REQUEST_TABLES = def(8_648_00_0);
public static final TransportVersion ROLE_REMOTE_CLUSTER_PRIVS = def(8_649_00_0);
public static final TransportVersion NO_LIFECYCLE_FOR_SYSTEM_DATA_STREAMS = def(8_650_00_0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
public static final TransportVersion NO_LIFECYCLE_FOR_SYSTEM_DATA_STREAMS = def(8_650_00_0);
public static final TransportVersion NO_GLOBAL_RETENTION_FOR_SYSTEM_DATA_STREAMS = def(8_650_00_0);

Comment on lines +245 to +248
assertThat(
explainIndex.getLifecycle().getDataStreamRetention(),
equalTo(TimeValue.timeValueDays(SYSTEM_DATA_STREAM_RETENTION_DAYS))
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this test missing a request that actually sets a global retention? Without such a request, this assertion becomes rather trivial, I think. But maybe I'm missing something.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Good catch!

@masseyke masseyke requested a review from nielsbauman May 3, 2024 13:08
Copy link
Copy Markdown
Contributor

@nielsbauman nielsbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for iterating on this Keith :)

@masseyke masseyke merged commit a561958 into elastic:main May 3, 2024
@masseyke masseyke deleted the exclude-system-data-streams-from-global-retention branch May 3, 2024 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :StorageEngine/Data streams Data streams and their lifecycles Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v8.15.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants