Skip peer-recovery retention in soft deletes policy for serverless #144223

Merged
tlrx merged 10 commits into elastic:main from tlrx:2026/03/13-do-not-retain-ops-for-serverless
Apr 10, 2026

tlrx (Member) commented Mar 13, 2026

Serverless does not use peer recovery, so there is no need to retain soft-deleted documents down to the local checkpoint of the safe commit.

This change adds a retainForPeerRecovery flag to SoftDeletesPolicy and a protected hook in InternalEngine so that classes that inherit InternalEngine can disable this constraint, allowing the min retained sequence number to advance faster.

Relates #136305
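As a rough illustration of the flag and hook described above, here is a minimal, self-contained sketch. It is not the Elasticsearch code: the class names `RetentionFlagSketch`, `BaseEngine`, and `NoPeerRecoveryEngine`, the hook name `retainOpsForPeerRecovery()`, and the sample sequence numbers are all hypothetical. Only the ternary mirrors the expression quoted in the review diff in this thread.

```java
// Simplified model for illustration; these are NOT the real Elasticsearch classes.
public class RetentionFlagSketch {

    /** Stand-in for the retention computation, using the ternary quoted in the review diff. */
    static long minSeqNoToRetain(boolean retainForPeerRecovery,
                                 long minSeqNoForQueryingChanges,
                                 long localCheckpointOfSafeCommit) {
        return retainForPeerRecovery
            ? Math.min(minSeqNoForQueryingChanges, 1 + localCheckpointOfSafeCommit)
            : minSeqNoForQueryingChanges;
    }

    /** Stand-in for an engine exposing an overridable hook (the hook name is hypothetical). */
    static class BaseEngine {
        protected boolean retainOpsForPeerRecovery() {
            return true; // default: keep the peer-recovery constraint
        }

        long minRetainedSeqNo(long minSeqNoForQueryingChanges, long localCheckpointOfSafeCommit) {
            return minSeqNoToRetain(retainOpsForPeerRecovery(),
                                    minSeqNoForQueryingChanges,
                                    localCheckpointOfSafeCommit);
        }
    }

    /** Stand-in for a subclass that opts out because it never uses peer recovery. */
    static class NoPeerRecoveryEngine extends BaseEngine {
        @Override
        protected boolean retainOpsForPeerRecovery() {
            return false;
        }
    }

    public static void main(String[] args) {
        long minSeqNoForQueryingChanges = 120L;  // made-up value
        long localCheckpointOfSafeCommit = 80L;  // made-up value

        // With the constraint, retention is held back to 1 + the safe commit's local checkpoint.
        System.out.println(new BaseEngine()
            .minRetainedSeqNo(minSeqNoForQueryingChanges, localCheckpointOfSafeCommit)); // 81

        // Without it, the minimum retained sequence number can advance further.
        System.out.println(new NoPeerRecoveryEngine()
            .minRetainedSeqNo(minSeqNoForQueryingChanges, localCheckpointOfSafeCommit)); // 120
    }
}
```

In this model the subclass only changes one boolean, which keeps the opt-out contained to the retention calculation.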

tlrx added the >non-issue, :Distributed/Engine (Anything around managing Lucene and the Translog in an open shard.), and v9.4.0 labels Mar 13, 2026
tlrx requested a review from fcofdez March 13, 2026 16:08
tlrx (Member, Author) commented Mar 13, 2026

@fcofdez here is a contained change. Were you thinking of something along these lines?

fcofdez (Contributor) left a comment

Looks good. I think that we should use the global checkpoint instead, but this looks like a good path forward.

- final long minSeqNoToRetain = Math.min(minSeqNoForQueryingChanges, 1 + localCheckpointOfSafeCommit);
+ final long minSeqNoToRetain = retainForPeerRecovery
+     ? Math.min(minSeqNoForQueryingChanges, 1 + localCheckpointOfSafeCommit)
+     : minSeqNoForQueryingChanges;

I think that in this branch we should return the global checkpoint instead? Otherwise the retention leases, which are synced every 30 seconds (org.elasticsearch.index.IndexService#RETENTION_LEASE_SYNC_INTERVAL_SETTING), would be too far behind and the policy won't be aggressive enough.
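To make the concern in this comment concrete, here is a small sketch of one possible reading of the suggestion. It assumes, as the comment implies, that the bound returned by the quoted branch can lag because it depends on retention leases that are only synced periodically. The class name, both method names, the choice of `globalCheckpoint + 1`, and the numbers are hypothetical, and the thread does not show whether this variant was adopted.

```java
// Illustrative only: one possible reading of the reviewer's suggestion, with made-up numbers.
public class GlobalCheckpointVariantSketch {

    /** The branch as quoted in the diff: return the querying-changes bound unchanged. */
    static long quotedBranch(long minSeqNoForQueryingChanges) {
        return minSeqNoForQueryingChanges;
    }

    /** Hypothetical variant: retain only operations above the global checkpoint. */
    static long globalCheckpointBranch(long globalCheckpoint) {
        return globalCheckpoint + 1;
    }

    public static void main(String[] args) {
        // Assumption: the lease-derived bound was last refreshed at the previous sync,
        // while the global checkpoint has kept advancing since then.
        long staleBound = 1_000L;        // made-up value
        long globalCheckpoint = 1_450L;  // made-up value

        long quoted = quotedBranch(staleBound);
        long variant = globalCheckpointBranch(globalCheckpoint);

        System.out.println("quoted branch retains from seqNo " + quoted);
        System.out.println("variant retains from seqNo " + variant);
        System.out.println("extra operations retained by quoted branch: " + (variant - quoted));
    }
}
```

Under these assumed numbers the quoted branch keeps 451 more operations than the variant, which is the kind of gap the comment describes as not aggressive enough.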

elasticsearchmachine added the serverless-linked label (Added by automation, don't add manually) Mar 20, 2026
tlrx marked this pull request as ready for review March 20, 2026 16:19
tlrx changed the title from "[Draft] Skip peer-recovery retention in soft deletes policy for serverless" to "Skip peer-recovery retention in soft deletes policy for serverless" Mar 20, 2026
elasticsearchmachine added the Team:Distributed label (Meta label for distributed team.) Mar 20, 2026
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-distributed (Team:Distributed)

fcofdez (Contributor) left a comment

LGTM 👍

tlrx force-pushed the 2026/03/13-do-not-retain-ops-for-serverless branch from a8ded74 to 529b041 March 24, 2026 10:07
tlrx force-pushed the 2026/03/13-do-not-retain-ops-for-serverless branch from 5c956fd to 4e0b4f8 April 3, 2026 08:57
tlrx merged commit 8760c69 into elastic:main Apr 10, 2026
35 checks passed
tlrx deleted the 2026/03/13-do-not-retain-ops-for-serverless branch April 10, 2026 13:02
