Support no-op tombstones documents in TSDB indices with synthetic ids #144935

Merged: tlrx merged 14 commits into elastic:main from tlrx:2026/03/25-noop-tombstone-synthetic-ids on Mar 30, 2026

@tlrx (Member) commented Mar 25, 2026

No-op tombstone documents can be indexed into Lucene during the promotion of a replica after a primary failure, after restoring a snapshot, or during peer recovery when the primary shard has no-op tombstone documents. Such documents have the __soft_deletes field set, so they are automatically filtered out from search hits and GET responses.

The _tsid, @timestamp and _ts_routing_hash doc value fields (which are used to compute the synthetic _id of documents) of delete tombstone documents are populated so that the fields exist in the Lucene index (the values are derived from the document id of the DELETE request).

For no-op tombstone documents, this is not possible because the doc value fields cannot be deduced from a document id. Therefore, no-op tombstone documents must be checked for and filtered out in the TSDB synthetic id postings format.
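
To illustrate the filtering described above, here is a minimal sketch in plain Java. It is not the actual postings format code: `SyntheticIdSketch`, `Doc`, `visibleIds` and the `tsid:timestamp:hash` id string are hypothetical names made up for this example, and a `null` field simply models a document that has no value for that doc value field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a segment; none of these names come from Elasticsearch.
final class SyntheticIdSketch {

    /** Doc values of one document. Null fields model a no-op tombstone, which carries none of them. */
    static final class Doc {
        final String tsid;
        final Long timestamp;
        final Integer routingHash;

        Doc(String tsid, Long timestamp, Integer routingHash) {
            this.tsid = tsid;
            this.timestamp = timestamp;
            this.routingHash = routingHash;
        }

        boolean hasSyntheticIdFields() {
            return tsid != null && timestamp != null && routingHash != null;
        }
    }

    /** Builds an illustrative id string; the real synthetic _id encoding is different. */
    static String syntheticId(Doc doc) {
        return doc.tsid + ":" + doc.timestamp + ":" + doc.routingHash;
    }

    /** Returns the ids a postings-like view would expose, skipping documents without the required fields. */
    static List<String> visibleIds(List<Doc> segment) {
        List<String> ids = new ArrayList<>();
        for (Doc doc : segment) {
            if (doc.hasSyntheticIdFields() == false) {
                continue; // no-op tombstone: there is nothing to derive an id from
            }
            ids.add(syntheticId(doc));
        }
        return ids;
    }
}
```

In this model, a segment holding two regular documents and one no-op tombstone exposes only two ids; the tombstone is skipped rather than causing a failure.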

Also, the TSDB synthetic id custom codec ensures that all opened/written segments have the _tsid, @timestamp and _ts_routing_hash doc value fields. This is not true for segments that are composed only of no-op tombstone documents, so the assertions there must be relaxed.
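
The relaxed check can be pictured roughly as follows. This is only a sketch under stated assumptions, not the codec's real assertion: `SegmentFieldCheckSketch`, `isValidSegment` and its parameters are hypothetical, and the per-segment counts are assumed to be known by the caller.

```java
import java.util.Set;

// Hypothetical sketch of a relaxed segment check; not the actual Elasticsearch codec code.
final class SegmentFieldCheckSketch {

    static final Set<String> REQUIRED = Set.of("_tsid", "@timestamp", "_ts_routing_hash");

    /**
     * A segment is accepted when it has all required doc value fields,
     * or when every document in it is a no-op tombstone (which has none of them).
     */
    static boolean isValidSegment(Set<String> docValueFields, int docCount, int noOpTombstoneCount) {
        if (docValueFields.containsAll(REQUIRED)) {
            return true;
        }
        // Relaxed case: fields may be missing only if the segment contains nothing but no-op tombstones.
        return docCount > 0 && docCount == noOpTombstoneCount;
    }
}
```

A segment that mixes regular documents with no-op tombstones is still expected to have all three fields, because the regular documents provide them.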

This commit adjusts the postings format and codec used in TSDB indices with synthetic ids and adds an integration test that exercises the 3 code paths where no-op tombstone documents can be indexed into Lucene.

Note: Cursor greatly helped with writing the test.

@tlrx added the >non-issue, :Distributed/Recovery (Anything around constructing a new shard, either from a local or a remote source.), v9.4.0 and :StorageEngine/TSDB (You know, for Metrics) labels on Mar 25, 2026
@elasticsearchmachine added the Team:Distributed (Meta label for distributed team.) and Team:StorageEngine labels on Mar 25, 2026
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-distributed (Team:Distributed)

@fcofdez (Contributor) left a comment:

LGTM

flush(backingIndex);
}

// Ensure all operations are replicated

Contributor:

isn't this ensured by the bulk request semantics?

Member, Author:

I guess Cursor hallucinated a bit, thanks for spotting this. I removed it in 664b739.

final int nbGaps = randomIntBetween(1, 25);
primaryShard.withEngine(engine -> {
    for (int i = 0; i < nbGaps; i++) {
        generateNewSeqNo(engine);

Contributor:

TIL: neat way of generating gaps without needing to isolate replicas or something like that

Member, Author:

Thanks Cursor! I didn't know about this, but there are other usages in integration tests.
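
For context, the trick in the snippet above can be modelled with a toy tracker: generating sequence numbers without ever processing an operation for them leaves holes, and filling those holes is where no-ops come from. The sketch below is an assumption-laden illustration, not engine code; `SeqNoGapSketch` and all of its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of sequence number gaps; hypothetical names, not the Elasticsearch engine.
final class SeqNoGapSketch {

    private long nextSeqNo = 0;
    private final TreeSet<Long> processed = new TreeSet<>();

    /** Hands out the next sequence number without recording any operation for it. */
    long generateSeqNo() {
        return nextSeqNo++;
    }

    /** Records that an operation with this sequence number was actually applied. */
    void markProcessed(long seqNo) {
        processed.add(seqNo);
    }

    /** Returns the sequence numbers that were generated but never processed, in ascending order. */
    List<Long> gaps() {
        List<Long> missing = new ArrayList<>();
        for (long seqNo = 0; seqNo < nextSeqNo; seqNo++) {
            if (processed.contains(seqNo) == false) {
                missing.add(seqNo);
            }
        }
        return missing;
    }

    /** Fills every gap with a (modelled) no-op and returns how many no-ops were written. */
    int fillGapsWithNoOps() {
        List<Long> missing = gaps();
        processed.addAll(missing);
        return missing.size();
    }
}
```

Calling `generateSeqNo()` a few times without `markProcessed` is the analogue of the loop in the test: each call leaves one gap that later has to be filled by a no-op.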

* Warning: This method can be slow because it potentially scans many documents in the segment.
* </p>
*/
int findFirstDocWithTsIdOrdinalEqualOrGreaterThan(int tsIdOrd) throws IOException {

Contributor:

nit: maybe we should assert that tsIdOrd >= 0?

Member, Author:

Added in 0b8f316
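
As a rough picture of what such a guard protects, here is a standalone sketch of a linear scan with a non-negative ordinal precondition. It is not the method from this PR: the array-based signature is hypothetical, the per-document ordinals are assumed to be available as an `int[]` in doc id order, and an explicit exception stands in for the `assert` suggested above so that the check also fires when assertions are disabled.

```java
// Hypothetical standalone sketch; the real method reads doc values from a Lucene segment.
final class TsIdOrdinalScanSketch {

    /**
     * Returns the first doc id whose ordinal is equal to or greater than tsIdOrd, or -1 if there is none.
     * docOrdinals[docId] is assumed to hold the _tsid ordinal of that document.
     */
    static int findFirstDocWithOrdinalEqualOrGreaterThan(int[] docOrdinals, int tsIdOrd) {
        if (tsIdOrd < 0) {
            // Stand-in for an assertion: a negative ordinal is never a valid lookup target.
            throw new IllegalArgumentException("tsIdOrd must be >= 0 but was " + tsIdOrd);
        }
        // Linear scan: potentially visits many documents, which is why such a method can be slow.
        for (int docId = 0; docId < docOrdinals.length; docId++) {
            if (docOrdinals[docId] >= tsIdOrd) {
                return docId;
            }
        }
        return -1;
    }
}
```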

@tlrx merged commit 650e050 into elastic:main on Mar 30, 2026
36 checks passed
@tlrx deleted the 2026/03/25-noop-tombstone-synthetic-ids branch on March 30, 2026 12:21
felixbarny pushed a commit to felixbarny/elasticsearch that referenced this pull request Mar 30, 2026
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Mar 30, 2026
mamazzol pushed a commit to mamazzol/elasticsearch that referenced this pull request Mar 30, 2026
tlrx added a commit that referenced this pull request Mar 31, 2026
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Apr 1, 2026