Support no-op tombstones documents in TSDB indices with synthetic ids by tlrx · Pull Request #144935 · elastic/elasticsearch

tlrx · 2026-03-25T14:22:00Z

No-op tombstone documents can be indexed into Lucene during the promotion of a replica after a primary failure, after restoring a snapshot, or during peer recovery when the primary shard has no-op tombstone documents. Such documents have the __soft_deletes field set, so they are automatically filtered out from search hits and GET responses.

The _tsid, @timestamp and _ts_routing_hash doc value fields (which are used to compute the synthetic _id of documents) are populated for delete tombstone documents, so the fields exist in the Lucene index (the values are derived from the document id of the DELETE request).

No-op tombstone documents are different because the doc value fields cannot be deduced from a document id. Therefore the TSDB synthetic id postings format must check for those no-op tombstone documents and filter them out.
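The filtering described above can be illustrated with a minimal, self-contained sketch. It is not the Elasticsearch implementation: the `SyntheticIdSketch` class, the `Doc` record and the `docsWithSyntheticId` method are hypothetical names, and the model assumes a no-op tombstone is simply a document with none of the three doc value fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Elasticsearch code: each document carries the three
// doc value fields used to build the synthetic _id, or none of them.
final class SyntheticIdSketch {

    record Doc(int docId, String tsid, Long timestamp, Integer routingHash) {
        // Assumption: a no-op tombstone has no _tsid, @timestamp or _ts_routing_hash value.
        boolean isNoOpTombstone() {
            return tsid == null && timestamp == null && routingHash == null;
        }

        static Doc noOp(int docId) {
            return new Doc(docId, null, null, null);
        }
    }

    // Returns the doc ids that a synthetic-id lookup may expose, skipping
    // documents for which no synthetic _id can be computed.
    static List<Integer> docsWithSyntheticId(List<Doc> segment) {
        List<Integer> result = new ArrayList<>();
        for (Doc doc : segment) {
            if (doc.isNoOpTombstone()) {
                continue; // nothing to derive an _id from
            }
            result.add(doc.docId());
        }
        return result;
    }
}
```

With a segment holding a regular document, a no-op tombstone and another regular document, only the two regular documents are returned.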

Also, the TSDB synthetic id custom codec ensures that all opened/written segments have the _tsid, @timestamp and _ts_routing_hash doc value fields. This is not true for segments that are composed only of no-op tombstone documents, so the assertions there must be relaxed.
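As a rough illustration of the relaxed invariant, the sketch below accepts a segment either when all three fields are present or when the segment contains only no-op tombstones. The class and method names are hypothetical, and how the real codec detects a tombstone-only segment is not shown in this pull request page, so the boolean parameter is an assumption.

```java
import java.util.Set;

// Hypothetical sketch of the relaxed check, not the actual codec code.
final class SegmentFieldCheckSketch {

    static final Set<String> REQUIRED_FIELDS = Set.of("_tsid", "@timestamp", "_ts_routing_hash");

    // docValueFields: names of the doc value fields present in the segment.
    // onlyNoOpTombstones: assumed to be known by the caller.
    static boolean isValidSegment(Set<String> docValueFields, boolean onlyNoOpTombstones) {
        if (docValueFields.containsAll(REQUIRED_FIELDS)) {
            return true; // the strict invariant still holds
        }
        // Relaxation: missing fields are tolerated only for tombstone-only segments.
        return onlyNoOpTombstones;
    }
}
```

A segment missing the fields but containing regular documents is still rejected, so the relaxation does not hide genuinely malformed segments.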

This commit adjusts the postings format and codec used in TSDB indices with synthetic ids and adds an integration test that exercises the 3 code paths where no-op tombstone documents can be indexed into Lucene.

Note: Cursor greatly helped with writing the test.

elasticsearchmachine · 2026-03-25T14:22:34Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine · 2026-03-25T14:22:35Z

Pinging @elastic/es-distributed (Team:Distributed)

fcofdez

LGTM

fcofdez · 2026-03-27T11:38:36Z

...a-streams/src/internalClusterTest/java/org/elasticsearch/datastreams/TSDBSyntheticIdsIT.java

+            flush(backingIndex);
+        }
+
+        // Ensure all operations are replicated


isn't this ensured by the bulk request semantics?

I guess Cursor hallucinated a bit, thanks for spotting this. I removed it in 664b739.

fcofdez · 2026-03-27T11:40:10Z

...a-streams/src/internalClusterTest/java/org/elasticsearch/datastreams/TSDBSyntheticIdsIT.java

+        final int nbGaps = randomIntBetween(1, 25);
+        primaryShard.withEngine(engine -> {
+            for (int i = 0; i < nbGaps; i++) {
+                generateNewSeqNo(engine);


TIL: neat way of generating gaps without needing to isolate replicas or something like that

Thanks Cursor! I didn't know about this, but there are other usages in integration tests.
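To see why allocating sequence numbers without indexing anything produces gaps, here is a small standalone model. It does not use the Elasticsearch engine API: `SeqNoGapSketch` and its methods are hypothetical, and the step that fills gaps with no-ops is an assumption based on the PR description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of sequence number allocation, not the Elasticsearch engine.
final class SeqNoGapSketch {

    private long nextSeqNo = 0;
    private final TreeSet<Long> processed = new TreeSet<>();

    // Allocates a sequence number without recording any operation for it.
    long generateSeqNo() {
        return nextSeqNo++;
    }

    // Allocates a sequence number and records an operation for it.
    long index() {
        long seqNo = generateSeqNo();
        processed.add(seqNo);
        return seqNo;
    }

    // Sequence numbers that were allocated but never processed.
    List<Long> gaps() {
        List<Long> gaps = new ArrayList<>();
        for (long seqNo = 0; seqNo < nextSeqNo; seqNo++) {
            if (!processed.contains(seqNo)) {
                gaps.add(seqNo);
            }
        }
        return gaps;
    }

    // Assumption: each gap is later filled with one no-op operation.
    int fillGapsWithNoOps() {
        List<Long> gaps = gaps();
        processed.addAll(gaps);
        return gaps.size();
    }
}
```

Calling `index()`, then `generateSeqNo()` twice, then `index()` again leaves sequence numbers 1 and 2 unprocessed, which is the kind of history the test needs to trigger no-op tombstones.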

fcofdez · 2026-03-27T12:35:29Z

server/src/main/java/org/elasticsearch/index/codec/tsdb/TSDBSyntheticIdDocValuesHolder.java

     * Warning: This method can be slow because it potentially scans many documents in the segment.
     * </p>
     */
    int findFirstDocWithTsIdOrdinalEqualOrGreaterThan(int tsIdOrd) throws IOException {


nit: maybe we should assert that tsIdOrd >= 0?

Added in 0b8f316
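The reason for the suggested assertion can be shown with a simplified linear scan. This is a sketch under stated assumptions, not the code from `TSDBSyntheticIdDocValuesHolder`: the array-based signature, the non-decreasing ordering of ordinals by doc id, and the `-1` "not found" result are all hypothetical.

```java
// Hypothetical, array-based stand-in for the method discussed above.
final class TsIdOrdinalScanSketch {

    // ordinalsByDoc[docId] is the _tsid ordinal of that document, assumed to be
    // non-decreasing in doc id order. Returns -1 when no document matches.
    static int findFirstDocWithTsIdOrdinalEqualOrGreaterThan(int[] ordinalsByDoc, int tsIdOrd) {
        // Without this check, a negative ordinal would match the very first document.
        assert tsIdOrd >= 0 : "tsid ordinal must be non-negative but got " + tsIdOrd;
        for (int docId = 0; docId < ordinalsByDoc.length; docId++) {
            if (ordinalsByDoc[docId] >= tsIdOrd) {
                return docId;
            }
        }
        return -1;
    }
}
```

Java assertions only run when the JVM is started with `-ea`, so the check costs nothing in production while still catching invalid callers in tests.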

tlrx added 4 commits March 25, 2026 10:44

support noops

f8164a1

more fixes

561dc9e

more fixes

54cbaf0

rename and more assertions

aae6de7

tlrx added the >non-issue, :Distributed/Recovery, v9.4.0 and :StorageEngine/TSDB labels Mar 25, 2026

elasticsearchmachine added the Team:Distributed and Team:StorageEngine labels Mar 25, 2026

tlrx requested review from burqen, fcofdez and martijnvg and removed request for burqen and fcofdez March 25, 2026 15:04

tlrx mentioned this pull request Mar 25, 2026

FollowingEngineTests fails with AbstractTSDBSyntheticIdCodec #144582

Closed

tlrx and others added 4 commits March 25, 2026 18:11

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

569121d

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

c7e76e2

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

a8b8c8a

[CI] Auto commit changes from spotless

5bf03f7

fcofdez approved these changes Mar 27, 2026

tlrx added 5 commits March 27, 2026 17:46

remove assert

664b739

assert

0b8f316

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

569fc13

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

c3ea0a0

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

079a629

Merge branch 'main' into 2026/03/25-noop-tombstone-synthetic-ids

4dc640f

tlrx merged commit 650e050 into elastic:main Mar 30, 2026
36 checks passed

tlrx deleted the 2026/03/25-noop-tombstone-synthetic-ids branch March 30, 2026 12:21

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Mar 30, 2026

[Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary

2f4d148

Fixed by elastic#144935 Closes elastic#144582

tlrx mentioned this pull request Mar 30, 2026

[Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary #145192

Merged

tlrx added a commit that referenced this pull request Mar 31, 2026

[Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary (#145192)

d2b813d

Fixed by #144935 Closes #144582

pmpailis pushed a commit that referenced this pull request Mar 31, 2026

[Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary (#145192)

e2cb4b9

Fixed by #144935 Closes #144582

ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Apr 1, 2026

[Test] Unmute FollowingEngineTests.testProcessOnceOnPrimary (elastic#…

e10f52f

…145192) Fixed by elastic#144935 Closes elastic#144582

tlrx merged 14 commits into elastic:main from tlrx:2026/03/25-noop-tombstone-synthetic-ids
