
Synthetic id field for time_series indices (remove feature flag)#144184

Merged
burqen merged 39 commits into elastic:main from burqen:ap/2026.03.13.synthetic-id-default-and-remove-feature-flag
Apr 7, 2026

Conversation

@burqen
Contributor

@burqen burqen commented Mar 13, 2026

This commit removes the feature flag for synthetic id and thereby makes
it officially available in production. The feature is controlled by the
index.mapping.synthetic_id setting, which defaults to true for new
time_series indices, as long as they use the default codec.

Indices with synthetic _id fields don't store the id physically on
disk. Instead, the id is materialized synthetically from other document
fields. This reduces the on-disk footprint of time_series indices.
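
To make the defaulting rule concrete, here is a minimal sketch of how the effective value could be resolved. It is not the Elasticsearch implementation: the class, method, enum, and constant names are hypothetical, the version number is a placeholder, and letting an explicit value win without any validation is an assumption.

```java
// Hypothetical sketch: class, method, and constant names are illustrative, not Elasticsearch source.
final class SyntheticIdDefaultSketch {

    enum IndexMode { STANDARD, TIME_SERIES }

    // Placeholder for the index version that first enables synthetic _id by default (assumed value).
    static final int SYNTHETIC_ID_DEFAULT_VERSION = 100;

    // Assumed name of the default codec.
    static final String DEFAULT_CODEC = "default";

    /**
     * Resolves the effective value of index.mapping.synthetic_id.
     * An explicitly configured value wins (assumption); otherwise the default
     * is true only for time_series indices that were created on a new enough
     * index version and that use the default codec.
     */
    static boolean syntheticIdEnabled(IndexMode mode, int indexCreatedVersion, String codec, Boolean explicitValue) {
        if (explicitValue != null) {
            return explicitValue;
        }
        return mode == IndexMode.TIME_SERIES
            && indexCreatedVersion >= SYNTHETIC_ID_DEFAULT_VERSION
            && DEFAULT_CODEC.equals(codec);
    }

    public static void main(String[] args) {
        // New time_series index, default codec, nothing set explicitly.
        System.out.println(syntheticIdEnabled(IndexMode.TIME_SERIES, 100, "default", null));
        // Same index, but created on an older index version.
        System.out.println(syntheticIdEnabled(IndexMode.TIME_SERIES, 99, "default", null));
        // Same index, but with a non-default codec.
        System.out.println(syntheticIdEnabled(IndexMode.TIME_SERIES, 100, "best_compression", null));
    }
}
```

Gating on the index created version is what keeps existing time_series indices on their current behavior; only indices created on the new version pick up the new default.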

NOTE! The changes themselves are not super interesting, but this will enable synthetic id everywhere, turned ON by default, and as such might need some extra care.

The slightly more interesting changes are in:

  • server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
  • server/src/main/java/org/elasticsearch/common/settings/IndexScopedSettings.java
  • server/src/main/java/org/elasticsearch/index/IndexSettings.java

@burqen burqen requested review from a team as code owners March 13, 2026 11:24
@burqen burqen changed the title from "Ap/2026.03.13.synthetic id default and remove feature flag" to "Remove synthetic id FeatureFlag" Mar 13, 2026
@burqen burqen marked this pull request as draft March 13, 2026 11:25
@burqen burqen added >feature :StorageEngine/TSDB You know, for Metrics labels Mar 13, 2026
@burqen burqen added the test-release Trigger CI checks against release build label Mar 13, 2026
This commit removes the feature flag for synthetic id and thereby makes
it officially available in production. The feature is controlled by the
`index.mapping.synthetic_id` setting, which defaults to `true` for new
time_series indices, as long as they use the default codec.

Indices with synthetic _id fields don't store the id physically on
disk. Instead, the id is materialized synthetically from other document
fields. This reduces the on-disk footprint of time_series indices.
@burqen burqen force-pushed the ap/2026.03.13.synthetic-id-default-and-remove-feature-flag branch from 2642cff to d0d08f6 Compare March 19, 2026 13:15
@burqen burqen changed the title from "Remove synthetic id FeatureFlag" to "Synthetic id field for time_series indices (remove feature flag)" Mar 19, 2026
@burqen burqen requested review from fcofdez and tlrx March 19, 2026 13:16
@burqen burqen marked this pull request as ready for review March 19, 2026 13:16
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Collaborator

Hi @burqen, I've created a changelog YAML for you.

@burqen
Contributor Author

burqen commented Mar 19, 2026

Test failures:

Triggering another run

burqen and others added 3 commits March 19, 2026 16:26
Add logging for index settings and node versions
after index creation in ClientYamlTestClient
@burqen
Contributor Author

burqen commented Mar 20, 2026

Analysis of id mismatch failure in MixedClusterClientYamlTestSuiteIT

Summary: Both old nodes die because of a known issue, #143980. With only new nodes left, time_series indices use synthetic id by default, which runs counter to the test expectation. I don't see a way to protect against this, and I think it's reasonable to assume that the test cluster stays intact. If it falls apart, it's because of another bug somewhere.

Full analysis

Relevant test failure

REPRODUCE WITH: ./gradlew ":qa:mixed-cluster:v9.3.3#mixedClusterTest" -Dtests.class="org.elasticsearch.backwards.MixedClusterClientYamlTestSuiteIT" -Dtests.method="test {p0=tsdb/25_id_generation/routing_path matches object}" -Dtests.seed=1EC87079E8768BBC -Dtests.bwc=true -Dtests.locale=az-Cyrl-AZ -Dtests.timezone=Asia/Chita -Druntime.java=25

MixedClusterClientYamlTestSuiteIT > test > test {p0=tsdb/25_id_generation/routing_path matches object} FAILED
    java.lang.AssertionError: java.lang.AssertionError: 
    Expected: "8bgiqW9JKwAyp1bZAAABeRnRGTM"
         but: was "JFvylKRG_nnAwbD-SKLruAAWrvVxXB-xBpIYlWsMx6RFR8fHaX___obmLubMqSK48Q"

This is a mixed cluster test that has two 9.3.3-SNAPSHOT nodes, and two 9.4.0-SNAPSHOT nodes.

For some reason the index uses synthetic id by default (it is not set in tsdb/25_id). The only way this can happen (I think) is if a 9.4.0 node is master and it uses a newer index version (one not supported by 9.3.3). This should not be possible in a mixed cluster, and it would be a pretty serious bug.

My current theory is that the two older nodes have died and the cluster is operating on only the two newer nodes. They would then give IndexVersion.current() to new indices, which would explain the default behavior. This is somewhat supported by many tests failing with java.io.UncheckedIOException: java.net.ConnectException: Connection refused.

4bc666f adds temporary logging to capture the index version and the state of the cluster nodes when creating the index used by tsdb/25_id. (TO BE REVERTED BEFORE MERGE)

EDIT: The test uses synthetic id because both old nodes disconnect from the cluster, leaving only new nodes, which default to synthetic id for time_series indices.
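
The reasoning above can be sketched in a few lines. This is a simplified model and not Elasticsearch code: it assumes that a new index gets the lowest index version supported by any node still in the cluster, and it uses the hypothetical integers 93 and 94 to stand in for the index versions of 9.3.3 and 9.4.0.

```java
import java.util.List;

// Hypothetical model of index version selection in a mixed cluster; not Elasticsearch source.
final class MixedClusterIndexVersionSketch {

    static final int OLD_NODE_VERSION = 93; // stands in for the index version of 9.3.3 (assumed)
    static final int NEW_NODE_VERSION = 94; // stands in for the index version of 9.4.0 (assumed)

    /** Assumption: a new index is created with the lowest index version among the current nodes. */
    static int versionForNewIndex(List<Integer> nodeIndexVersions) {
        return nodeIndexVersions.stream()
            .mapToInt(Integer::intValue)
            .min()
            .orElseThrow(() -> new IllegalArgumentException("cluster has no nodes"));
    }

    /** Synthetic id is the default only when the new index gets the newer index version. */
    static boolean defaultsToSyntheticId(List<Integer> nodeIndexVersions) {
        return versionForNewIndex(nodeIndexVersions) >= NEW_NODE_VERSION;
    }

    public static void main(String[] args) {
        List<Integer> mixed = List.of(NEW_NODE_VERSION, NEW_NODE_VERSION, OLD_NODE_VERSION, OLD_NODE_VERSION);
        List<Integer> onlyNew = List.of(NEW_NODE_VERSION, NEW_NODE_VERSION);
        System.out.println("mixed cluster defaults to synthetic id: " + defaultsToSyntheticId(mixed));
        System.out.println("new-only cluster defaults to synthetic id: " + defaultsToSyntheticId(onlyNew));
    }
}
```

Under this model a single surviving old node is enough to keep the old default, which matches the test expectation only for as long as the mixed cluster stays intact.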

Cluster Timeline — id_generation_test Analysis

v9.3.3-0 and v9.3.3-1 are the "upgraded" nodes — they appear running both 9.3.3 (before upgrade) and 9.4.0 (after restart). v9.3.3-2 and v9.3.3-3 never upgrade.

Setup sequence

| Time | Event |
|---|---|
| 15:52:04 | Cluster forms. v9.3.3-1 elected master. All 4 nodes at 9.3.3 |
| 15:52:06 | v9.3.3-0 disconnects (upgrading) |
| 15:52:33 | v9.3.3-0 rejoins at 9.4.0, wins new election (term 3), becomes master |
| 15:52:34 | v9.3.3-1 disconnects (upgrading) |
| 15:53:01 | v9.3.3-1 rejoins at 9.4.0 |

Index creation cluster membership

| # | Time | Nodes in cluster | Notes |
|---|---|---|---|
| 1 | 15:53:18 | v9.3.3-0 (9.4.0, master), v9.3.3-1 (9.4.0), v9.3.3-2 (9.3.3), v9.3.3-3 (9.3.3) | Full 4-node mixed cluster |
| | 15:54:16 | | v9.3.3-2 disconnects (reason: disconnected) |
| 2–6 | 15:54:48–15:59:01 | v9.3.3-0 (9.4.0, master), v9.3.3-1 (9.4.0), v9.3.3-3 (9.3.3) | 3-node mixed cluster |
| | 16:01:24 | | v9.3.3-3 disconnects (reason: disconnected) |
| 7–15 | 16:16:09–16:21:41 | v9.3.3-0 (9.4.0, master), v9.3.3-1 (9.4.0) | 2-node fully-upgraded cluster |

Key observations

  1. Master is always v9.3.3-0 at 9.4.0 — it wins the election after restart and holds it throughout.
  2. Creation # 1 happens in the fully mixed state (2 old + 2 new nodes). This is the most interesting from an ID generation perspective — both upgraded and non-upgraded nodes are data nodes.
  3. Creations # 2–6 happen in a partially mixed state (1 old node v9.3.3-3 still in cluster).
  4. Creations # 7–15 happen after all old nodes have left — pure 9.4.0 cluster. These are the "fully upgraded" test runs.
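
The grouping of index creations above follows mechanically from the two disconnect timestamps. As a rough illustration, the sketch below classifies a creation time by how many old nodes are still present. The class and method names are hypothetical; only the timestamps come from the logs summarized above.

```java
import java.time.LocalTime;
import java.util.List;

// Hypothetical helper for reading the timeline; only the timestamps come from the test logs.
final class ClusterTimelineSketch {

    // Times at which the two non-upgraded nodes (v9.3.3-2, then v9.3.3-3) left the cluster.
    static final List<LocalTime> OLD_NODE_DISCONNECTS = List.of(
        LocalTime.of(15, 54, 16),
        LocalTime.of(16, 1, 24)
    );

    /** Number of old (9.3.3) nodes still in the cluster at the given time. */
    static long oldNodesPresentAt(LocalTime time) {
        return OLD_NODE_DISCONNECTS.stream().filter(disconnect -> disconnect.isAfter(time)).count();
    }

    /** Labels an index creation by the cluster state it happened in. */
    static String classify(LocalTime creationTime) {
        long oldNodes = oldNodesPresentAt(creationTime);
        if (oldNodes == 2) {
            return "fully mixed";
        }
        if (oldNodes == 1) {
            return "partially mixed";
        }
        return "fully upgraded";
    }

    public static void main(String[] args) {
        for (LocalTime t : List.of(LocalTime.of(15, 53, 18), LocalTime.of(15, 54, 48), LocalTime.of(16, 16, 9))) {
            System.out.println(t + " -> " + classify(t));
        }
    }
}
```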

EDIT 2: Nodes v9.3.3-2 and v9.3.3-3 die because of changes in the serialization of the confidence_interval mapping. It's a known issue:

Full stacktrace:

[2026-03-19T15:54:15,799][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v9.3.3-2] fatal error in thread [elasticsearch[v9.3.3-2][clusterApplierService#updateTask][T#1]], exiting
java.lang.AssertionError: provided source [{"_doc":{"properties":{"another_vector":{"type":"dense_vector","dims":4,"index":true,"similarity":"l2_norm","index_options":{"type":"int4_hnsw","m":16,"ef_construction":100}},"name":{"type":"keyword"},"vector":{"type":"dense_vector","dims":4,"index":true,"similarity":"l2_norm","index_options":{"type":"int4_hnsw","m":16,"ef_construction":100}}}}}] differs from mapping [{"_doc":{"properties":{"another_vector":{"type":"dense_vector","dims":4,"index":true,"similarity":"l2_norm","index_options":{"type":"int4_hnsw","m":16,"ef_construction":100,"confidence_interval":0.0}},"name":{"type":"keyword"},"vector":{"type":"dense_vector","dims":4,"index":true,"similarity":"l2_norm","index_options":{"type":"int4_hnsw","m":16,"ef_construction":100,"confidence_interval":0.0}}}}}]
	at org.elasticsearch.index.mapper.DocumentMapper.<init>(DocumentMapper.java:74) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.index.mapper.MapperService.newDocumentMapper(MapperService.java:644) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.index.mapper.MapperService.updateMapping(MapperService.java:396) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.index.IndexService.updateMapping(IndexService.java:886) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndicesAndUpdateShards(IndicesClusterStateService.java:644) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.doApplyClusterState(IndicesClusterStateService.java:359) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:304) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:605) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:591) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:564) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:493) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:183) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:1046) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:218) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:184) ~[elasticsearch-9.3.3-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614) ~[?:?]
	at java.lang.Thread.run(Thread.java:1474) ~[?:?]

@tlrx
Member

tlrx commented Apr 3, 2026

Triggering a bunch of CI builds
Green 1267fba f3283a1
Red d8c45ff [1] 8300538 107f7fe

Known test failures:

Footnotes

  1. https://github.com/elastic/elasticsearch/issues/145421

@kkrik-es
Member

kkrik-es commented Apr 6, 2026

Let's also add a release highlight note for this.

@fcofdez fcofdez requested a review from kkrik-es April 7, 2026 11:31
@burqen
Contributor Author

burqen commented Apr 7, 2026

This is getting close to ready. The only thing missing now is a green build and a 👍 from you fine Storage engine folk. I'm pinging you @kkrik-es, since you've been involved here lately, but please feel free to forward the review request if you think that is a better fit.

@burqen burqen merged commit 8d84fa9 into elastic:main Apr 7, 2026
37 checks passed
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026
…stic#144184)

* Synthetic id field for time_series indices

This commit removes the feature flag for synthetic id and thereby makes
it officially available in production. The feature is controlled by the
`index.mapping.synthetic_id` setting, which defaults to `true` for new
time_series indices, as long as they use the default codec.

Indices with synthetic _id fields don't store the id physically on
disk. Instead, the id is materialized synthetically from other document
fields. This reduces the on-disk footprint of time_series indices.

* Update docs/changelog/144184.yaml

* Temporary logging for tsdb/25_id

Add logging for index settings and node versions
after index creation in ClientYamlTestClient

* Bump IndexVersion for synthetic id On by default

Bump the IndexVersion that protects the default behavior so that existing
time_series indices keep their current behavior and only new indices,
created with the new index version, use synthetic id by default.

* Remove feature flag added in merge

* Avoid use of setting when feature flag disabled

Only set DISABLE_SEQUENCE_NUMBERS explicitly if the feature flag is
enabled in TSDBSyntheticIdsIT.

* Disable synthetic id in faulty version range

---------

Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>

Labels

>feature :StorageEngine/TSDB You know, for Metrics Team:StorageEngine test-release Trigger CI checks against release build v9.4.0


5 participants