Decrease TieredMergePolicy's default number of segments per tier to 8. #14823

Merged: jpountz merged 6 commits into apache:main from jpountz:TieredMergePolicy_8_segs_per_tier on Jun 30, 2025

Conversation

@jpountz (Contributor) commented Jun 20, 2025:

`TieredMergePolicy` currently allows 10 segments per tier. With Lucene being increasingly deployed with separate indexing and search tiers that get updated via segment-based replication, I believe that it would make sense for Lucene to have more aggressive merging defaults: a price that is only paid once on the indexing tier, but that benefits all search nodes that serve queries for this index.

Note that this is still a somewhat conservative default; applications with low latency requirements and low update rates will likely want to go even further, with 4 segments per tier, or even 2.

`BaseMergePolicyTestCase#testSimulateAppendOnly` reports a write amplification increase from 3.4 to 3.8, while `BaseMergePolicyTestCase#testSimulateUpdates` reports a write amplification increase from 4.5 to 4.9. In exchange, the number of segments between the floor and max segment sizes decreases by about 20%.

This should especially help queries that have a high per-segment overhead: PK lookups, point queries, multi-term queries and vector searches.

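The write-amplification/segment-count tradeoff described above can be sketched with a toy, self-contained simulation. This is not Lucene's actual `TieredMergePolicy` (which also considers deletes, the floor segment size and the maximum merged segment size), and the class name, flush counts and resulting numbers are illustrative assumptions only: unit-size segments are flushed, and a tier is merged into the next one whenever it accumulates `segsPerTier` segments.

```java
/**
 * Toy model of tiered merging (NOT Lucene's actual TieredMergePolicy):
 * unit-size segments are flushed, and whenever a tier accumulates
 * segsPerTier segments they are merged into one segment on the next tier.
 */
public class TieredMergeToy {

    /** Total units written (flushes + merge rewrites) divided by units flushed. */
    public static double writeAmplification(int flushes, int segsPerTier) {
        final int tiers = 16;
        long[] size = new long[tiers];   // size[t] = segsPerTier^t units
        int[] counts = new int[tiers];   // live segments per tier
        size[0] = 1;
        for (int t = 1; t < tiers; t++) {
            size[t] = size[t - 1] * segsPerTier;
        }
        long written = 0;
        for (int i = 0; i < flushes; i++) {
            counts[0]++;
            written += size[0];          // cost of the flush itself
            int t = 0;
            while (t + 1 < tiers && counts[t] == segsPerTier) {
                counts[t] = 0;           // merge this tier...
                counts[t + 1]++;         // ...into one segment on the next
                written += size[t + 1];  // cost of rewriting the merged data
                t++;
            }
        }
        return (double) written / flushes;
    }

    /** Live segments after the given number of flushes: the digit sum of
     *  flushes in base segsPerTier, by the carry arithmetic above. */
    public static int segmentCount(int flushes, int segsPerTier) {
        int total = 0;
        for (int n = flushes; n > 0; n /= segsPerTier) {
            total += n % segsPerTier;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int segsPerTier : new int[] {10, 8, 4}) {
            System.out.printf("segsPerTier=%d: write amplification %.2f, live segments %d%n",
                segsPerTier,
                writeAmplification(123_456, segsPerTier),
                segmentCount(123_456, segsPerTier));
        }
    }
}
```

In this toy model the live segment count after N flushes is exactly the digit sum of N in base `segsPerTier`, so a smaller `segsPerTier` keeps the searcher-visible segment count lower on average, at the cost of more rewriting during merges; the same direction as the numbers the PR's tests report, though the magnitudes differ.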
@github-actions (bot) commented:

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@mikemccand (Member) commented:

+1, this is a great idea -- more aggressive merging by default makes sense.

@mikemccand (Member) left a review comment:

Thanks @jpountz

@jpountz (Contributor, Author) commented Jun 20, 2025:

Thanks @mikemccand! I'll wait a few days before merging to give others a chance to take a look.

@vigyasharma (Contributor) left a review comment:

+1 for this change. It gives us slightly slower indexing but still within reasonable defaults.


```diff
 private long floorSegmentBytes = 16 * 1024 * 1024L;
-private double segsPerTier = 10.0;
+private double segsPerTier = 8.0;
```
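For applications that want a different value than the default (in either direction), the knob is settable on the policy itself. A minimal configuration sketch, assuming a recent Lucene and an `analyzer` already in scope (variable names are illustrative):

```java
TieredMergePolicy mergePolicy = new TieredMergePolicy();
// Per the PR description, low-latency / low-update-rate applications
// may want to go below the new default of 8, e.g. 4 segments per tier.
mergePolicy.setSegmentsPerTier(4.0);

IndexWriterConfig config = new IndexWriterConfig(analyzer);
config.setMergePolicy(mergePolicy);
```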
A contributor commented on the diff:

It seems the default value in the JavaDoc for the setSegmentsPerTier method might be outdated?

@jpountz (Contributor, Author) replied:

Thanks for catching this!

@rmuir (Member) commented Jun 24, 2025:

My only concern is that it will have a really serious impact on the HNSW indexing/merging.

I know this indexing performance part isn't as fun/sexy as speeding up the queries, but not everyone has thousands of QPS: the queries could probably be slower. It could probably have 1% shittier recall and 10x faster indexing. I'm concerned wrong tradeoffs are made, and I feel like instead a lot of users suffer from the inefficient indexing. This will make that pain worse.

I don't think it should block the change but we should keep an eye on this.

@rmuir (Member) commented Jun 24, 2025:

Here's the trend of HNSW indexing performance in the nightly bench:

[Screenshot: nightly-benchmark chart of HNSW indexing throughput over time, 2025-06-24]

You can see why I think wrong tradeoffs are being made. It is trending in the wrong direction and everyone is focused instead on making 1000QPS into 1200QPS or whatever.

@jpountz (Contributor, Author) commented Jun 25, 2025:

To be fair, this chart suggests a quite dramatic degradation over time, but these big drops are mostly due to the benchmark becoming harder by increasing the number of dimensions of vectors (annotation FK) or changing the model (annotation GT), and there are some speedups hidden among these big drops such as a merging speedup (annotation IE).

That said, I agree with your point, we want Lucene to work well on near-realtime use-cases that have high update rates as well, so we can't afford to make indexing and merging absurdly slow just for the sake of having slightly faster search.

@rmuir (Member) commented Jun 25, 2025:

> but these big drops are mostly due to the benchmark becoming harder by increasing the number of dimensions of vectors (annotation FK) or changing the model (annotation GT), and there are some speedups hidden among these big drops such as a merging speedup (annotation IE).

I don't think that statement is accurate. The first big drop is annotation DW: LUCENE-10109, "Increase default beam width from 16 to 100."

This one explicitly makes indexing more than 3x slower (from 409 GB/hr -> 140.7 GB/hr). The commit message states "This new default tries to balance good search performance with indexing speed."

Is the balance a good one?

@jpountz (Contributor, Author) commented Jun 25, 2025:

I'll need @benwtrent or @msokolov to provide an educated answer to this question, but the issue mentions that the previous value also had worse recall.

@msokolov (Contributor) commented:

The right default value for beam width is a little hard to say. Values reported in public vary wildly. Try asking Gemini what is the right default and it refuses to be pinned down! Unfortunately it depends - on the application setup (does it have a fancy second-stage ranker, or do we rely on the vector scores directly for ranking), on the vector data set (its dimensions, the size of the index). We're finding that "high quality" vectors might require lower values for these exploration parameters, and applications that gather lots of semantic hits and then later re-rank them might be less sensitive to lower recall and can tolerate cheaper indexing settings. You can see that these sources recommend values like 16, 64, 128, and 256:

Maybe a good rule of thumb would be to pick the place where the slope of these latency/recall curves is 45 degrees? But of course that will depend on the units of the chart.

My view is that this is something we could detune a bit, at least as a default; maybe we could retreat to 64? But anyway, this issue is about merge policy segment counts, so we should probably open a new issue if we want to change that.
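For context on the knob under discussion: the construction beam width is a parameter of the vectors format, so retreating it to 64 (whether as a project default or per application) would look roughly like the sketch below. This assumes a Lucene version where `Lucene99HnswVectorsFormat(int maxConn, int beamWidth)` is available (9.9+/10.x); it is an illustration, not a change that was made here.

```java
// maxConn left at the default of 16; beamWidth (ef_construction)
// lowered from the default of 100 to 64: cheaper graph construction,
// likely at some cost in recall.
KnnVectorsFormat hnswFormat = new Lucene99HnswVectorsFormat(16, 64);
```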

@rmuir (Member) commented Jun 25, 2025:

Yeah I'm just commenting on overall indexing-vs-search speed perf. As I mentioned above, it isn't intended to block this issue at all: I just hope we are working to have good defaults that take indexing speed into consideration, as it is very slow for HNSW: I imagine this issue will make that even worse.

@benwtrent (Member) commented:

> Increase default beam width from 16 to 100.

16 is a tiny ef_construction/beamWidth. I would expect graph quality to be pretty poor once reaching a non-trivial amount of vectors.

I would argue the previous numbers (tiny beamwidth or tiny vectors), don't reflect a real use-case.

> there are some speedups hidden among these big drops such as a merging speedup (annotation IE).

While we have improved merge times, there is likely more ground to be recovered in making overall HNSW indexing speed better. My gut still tells me we are doing something silly in our diversity checking, doing too much unnecessary work in building the HNSW graph...

@jpountz merged commit 7adb5d6 into apache:main on Jun 30, 2025 (8 checks passed).
@jpountz deleted the TieredMergePolicy_8_segs_per_tier branch on June 30, 2025 at 11:57.
jpountz added a commit that referenced this pull request on Jun 30, 2025 (#14823).
