DiskBBQ: num_candidates dynamic visit percentage is woefully low on medium+ sized datasets #142617

@benwtrent

Description

Elasticsearch Version

9.2+

Installed Plugins

No response

Java Version

bundled

OS Version

any

Problem Description

We scale num_candidates into a visit percentage inappropriately: the scaling is too aggressive relative to the data size. It should instead be closer to linear and less dependent on the overall data size. The transformation should also be VERY simple; right now it's pretty complicated.
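To make the contrast concrete, here is a minimal Python sketch. It does not reproduce the current Elasticsearch transformation, which this issue does not show. Both functions, their names, and every constant (`percent_per_candidate`, `min_percent`, `max_percent`) are hypothetical placeholders chosen only to illustrate the difference between a mapping that shrinks with index size and one that is linear in num_candidates.

```python
def size_dependent_percent(num_candidates: int, num_vectors: int) -> float:
    """Hypothetical size-dependent mapping: the share of the index that
    num_candidates vectors represent, as a percentage. It shrinks as the
    index grows."""
    return 100.0 * num_candidates / num_vectors


def linear_percent(
    num_candidates: int,
    percent_per_candidate: float = 0.01,  # placeholder slope, not a real default
    min_percent: float = 1.0,             # placeholder floor
    max_percent: float = 100.0,           # placeholder ceiling
) -> float:
    """Hypothetical simple mapping: linear in num_candidates, clamped to
    [min_percent, max_percent], and independent of index size."""
    scaled = num_candidates * percent_per_candidate
    return max(min_percent, min(max_percent, scaled))


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        print(n, size_dependent_percent(100, n), linear_percent(100))
```

Under these assumptions, the size-dependent form falls below 1% for num_candidates=100 once the index reaches 100k vectors, while the linear form returns the same value regardless of index size.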

Steps to Reproduce

Create a medium-sized index (100k+) and observe that the visit percentage derived from num_candidates is likely < 1%.

Logs (if relevant)

No response

Metadata

Labels

:Search Relevance/Vectors (Vector search)
>bug
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
priority:normal (A label for assessing bug priority to be used by ES engineers)
