Now that we have implemented optimized native scorers for all data types, across our two supported architectures (x64 and ARM64), in both single and "bulk" variants, we need to fine-tune them.
The native functions have internal implementation choices and parameters that can be tuned: bulk size, prefetching, use of different SIMD instructions, specialized implementations for higher-tier hardware (SVE/AVX-512), etc. We also have different bulk algorithms and implementations, and different unrolling levels and mechanisms. We should assess which ones are the most effective and adopt them across the codebase, to make the code more readable, more maintainable, and consistently more efficient.
Related tasks/issues:
Add missing benchmarks/tests:
Consolidate/fix implementations:
Optimizations:
int4 x86 SIMD optimizations #144649