Add benchmark to test fastCodePointCount #140591

Merged
parkertimmins merged 6 commits into elastic:main from parkertimmins:parker/fast-code-point-benchmark
Mar 9, 2026

Conversation

@parkertimmins
Contributor

Adds a benchmark for fastCodePointCount, which was added in #140388.

@elasticsearchmachine
Collaborator

Hi @parkertimmins, I've created a changelog YAML for you.

@parkertimmins
Contributor Author

Results:

Benchmark                                          (numCodePoints)   (type)  Mode  Cnt    Score    Error  Units
BytesRefCodePointCountBenchmark.elasticsearchSwar                1    ascii  avgt    5    1.852 ±  0.003  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar                1  unicode  avgt    5    1.852 ±  0.004  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar               10    ascii  avgt    5    5.066 ±  0.004  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar               10  unicode  avgt    5    5.008 ±  0.113  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar              100    ascii  avgt    5    9.078 ±  0.004  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar              100  unicode  avgt    5   13.894 ±  0.038  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar             1000    ascii  avgt    5   27.948 ±  0.021  ns/op
BytesRefCodePointCountBenchmark.elasticsearchSwar             1000  unicode  avgt    5   68.097 ±  0.151  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil                1    ascii  avgt    5    1.722 ±  0.005  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil                1  unicode  avgt    5    1.721 ±  0.004  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil               10    ascii  avgt    5    4.827 ±  0.005  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil               10  unicode  avgt    5    5.138 ±  0.026  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil              100    ascii  avgt    5   22.137 ±  0.044  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil              100  unicode  avgt    5   70.340 ±  0.663  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil             1000    ascii  avgt    5  165.873 ±  0.073  ns/op
BytesRefCodePointCountBenchmark.luceneUnicodeUtil             1000  unicode  avgt    5  648.503 ± 23.703  ns/op
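
One way to compare the two implementations is to divide the luceneUnicodeUtil score by the elasticsearchSwar score for each row. The sketch below does that with the scores copied from the table above; the class and method names are illustrative and not part of the PR. A ratio above 1 means the SWAR variant is faster, and a ratio below 1 means it is slower.

```java
import java.util.Locale;

final class SpeedupRatios {
    // Speedup of the SWAR variant relative to the Lucene baseline:
    // baseline time divided by new time (both in ns/op).
    static double speedup(double luceneNsPerOp, double swarNsPerOp) {
        return luceneNsPerOp / swarNsPerOp;
    }

    public static void main(String[] args) {
        // Each row: {numCodePoints, asciiLucene, asciiSwar, unicodeLucene, unicodeSwar}.
        // Scores are the avgt values reported in the results table.
        double[][] rows = {
            {1, 1.722, 1.852, 1.721, 1.852},
            {10, 4.827, 5.066, 5.138, 5.008},
            {100, 22.137, 9.078, 70.340, 13.894},
            {1000, 165.873, 27.948, 648.503, 68.097},
        };
        for (double[] r : rows) {
            System.out.printf(Locale.ROOT, "n=%4d  ascii %.2fx  unicode %.2fx%n",
                (int) r[0], speedup(r[1], r[2]), speedup(r[3], r[4]));
        }
    }
}
```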

The new method is slightly slower on strings of 1 and 10 code points: about 5-8% slower in these runs, except for the 10-code-point unicode case, where it is about 2.5% faster (5.008 vs 5.138 ns/op). On strings of 100 code points, the new version is about 2.4x as fast on ascii and 5x as fast on unicode. On strings of 1000 code points, it is about 6x as fast on ascii and 9.5x as fast on unicode.
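
For context on why the gap widens with length: this thread does not show the implementation of fastCodePointCount, but the benchmark name elasticsearchSwar suggests a SWAR (SIMD within a register) approach. The following is a minimal sketch of that general technique, not the Elasticsearch code. It assumes valid UTF-8 input, and the class and method names are hypothetical. It counts code points as the total byte count minus the number of continuation bytes (pattern 10xxxxxx), checking 8 bytes per loop iteration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SwarCodePointCountSketch {
    // Selects bit 0 of each of the 8 byte lanes in a long.
    private static final long LOW_BITS = 0x0101010101010101L;

    // Counts code points in valid UTF-8 by counting the bytes that are NOT
    // continuation bytes (continuation bytes look like 10xxxxxx).
    static int countCodePoints(byte[] utf8, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(utf8);
        int end = offset + length;
        int continuations = 0;
        int i = offset;
        // Main loop: inspect 8 bytes per iteration. Byte order is irrelevant
        // because only the number of matching lanes is counted.
        for (; i + Long.BYTES <= end; i += Long.BYTES) {
            long w = buf.getLong(i);
            // Bit 0 of a lane is 1 when that byte has bit 7 set and bit 6 clear.
            long cont = (w >>> 7) & (~w >>> 6) & LOW_BITS;
            continuations += Long.bitCount(cont);
        }
        // Tail: fewer than 8 bytes remain, so check one byte at a time.
        for (; i < end; i++) {
            if ((utf8[i] & 0xC0) == 0x80) {
                continuations++;
            }
        }
        return length - continuations;
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo w\u00f6rld \u20ac\uD83D\uDE00";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // The two numbers printed should agree for valid UTF-8.
        System.out.println(countCodePoints(bytes, 0, bytes.length));
        System.out.println(text.codePointCount(0, text.length()));
    }
}
```

In a sketch like this, inputs shorter than 8 bytes never enter the word-at-a-time loop, which is consistent with the small inputs above showing no gain, while long inputs spend nearly all their time in the 8-byte loop.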

@parkertimmins parkertimmins marked this pull request as ready for review January 13, 2026 17:40
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Member

@martijnvg martijnvg left a comment

LGTM

@parkertimmins parkertimmins enabled auto-merge (squash) January 14, 2026 17:09
@parkertimmins parkertimmins merged commit 462a4c1 into elastic:main Mar 9, 2026
35 checks passed
@parkertimmins parkertimmins deleted the parker/fast-code-point-benchmark branch March 9, 2026 15:04
parkertimmins added a commit to parkertimmins/elasticsearch that referenced this pull request Mar 9, 2026
parkertimmins added a commit that referenced this pull request Mar 9, 2026
4 participants