add static reflection benchmark to 'large random' benchmark and allows deserialization (with static reflection) from objects and arrays. #2349

Merged
lemire merged 1 commit into json_builder_init from json_builder_init_with_large_random
Mar 17, 2025

@lemire lemire commented Mar 17, 2025

Unfortunately, the performance is not where I'd like it to be. Static reflection adds an overhead in the 'large random' benchmark that I cannot yet explain.

While toying with this benchmark, I realized that we were not yet able to turn an array or an object into a custom type automatically. This PR also fixes that, along with various small typos.

deserialization (with static reflection) from objects and arrays.
@lemire lemire requested a review from FranciscoThiesen March 17, 2025 19:57
@FranciscoThiesen
Member

LGTM! How much of an overhead did you see in practice for static reflection in large random?

lemire commented Mar 17, 2025

@FranciscoThiesen

It is quite significant:

$ ./buildreflect/benchmark/bench_ondemand --benchmark_filter=large_random
2025-03-17T22:01:48+00:00
Running ./buildreflect/benchmark/bench_ondemand
Run on (128 X 2000 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x64)
  L1 Instruction 32 KiB (x64)
  L2 Unified 1280 KiB (x64)
  L3 Unified 49152 KiB (x2)
Load Average: 0.10, 0.09, 0.09
simdjson::dom implementation:      icelake
simdjson::ondemand implementation (stage 1): icelake
simdjson::ondemand implementation (stage 2): fallback
Creating a source file spanning 44921 KB 
---------------------------------------------------------------------------------------------------------------------
Benchmark                                                           Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------------------------
large_random<simdjson_ondemand>/manual_time                  51298485 ns     54886540 ns           14 best_branch_miss=592.582k best_bytes_per_sec=899.04M best_cache_miss=790.891k best_cache_ref=3.46504M best_cycles=162.186M best_cycles_per_byte=3.52584 best_docs_per_sec=19.5447 best_frequency=3.16987G best_instructions=516.401M best_instructions_per_byte=11.2263 best_instructions_per_cycle=3.18401 best_items_per_sec=19.5447M branch_miss=592.955k bytes=45.9991M bytes_per_second=855.155M/s cache_miss=805.879k cache_ref=3.46556M cycles=162.545M cycles_per_byte=3.53366 docs_per_sec=19.4938/s frequency=3.16862G/s instructions=516.401M instructions_per_byte=11.2263 instructions_per_cycle=3.17697 items=1000k items_per_second=19.4938M/s [BEST: throughput=  0.90 GB/s doc_throughput=    19 docs/s instructions=   516400780 cycles=   162185673 branch_miss=  592582 cache_miss=  790891 cache_ref=   3465044 items=   1000000 avg_time=  51298485 ns]
large_random<simdjson_ondemand_static_reflect>/manual_time   79790203 ns     83347196 ns            9 best_branch_miss=620.639k best_bytes_per_sec=579.913M best_cache_miss=814.165k best_cache_ref=3.4717M best_cycles=251.971M best_cycles_per_byte=5.47774 best_docs_per_sec=12.607 best_frequency=3.17661G best_instructions=727.402M best_instructions_per_byte=15.8134 best_instructions_per_cycle=2.88684 best_items_per_sec=12.607M branch_miss=620.426k bytes=45.9991M bytes_per_second=549.794M/s cache_miss=825.37k cache_ref=3.47252M cycles=253.382M cycles_per_byte=5.50842 docs_per_sec=12.5329/s frequency=3.17561G/s instructions=727.402M instructions_per_byte=15.8134 instructions_per_cycle=2.87077 items=1000k items_per_second=12.5329M/s [BEST: throughput=  0.58 GB/s doc_throughput=    12 docs/s instructions=   727401824 cycles=   251971213 branch_miss=  620639 cache_miss=  814165 cache_ref=   3471699 items=   1000000 avg_time=  79790202 ns]

lemire commented Mar 17, 2025

I'll merge this PR.

@lemire lemire merged commit 846ae99 into json_builder_init Mar 17, 2025
139 checks passed
@lemire lemire deleted the json_builder_init_with_large_random branch March 17, 2025 22:02
lemire added a commit that referenced this pull request Jul 14, 2025
* Initial work on JSON builder

* moving the files back to ondemand for now.

* tweak

* more later

* update

* minor edits

* dropping vs arm (missing support)

* adding tests. we still specialized write_string_escaped

* tweaking

* fix typo

* tweaking the approach

* minor fix

* missing store

* another missing store

* Attempt at fixing failing serialization tests. (#2292)

* Fixing appeand_float typo (#2294)

* applying a couple of fixes

* updating single header

* fix for pre C++17 if constexpr

* Fixing unused argument problem and updating the singleheader file

* various pedantic fixes

* Sketch of builder

* reordering.

* simplify

* Adding draft of static reflection based deserialization

* Updating simdjson singleheader

* patching the automated deserialization.

* automated

* Adding support for smart pointers of user defined types.

* Adding specialization for smart pointers for basic types. I think it is highly likely that this can be done in a more generic way.

* Referencing a later version of rapidjson that fixed the issue related to an attempted assignment to a const variable in the GenericStringRef class.

* guarding the tests

* adding documentation for string_builder

* saving

* rename to 'append'

* saving

* non-functional benchmarks (#2342)

* non-functional benchmarks

* Fix typo

* various fixes

* tweaking

---------

Co-authored-by: Daniel Lemire <dlemire@lemire.me>
Co-authored-by: Francisco Geiman Thiesen <franciscogthiesen@gmail.com>

* tuning

* various minor fixes

* minor tweak

* minor simplification

* updating amal

* adding a cast

* update

* fancy casting

* removing dead code

* Pushing latest changes. CITM benchmark is still not working.

* Still not working, but now I am getting only 10 errors.

* add static reflection benchmark to 'large random' benchmark and allows (#2349)

deserialization (with static reflection) from objects and arrays.

Co-authored-by: Daniel Lemire <dlemire@lemire.me>

* Removing std::map from CitmCatalog definition, since that is not currently supported.

* Added free to rust bench, segfault is still happening..

* The syntax changed: ^E became ^^E. (#2350)

* The syntax changed: ^E became ^^E.

* guarding

---------

Co-authored-by: Daniel Lemire <dlemire@lemire.me>

* Adding support for string_view_keyed_map types.

* Adding concepts as a conditional include.

* updating single-header

* Adding concepts to ondemand deps

* rust benchmark is finally working

* Fixing small typo in docs.

* adding docker config and instructions so that our users can test the static reflection (#2358)

* adding docker config and instructions so that our users can test the static reflection

* completing the instructions

* pruning white spaces

---------

Co-authored-by: Daniel Lemire <dlemire@lemire.me>

* minor optimizations on the JSON builder branch

* avoiding undef behaviour

* saving

* somewhat nicer builder

* make it possible to run just one benchmark

* adding linux perf

* fixing minor issue

* updating swar

* Adding real world compilation benchmark (#2379)

* Adding compilation benchmark for json parsing with and without reflection

* Moving it to the benchmark folder, also reducing a bit the number of iterations.

* Removing script from root folder.

* Reducing number of iterations

* Update benchmark/benchmark_reflection_usage_compilation.sh

Co-authored-by: Daniel Lemire <daniel@lemire.me>

* Update benchmark/benchmark_reflection_usage_compilation.sh

Co-authored-by: Daniel Lemire <daniel@lemire.me>

* Update benchmark/benchmark_reflection_usage_compilation.sh

Co-authored-by: Daniel Lemire <daniel@lemire.me>

* Making the script more customizable and also test whether the compiler being used supports reflection before actually running the benchmark

---------

Co-authored-by: Daniel Lemire <daniel@lemire.me>

* Using define_static_string from  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3491r2.html (#2389)

* Applying changes needed after latest reflection paper updates.

* Working, but no template for yet.

* Updating single-header to include the use of define_static_string.

* copying over master

---------

Co-authored-by: Daniel Lemire <dlemire@lemire.me>
Co-authored-by: Francisco Geiman Thiesen <franciscogthiesen@gmail.com>