benchmark: it is built with /Ob1, so vector algorithm dispatcher is noticeable #4496

@AlexGuteniev

Noticed while working on #4495. When I decided to use sized if constexpr dispatch instead of the same version for all element sizes, I observed significant perf degradation for small element sizes. Part of it is due to the dispatcher not being inlined.
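To illustrate what I mean by a sized if constexpr dispatch, here is a minimal sketch. All names (count_1, count_4, count_8, count_dispatch) are hypothetical, and the plain loops are only stand-ins for the separately compiled vectorized functions; this is not the actual STL code. The dispatcher is a function template that is not marked inline, so as far as I understand /Ob1 it may remain a real call in front of the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Shared scalar loop used by the stand-in kernels below (hypothetical helper).
template <class U>
std::size_t count_loop(const void* first, const void* last, U val) noexcept {
    std::size_t n = 0;
    const U* const end = static_cast<const U*>(last);
    for (const U* p = static_cast<const U*>(first); p != end; ++p) {
        n += (*p == val); // bool converts to 0 or 1
    }
    return n;
}

// Hypothetical per-size kernels. In a real library these would live in a
// separately compiled file and use SIMD; here they are plain loops.
std::size_t count_1(const void* f, const void* l, std::uint8_t v) noexcept {
    return count_loop<std::uint8_t>(f, l, v);
}
std::size_t count_4(const void* f, const void* l, std::uint32_t v) noexcept {
    return count_loop<std::uint32_t>(f, l, v);
}
std::size_t count_8(const void* f, const void* l, std::uint64_t v) noexcept {
    return count_loop<std::uint64_t>(f, l, v);
}

// Sized dispatcher: picks the kernel by element size at compile time.
// Assumption for this sketch: T is an unsigned integer type of size 1, 4, or 8.
template <class T>
std::size_t count_dispatch(const T* first, const T* last, T val) noexcept {
    static_assert(std::is_unsigned_v<T>, "sketch only handles unsigned integers");
    if constexpr (sizeof(T) == 1) {
        return count_1(first, last, static_cast<std::uint8_t>(val));
    } else if constexpr (sizeof(T) == 4) {
        return count_4(first, last, static_cast<std::uint32_t>(val));
    } else {
        static_assert(sizeof(T) == 8, "unsupported element size in this sketch");
        return count_8(first, last, static_cast<std::uint64_t>(val));
    }
}

// Small usage example for the sketch.
void count_dispatch_example() {
    const std::uint8_t bytes[] = {1, 2, 1, 3};
    assert(count_dispatch(bytes, bytes + 4, std::uint8_t{1}) == 2);
}
```

For small inputs the extra call layer is a larger fraction of the total work, which would match the degradation I saw for small element sizes.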

The benchmark is built with /Ob1. It looks like this is implied by the CMake RelWithDebInfo configuration, as opposed to Release.

What are our takeaways?


I see the following options:

  • Mark the vector algorithm dispatchers inline, and consider making other STL functions inline.
    • This helps other projects built with RelWithDebInfo to inline STL code, though it would make stepping through it in the debugger harder to follow
  • Override the option in the benchmark
  • Make the benchmark Release by default, instead of RelWithDebInfo
    • I wouldn't like that. RelWithDebInfo is convenient for profiling
  • Accept the cost of dispatching as a penalty for vector algorithms that use dispatching
    • I don't think this is fair
  • Use specializations instead of if constexpr
    • Compiler throughput?
  • Manually inline the dispatch like for __std_reverse_copy_trivially_copyable...
    • Copypasta
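For comparison, here is a rough sketch combining two of the options above: selecting the kernel through specializations instead of if constexpr, and marking the dispatcher inline. The names (scan, find_traits, find_pos) are hypothetical and the scalar loop stands in for a vectorized kernel; this is an illustration under those assumptions, not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a separately compiled kernel (hypothetical): returns the index
// of the first element equal to val, or n if there is none.
template <class U>
std::size_t scan(const void* first, std::size_t n, U val) noexcept {
    const U* p = static_cast<const U*>(first);
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] == val) {
            return i;
        }
    }
    return n;
}

// Primary template is left undefined, so unsupported sizes fail to compile.
template <std::size_t Size>
struct find_traits;

// One specialization per supported element size.
template <>
struct find_traits<1> {
    using elem = std::uint8_t;
    static std::size_t call(const void* first, std::size_t n, elem val) noexcept {
        return scan<elem>(first, n, val);
    }
};

template <>
struct find_traits<4> {
    using elem = std::uint32_t;
    static std::size_t call(const void* first, std::size_t n, elem val) noexcept {
        return scan<elem>(first, n, val);
    }
};

// Dispatcher explicitly marked inline, so it stays an inlining candidate
// even when only explicitly inline functions are expanded.
// Assumption for this sketch: T is an unsigned integer type of size 1 or 4.
template <class T>
inline std::size_t find_pos(const T* first, std::size_t n, T val) noexcept {
    using traits = find_traits<sizeof(T)>;
    return traits::call(first, n, static_cast<typename traits::elem>(val));
}

// Small usage example for the sketch.
void find_pos_example() {
    const std::uint8_t bytes[] = {1, 2};
    assert(find_pos(bytes, 2, std::uint8_t{2}) == 1);
}
```

Whether the specializations cost more compile time than the if constexpr chain is something I have not measured.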

Labels: fixed (Something works now, yay!), test (Related to test code)
