
Issues with multithreaded code and CPU dispatching. #65

@alexey-milovidov

Suppose we are calling base64_encode or base64_decode in a loop (for different inputs) and doing it from multiple threads (for different data).

  1. If we pass non-zero flags to these routines, every call writes to a single global variable in the codec_choose_forced function. When many threads do this at once, the repeated writes to the same memory location lead to "false sharing" and poor scalability.

  2. There is no method to pre-initialize the choice of codec. (Actually, there is: we can simply call one of the encode/decode routines in advance with empty input, but it looks silly.) If we don't do that and run our code under ThreadSanitizer, it will complain about a data race on the codec function pointers. In practice it is safe, because each one is a single pointer: one machine word that is (supposedly) placed in an aligned memory location. But we have to annotate it as _Atomic and store/load it with memory_order_relaxed. See the similar issue here: "Make dynamic dispatch free of TSan warnings", simdjson/simdjson#256

  3. Suppose we use these routines in a loop for short inputs. They have a branch to check whether the encoders/decoders were initialized. We want to move these branches out of the loop: check the CPU once and call the specialized implementation directly. But the architecture-specific methods are not exported, so we cannot do that. We also have to pay for two non-inlined function calls.
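
To make points 2 and 3 concrete, here is a minimal sketch of what a TSan-clean dispatch with an explicit pre-initialization hook could look like. It is not the library's code: `encode_fn`, `encode_plain`, `encode_simd`, `cpu_has_simd`, `resolve_encode` and `encode_many` are all hypothetical names, and the two "encoders" only copy bytes so that the example stays self-contained.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical codec signature, not the library's actual type. */
typedef size_t (*encode_fn)(const char *src, size_t len, char *out);

/* Stand-in implementations: they only copy bytes, no real base64 here. */
static size_t encode_plain(const char *src, size_t len, char *out)
{
    memcpy(out, src, len);
    return len;
}

static size_t encode_simd(const char *src, size_t len, char *out)
{
    memcpy(out, src, len);
    return len;
}

/* Placeholder for a real CPU feature check (assumption: always "no"). */
static int cpu_has_simd(void)
{
    return 0;
}

/* One machine word, declared atomic so concurrent access is not a data race.
 * Static storage duration means it starts out as a null pointer. */
static _Atomic(encode_fn) g_encode;

/* Returns the chosen implementation, selecting it on first use.
 * Relaxed ordering is assumed to suffice because only the pointer itself is
 * published; threads that race here all store the same value. */
static encode_fn resolve_encode(void)
{
    encode_fn fn = atomic_load_explicit(&g_encode, memory_order_relaxed);
    if (fn == NULL) {
        fn = cpu_has_simd() ? encode_simd : encode_plain;
        atomic_store_explicit(&g_encode, fn, memory_order_relaxed);
    }
    return fn;
}

/* Caller-side loop: the dispatch check happens once, outside the loop. */
static size_t encode_many(const char *const *inputs, const size_t *lens,
                          size_t n, char *out)
{
    encode_fn fn = resolve_encode();
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += fn(inputs[i], lens[i], out + total);
    return total;
}
```

Calling `resolve_encode()` once at startup would serve as the pre-initialization step, and handing the resulting pointer to the caller removes the per-call initialization branch from the hot loop. Whether the library would want to expose something like this is a separate design question.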

All these issues were found while integrating this library into ClickHouse: ClickHouse/ClickHouse#8397
