make most of simdutf constexpr #868
This is great, brilliant even! Yes, having distinct implementations is not great (it makes testing more difficult and so forth).
Hmmm... what do you think could be a blocker? They are non-allocating in general. So they should be straight out immediately easy to turn into
Fair point. But maybe we can narrow it down to a few functions that are too messy in the scalar namespace. So maybe we could do something like this...
I made partial progress:
I was able to work around this by making a std::vector with a copy of the data:

    const uint8_t *data{nullptr};
    std::vector<uint8_t> tmp(buf, buf + len);
    if consteval {
      data = tmp.data();
    } else {
      data = reinterpret_cast<const uint8_t *>(buf);
    }

This kind of works, but I am going to do some more experimentation.
@pauldreik Are you sure?
Gemini says: N4950, Section 7.7 "Constant expressions", paragraph 5, item (5.15) states that an expression E is not a core constant expression if its evaluation would involve "a reinterpret_cast (7.6.1.10)".
Yes, you also have to use the function to see it: https://godbolt.org/z/7GeKb7r76
@pauldreik Using AI and a bit of C++ knowledge, I wrote a wrapper that seems to work: https://godbolt.org/z/n86945sa4
The idea goes like this:

    #include <bit>      // std::bit_cast
    #include <compare>  // defaulted operator<=>
    #include <cstddef>  // std::ptrdiff_t
    #include <cstdint>  // uint8_t

    // Pointer-like wrapper: walks a from* but yields values converted to `to`.
    template <typename to, typename from>
      requires (sizeof(to) == sizeof(from))
    struct constexpr_ptr {
      from* p;
      constexpr constexpr_ptr() noexcept = default;
      constexpr constexpr_ptr(from* ptr) noexcept : p(ptr) {}
      constexpr to operator*() const noexcept { return std::bit_cast<to>(*p); }
      constexpr constexpr_ptr& operator++() noexcept { ++p; return *this; }
      constexpr constexpr_ptr operator++(int) noexcept { auto old = *this; ++p; return old; }
      constexpr constexpr_ptr& operator--() noexcept { --p; return *this; }
      constexpr constexpr_ptr operator--(int) noexcept { auto old = *this; --p; return old; }
      constexpr constexpr_ptr& operator+=(std::ptrdiff_t n) noexcept { p += n; return *this; }
      constexpr constexpr_ptr& operator-=(std::ptrdiff_t n) noexcept { p -= n; return *this; }
      constexpr constexpr_ptr operator+(std::ptrdiff_t n) const noexcept { return p + n; }
      constexpr constexpr_ptr operator-(std::ptrdiff_t n) const noexcept { return p - n; }
      constexpr std::ptrdiff_t operator-(const constexpr_ptr& o) const noexcept { return p - o.p; }
      constexpr to operator[](std::ptrdiff_t n) const noexcept { return std::bit_cast<to>(*(p + n)); }
      constexpr auto operator<=>(const constexpr_ptr&) const noexcept = default;
    };
    template <typename to, typename from>
    constexpr constexpr_ptr<to, from> constexpr_cast_ptr(from* p) noexcept {
      return p;
    }

Then using it like so:

    constexpr char g(const uint8_t* c) {
      return constexpr_cast_ptr<char>(c)[0];
    }

This seems to work. I wonder why this is not in the standard library? (When I indicate that it is made by AI, it is understood that it is not to be trusted. Just a demo.)
@pauldreik I think that such an abstraction should be zero (runtime) cost, so we could use it for all our casts.
Interesting! I will give it a try.
I have now implemented constexpr support for a few randomly selected functions and I think it definitely is doable. I also think the changes needed are quite reasonable, and the runtime performance should not be affected at all. It would be nice to hear from someone who would actually benefit from this feature!
I am working on this (slowly). It goes in the right direction!
This makes most of simdutf constexpr. It relies on the if consteval feature from C++23 and is used within the span API.
See issue #865
Overall description
The scalar implementation has been slightly modified to be able to run as constexpr. The scalar implementation is already tested and proven. It is also differentially fuzzed against the other implementations. It was therefore better to "promote it" to constexpr instead of writing a new, untested implementation.
It is best described through an example. Here is an example of a changed function that is part of the public API, where the dispatch happens at compile time and selects either the constexpr version or the normal, performant code:
It can be seen that the normal, SIMD-accelerated path works just as before. During constant evaluation, the scalar implementation is used. The scalar implementation looks like this:
So the scalar implementation mostly behaves as before, but when it is evaluated at compile time, the problematic optimizations involving memcpy and reinterpret_cast are avoided.
Header reorganization
Making the scalar implementations available in the header required quite a bit of code/file reorganization. It makes the header bigger, but it does not seem to hurt compilation time.
Compile time performance
As measured by the added scripts/compilation_benchmark.py, which does a release build and takes the median of three runs:
So there is a slight increase in compilation time.
Runtime performance
The runtime is not expected to change at all: the additional dispatch should already be optimized away at compile time.
What is not constexpr?