Conversation
|
Is it an attempt to re-use the parser in these situations? for (auto val : from(json_string_1)) { ... }
for (auto val : from(json_string_2)) { ... }Because this would be cool, but wish we wouldn't need to allocate it on the heap in the first place though, right? Maybe a thread local |
|
I like this! One thing that I miss is having some benchmark/perf number to we can assess how much this impacts performance. (Given some analysis that I've done locally, I definitely expect this will speed-up things, but I am still curious about how much) |
FranciscoThiesen
left a comment
There was a problem hiding this comment.
Great work (that will likely lead to significant perf improvement)
the-moisrex
left a comment
There was a problem hiding this comment.
Much of parser itself is already on the heap; this will try to remove this one additional indirection.
|
|
||
| return parser_instance; | ||
| } | ||
|
|
There was a problem hiding this comment.
simdjson_inline simdjson_warn_unused std::optional<ondemand::parser> & parser::get_parser_instance() {
thread_local std::optional<ondemand::parser> parser_instance;
// Si l'instance n'existe pas encore, on la crée
if (!parser_instance.has_value()) {
parser_instance = std::make_optional<ondemand::parser>();
}
return parser_instance;
}| #endif // SIMDJSON_CONDITIONAL_INCLUDE | ||
|
|
||
| #include <memory> | ||
| #include <thread> |
There was a problem hiding this comment.
That's C++17. We compile as C++11.
| private: | ||
| friend bool release_parser(); | ||
| friend ondemand::parser& get_parser(); | ||
| static simdjson_inline simdjson_warn_unused std::unique_ptr<ondemand::parser> & get_parser_instance(); |
There was a problem hiding this comment.
static simdjson_inline simdjson_warn_unused std::optional<ondemand::parser> & get_parser_instance();
The parser instance itself is small. But it will then allocate quite a bit of memory and that's what we worry about. |
Yes, that's why I suggest It's just one less allocation, that's all. |
|
We can do the same thing portably with a union and a bool. |
|
@the-moisrex I'll make a comment with your name, pointing out that it could be implemented differently. It is an internal matter, so we can change it later. |
|
@the-moisrex @FranciscoThiesen We now have a rather complete benchmark. Unfortunately, the approach with ranges comes with a significant performance penalty. I think it is probably fine to push it but with a warning that is will slow things down. The thread-local parser is pretty much free. |
|
Here are the numbers on an Apple system: The last two use the new ranges API. The performance difference is quite obvious. It is about 2000 instructions of a penalty to parse a single 'Car' instance. I would say that it is quite bad... But as long we warning users off that it comes with a significant penalty, I think it is ok. |
|
@the-moisrex @FranciscoThiesen I have removed the C++20 ranges examples from our documentation and I have marked it as experimental. For the time being, until we can determine why it cost thousands of instructions per value in overhead, I would prefer that our users do not touch it. |
|
Great work @lemire. I ran the benchmarks a bit on a noisy system, there are a few differences I can see between the classic way and the ranges way.
If we remove the /// Make it an adaptor
template <std::ranges::range Range>
auto operator()(Range &&rng) const noexcept {
return std::forward<Range>(rng) | no_errors | std::views::transform(*this);
}This code show cases 2 of the 3 problems. For the second point it shows that we have to construct this For the first point, there's an extra if statement of error handling in We could maybe fix the API by doing something like this: for (Car car : (from | no_errors | as<Car>)(json_string)) { ... }
// or maybe this:
for (Car car : from | no_errors | as<Car>(json_string))) { ... }
// or maybe this (but I'm not sure if it's possible cleanly):
for (Car car : no_errors | as<Car>(json_string)) { ... }
// Maybe even this:
for (Car car : from | no_errors | to<Car> | json_string) { ... }
// Or this:
for (Car car : (from | no_errors | to<Car>) << json_string) { ... }
// Or this:
for (Car car : (from | no_errors | to<Car>) + json_string) { ... }
// This looks ugly, and I'm not sure about this:
for (Car car : from | no_errors | to<Car> <<= json_string) { ... }
for (Car car : from | no_errors | to<Car> = json_string) { ... }
// But we can do these two without needing the parenthesis, but it don't look great:
// Operator Precedence doesn't leave us many choices.
for (Car car : from | no_errors | to<Car> && json_string) { ... }
for (Car car : from | no_errors | to<Car> || json_string) { ... }The point is to mark the But my system is not a good system to do benchmarks, so it might be a different thing. Also, why preferring Also I got a segfault when I ran the benchmarks in debug mode, but I couldn't reproduce it. There might be a silent race-condition in the benchmarks or something. |
Because otherwise, you will get |
If you have a linux or macos system, and you run the benchmark in privilege mode (i.e., 'sudo') then you should see the performance counters (i.e., the number of instructions). Even if you machine is noisy, the number of instructions should be a telling tale. If you share with me a public key, I can give you access to a quiet Linux server.
Here, importantly, we have a per-element overhead that's very large (i.e., about 2000 instructions). |
|
The ranges work is not ready. I don't think we should publish it. We have insufficient testing and benchmarks. It should not be slower on a per-element basis. It is fine if setting up the loop takes a few bit of extra work, but for long loops, the cost should be amortized. |
for(Car car : simdjson::from(json) | simdjson::to<Car>) We now have 3 versions, this will get confusing for users pretty quick: template <typename T>
static constexpr to_adaptor<T> to{};
static constexpr to_adaptor<> from{};
template <typename T = void>
using as = to_adaptor<T>;
I agree. Ranges do have a cost, and also compilers still have a hard time optimizing them, and utilities like Let me try a few ideas see if we can match the instructions counts. |
|
If I haven't done so, I am removing 'as'. Note that the standard provides the following: auto vec = std::views::iota(1, 5)
| std::views::transform([](int v){ return v * 2; })
| std::ranges::to<std::vector>();Specifically |
|
@the-moisrex Verified, 'as' has been removed:
We still have the |
|
If the conflict in name can be confusing for users, then I suggest renaming I have this so far: for(Car car : auto_parser<>{} | simdjson::to<Car> | json) {
dummy = dummy + car.year;
}But modifying Calling Let me think of something else. |
This PR will not include those. We just keep |
|
@the-moisrex To be clearer, we have |
|
@the-moisrex @FranciscoThiesen I am merging this PR. I think that the thread-local parser is a good step. We also remove ranges for now, but make it easier to add them later. |
|
Merged. |

This modifies @the-moisrex's code slightly so that we can default on a thread-local parser. This should drastically improve performance in some cases.