Speed up toXContent Collection Serialization in some Spots#78742
Speed up toXContent Collection Serialization in some Spots#78742original-brownbear merged 5 commits intoelastic:masterfrom original-brownbear:faster-serialize-collections
Conversation
Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`). This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks.
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
DaveCTurner
left a comment
There was a problem hiding this comment.
LGTM
How much does this save us out of interest?
|
|
||
| public XContentBuilder xContentList(String name, ToXContent... values) throws IOException { | ||
| startArray(name); | ||
| for (ToXContent value : values) { |
There was a problem hiding this comment.
Maybe check for non null array here
There was a problem hiding this comment.
Huh right TIL, I I always figured I'd get [null] here if passed a null but turns out you actually get the null :)
| // Maps & Iterable | ||
| ////////////////////////////////// | ||
|
|
||
| public XContentBuilder stringListField(String name, Collection<String> values) throws IOException { |
There was a problem hiding this comment.
Nit: maybe just array(String name, Collection<String> values) ? Same remark for the other new methods.
There was a problem hiding this comment.
Can't do that unfortunately because we have:
public XContentBuilder field(String name, Iterable<?> values) throws IOException {
return field(name).value(values);
}
which collides with it. Same with the other cases, these super generic ? or Object type methods collide with everything.
There was a problem hiding this comment.
Oh right, I did not see this one. Let's keep like it is then.
|
Thanks David & Tanguy!
It's not entirely trivial to isolate the effect because this changes the profile quite a bit and we seemingly also see inlining in more spots now, but I'd say we're at roughly O(5%) savings + I have a follow-up based on this in the pipeline that should have a bigger impact :) |
…78755) Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`). This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks.
|
@original-brownbear is this related to #26907 as well? |
|
@dliappis yea to some degree for sure. A good chunk of the speedup here is from simply not having to do the self-reference check now in the changed scenarios. |
Found this when benchmarking large cluster states. When serializing collections we'd mostly
not take any advantage of what we know about the collection contents (like we do in
StreamOutput).This PR adds a couple of helpers to the x-content-builder similar to what we have on
StreamOutputto allow for faster serializing by avoiding the writer lookup and some self-reference checks and uses them
in obvious spots I could quickly identify in profiling or via the IDE.
relates #77466