changefeedccl: Improve JSON encoder performance #88064
craig[bot] merged 2 commits into cockroachdb:master
Force-pushed from ab46f61 to 8c9c047.
Full benchmark results:
Force-pushed from 8c9c047 to f7dee74.
HonoreDB left a comment:
Reviewed 2 of 2 files at r1, 11 of 11 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)
Force-pushed from f7dee74 to d7a672a.
Force-pushed from 8ccf155 to e0b2467.
@HonoreDB: This needs a more careful review. In addition, I removed the forced setting of key_in_value when using the bare envelope -- why would you force that?
Force-pushed from fc0c3a5 to 63e48ca.
HonoreDB left a comment:
Reviewed 11 of 13 files at r5, 18 of 23 files at r6, 5 of 5 files at r7, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevendanna)
Force-pushed from 63e48ca to 0d91118.
Add JSON encoder benchmark.
Release note: None
Release justification: test only change
Force-pushed from 9ac0374 to 575f5dd.
Force-pushed from 575f5dd to e819c0c.
Rewrite JSON encoder to improve its performance.
Prior to this change, the JSON encoder was very inefficient.
This inefficiency had multiple underlying causes:
* New Go map objects were constructed for each event.
* Underlying JSON conversion functions had inefficiencies
(tracked in cockroachdb#87968, "tree: Improve performance of tree.AsJSON").
* Conversion of Go maps to JSON incurs the cost
of sorting the keys -- for each row. Sorting,
particularly when rows are wide, has significant cost.
* Each conversion to JSON allocated a new array builder
(to encode keys) and a new object builder; that too has a cost.
* The underlying code structure, while attempting to reuse
code when constructing different "envelope" formats,
caused the code to be less efficient.
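To make the per-row cost in the list above concrete, here is a minimal sketch of the map-per-event pattern. It uses Go's standard `encoding/json` as a stand-in, not CockroachDB's own JSON package; `encodeRowViaMap` and the column names are hypothetical. The standard library sorts map keys on every `Marshal` call, so the sort is repeated for each row even though the column set never changes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRowViaMap is a hypothetical illustration of the slow path: it builds
// a fresh map for every row and lets the marshaler sort the keys each time.
func encodeRowViaMap(columns []string, row []any) ([]byte, error) {
	if len(columns) != len(row) {
		return nil, fmt.Errorf("got %d values for %d columns", len(row), len(columns))
	}
	m := make(map[string]any, len(columns)) // new map allocated per row
	for i, c := range columns {
		m[c] = row[i]
	}
	// encoding/json emits map keys in sorted order, sorting on every call.
	return json.Marshal(m)
}

func main() {
	columns := []string{"name", "id"}
	for _, row := range [][]any{{"alice", 1}, {"bob", 2}} {
		out, err := encodeRowViaMap(columns, row)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```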
This PR addresses all of the above. In particular, since
a schema version for the table is guaranteed to have
the same set of primary key and value columns, we can construct
JSON builders once. The expensive sort operation can be performed
once per version; builders can be memoized and cached.
The performance impact is significant:
* Key encoding is 5-30% faster, depending on the number of primary
key columns.
* Value encoding is 30-60% faster (the slowest case being the "wrapped"
envelope with diff, which effectively encodes 2x values).
* Bytes allocated per row are reduced by over 70%, with the number
of allocations reduced similarly.
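For readers who want to see how a per-row allocation difference of this kind can be observed, here is a small sketch using `testing.AllocsPerRun` from the Go standard library. It is not the benchmark added in this PR and does not reproduce the figures above. The `encodeFresh` and `encodeReused` helpers are hypothetical and only contrast allocating a buffer per row with reusing one.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// sink keeps results reachable so the compiler cannot elide the work.
var sink []byte

// encodeFresh allocates a new buffer for every row.
func encodeFresh(payload string) {
	var buf bytes.Buffer
	buf.WriteString(payload)
	sink = buf.Bytes()
}

// encodeReused writes into a caller-owned buffer that is reset between rows.
func encodeReused(buf *bytes.Buffer, payload string) {
	buf.Reset()
	buf.WriteString(payload)
	sink = buf.Bytes()
}

// allocsPerRow reports the average number of heap allocations per call for
// both strategies.
func allocsPerRow(payload string) (fresh, reused float64) {
	fresh = testing.AllocsPerRun(1000, func() { encodeFresh(payload) })
	var buf bytes.Buffer
	reused = testing.AllocsPerRun(1000, func() { encodeReused(&buf, payload) })
	return fresh, reused
}

func main() {
	fresh, reused := allocsPerRow(`{"id":1,"name":"alice","balance":10.5}`)
	fmt.Printf("fresh buffer: %.0f allocs/row, reused buffer: %.0f allocs/row\n", fresh, reused)
}
```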
Release note (enterprise change): Changefeed JSON encoder
performance improved by 50%.
Release justification: performance improvement
Force-pushed from e819c0c to c281a29.
bors r=honoredb
Build succeeded:
Adding backport-22.2 label; but will not backport to 22.2.0; will wait until at least 22.2.1