[Python] Improve python builder performance #8766
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
I'll take a crack at this one too, but it might help to get some additional eyes. @fliiiix, if you have some spare time I'd appreciate the help!
I had a look and closed it again 🙈 It would probably have been easier to review if the changes were split into separate commits, but I will give it another look if I find time tomorrow.
OK, I have split this into 3 parts that build on each other. I suggest reviewing and merging just the first, then rebasing and reviewing the second, and so on, so that you see just one new commit's worth of changes at a time. I actually removed some of the optimizations that seemed to produce extremely minimal gain, so the group of three is less code overall. Feel free to close this if you prefer the phased approach.
Prefer the 3 commit approach |
Builder startup work is cheaper: StartObject now seeds vtable state with [0] * numfields, and the sharedStrings dict is created lazily, which speeds up objects with no strings
Offset/Pad/Prep all work off cached head/buffer lengths and zero-fill via slices
Prepend now handles alignment + byte writes in one pass
Vtable write is batched: WriteVtable gathers all field offsets plus metadata and streams them
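For readers who have not looked at the diff, a few of the ideas listed above can be sketched roughly as follows. This is a toy illustration only: `ToyBuilder` and its methods are hypothetical names made up for this example and are not the flatbuffers `Builder` API or the code in this PR.

```python
class ToyBuilder:
    """Hypothetical stand-in for a builder; NOT the flatbuffers.Builder API."""

    def __init__(self, size=64):
        self.buf = bytearray(size)
        self.head = size            # the buffer is filled from the end towards the start
        self.vtable = None
        self.shared_strings = None  # assumption: created lazily on first use

    def start_object(self, numfields):
        # One list multiplication instead of appending a zero numfields times.
        self.vtable = [0] * numfields

    def pad(self, n):
        # Zero-fill n bytes with a single slice assignment instead of a per-byte loop.
        if n > self.head:
            raise ValueError("not enough room left in the buffer")
        new_head = self.head - n
        self.buf[new_head:self.head] = bytes(n)
        self.head = new_head

    def shared_string_offset(self, s, offset):
        # Allocate the dict only when a shared string is actually requested.
        if self.shared_strings is None:
            self.shared_strings = {}
        return self.shared_strings.setdefault(s, offset)
```

The point of the sketch is that each change swaps per-element Python-level work for a single bulk operation, or skips an allocation that many objects never need.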
Performance is about 1.6x faster: old time 63.0 us, new time 39.2 us
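The benchmark script behind these numbers is not shown in this thread, so the following is only a minimal sketch of how a per-call time in microseconds and the resulting speedup could be measured with the standard library's `timeit`. The helper names `per_call_us` and `speedup` are hypothetical.

```python
import timeit


def per_call_us(fn, number=10_000, repeat=5):
    """Best-of-`repeat` time for one call of fn, in microseconds."""
    best_total = min(timeit.repeat(fn, number=number, repeat=repeat))
    return best_total / number * 1e6


def speedup(old_us, new_us):
    """How many times faster the new timing is than the old one."""
    return old_us / new_us


# Using the figures quoted above: 63.0 us before, 39.2 us after.
print(f"{speedup(63.0, 39.2):.2f}x")  # prints "1.61x"
```

Taking the minimum over several repeats is a common way to reduce noise from other processes on the machine.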
before_perf.txt
after_perf.txt