Improves bsdiff performance by preventing excessive iterations when processing similar data blocks#2693
Conversation
…y less than 8 bytes
|
Just for niceness, could you add comments in the source code citing where you fetched the changes from the chromiumos URLs that you mentioned here? I don't think they added additional licensing clauses to bsdiff.c so I don't believe we have to carry over additional licensing text. I think I will also want to test these changes somewhat. Our bsdiff implementation rarely changes and changes are hard for me to assess but it looks like we've a strong reason to land this. By the way, another difference between these versions of bsdiff is that we use sais for the suffix sorting algorithm, while chromium uses divsufsort. Both are a departure from the original qsufsort algorithm. divsufsort may be better than sais, but I never qualified it (both are better than qsufsort I believe). (This is a separate thing from this issue). |
|
One more thing I forgot, |
|
Thanks for the review @zorgiepoo! I've added comments in the source code citing the ChromiumOS URLs where I sourced these changes, as requested. I've also updated Regarding testing, we have been using these changes in our CI infrastructure now for 2 weeks without any negative side-effects. I will post an update here if that changes. Interesting point about sais vs. divsufsort -- good to know. Let me know if there's anything else needed before this is ready for approval! |
|
Thanks. I will merge this for now. Tested this locally with a few different bundles and haven't ran into issues. |
We detected a performance issue in our CI where the BinaryDelta process was taking over 4 hours to generate deltas between our macOS binaries. After timing individual
bsdiffoperations, I identified that a single diff was causing most of the slowdown. The issue occurred in the main macOS binary (Contents/MacOS/<AppName>) diffing process after product changes that removed approximately 22MB of binary code. By implementing patches from the ChromiumOS project's version ofbsdiff, we reduced the entire BinaryDelta process from hours to just 5 minutes.The changes introduce a mechanism to detect when
bsdiffgets stuck processing large blocks of data that differ by less than 8 bytes. After 100 iterations in such a state, the algorithm breaks out of the loop to prevent performance degradation. Additionally, the search comparison operator was updated to include equality cases.Original ChromiumOS Changes:
https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/a055996c743add7a9558839276fd1e4994d16bd3%5E%21/#F0
https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/58146f74abd6b1b69693943195f37f4ac6a6acef%5E%21/#F0
https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/426e4aa1cbeb3c8a73002047d7a796ca8e5e17d4%5E%21/#F0
Misc Checklist
Testing
I tested and verified my change by using one or multiple of these methods:
macOS version tested: 15.3