Improves bsdiff performance by preventing excessive iterations when processing similar data blocks #2693

Merged
zorgiepoo merged 4 commits into sparkle-project:2.x from thebrowsercompany:willf/port_bsdiff_optimizations on Mar 11, 2025

@wfairclough
Contributor

We detected a performance issue in our CI where the BinaryDelta process was taking over 4 hours to generate deltas between our macOS binaries. After timing individual bsdiff operations, I identified that a single diff was causing most of the slowdown. The issue occurred in the main macOS binary (Contents/MacOS/<AppName>) diffing process after product changes that removed approximately 22MB of binary code. By implementing patches from the ChromiumOS project's version of bsdiff, we reduced the entire BinaryDelta process from hours to just 5 minutes.

The changes introduce a mechanism to detect when bsdiff gets stuck processing large blocks of data that differ by less than 8 bytes. After 100 iterations in such a state, the algorithm breaks out of the loop to prevent performance degradation. Additionally, the search comparison operator was updated to include equality cases.
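For readers who have not looked at the scan loop, the stall guard described above can be sketched roughly as follows. This is a simplified illustration, not the code merged in this PR or the ChromiumOS patch: `scan_state`, `is_near_miss`, `scan_with_stall_guard`, and the constants are hypothetical names, the per-iteration values are passed in as an array instead of being computed by a suffix search, and the exact boundary of the 100-iteration limit is an assumption.

```c
#include <stddef.h>

/* Hypothetical snapshot of the scan loop's state at one scan position. */
typedef struct {
    long len;      /* length of the best match found by the suffix search */
    long oldscore; /* bytes that already match at the previous alignment */
} scan_state;

enum {
    FUZZ = 8,          /* the "differ by less than 8 bytes" window */
    STALL_LIMIT = 100  /* consecutive near-miss iterations tolerated (assumed boundary) */
};

/* A near miss: the new match is no better than the old alignment
 * by more than FUZZ bytes, so advancing one byte gains nothing. */
static int is_near_miss(scan_state s)
{
    return s.oldscore <= s.len && s.len <= s.oldscore + FUZZ;
}

/* Walks the recorded states and returns the index at which the scan
 * loop would stop, or n if it would run through every position. */
static size_t scan_with_stall_guard(const scan_state *states, size_t n)
{
    int stalled = 0;
    for (size_t i = 0; i < n; i++) {
        scan_state s = states[i];

        /* Usual exits: an exact continuation of the previous match,
         * or a new match that is clearly better than the old one. */
        if ((s.len == s.oldscore && s.len != 0) || s.len > s.oldscore + FUZZ)
            return i;

        /* Count consecutive near misses; any other state resets the count. */
        stalled = is_near_miss(s) ? stalled + 1 : 0;

        /* Stall guard: give up on this block instead of re-comparing it
         * one byte at a time. */
        if (stalled >= STALL_LIMIT)
            return i;
    }
    return n;
}
```

In this sketch, feeding the function a long run of states such as `len = 1000, oldscore = 995` makes it return after 100 positions rather than walking the whole run, which is the behaviour the guard is meant to provide. A single non-near-miss state in the middle of the run resets the counter.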

Original ChromiumOS Changes:

426e4aa AU: bsdiff: Expand pathological case where files differ by <8 bytes by Thieu Le · 13 years ago
58146f7 AU: Fix bsdiff hang by Thieu Le · 13 years ago
a055996 bsdiff: Speed up pathological case. by Thieu Le · 14 years ago
a055996
bsdiff: Speed up pathological case.

bsdiff does not properly handle the case where there is a large block of
data in the new file that only differs from the old file by less than 8
bytes.  This causes bsdiff to continue searching through the files one
byte at a time and at each byte, re-compare the same large block of data
which leads to excessively long run times.  This fix checks for this
edge condition and breaks out of the search loop early.  This retains
the size efficiency of the patch file for most binaries while preserving
the runtime efficiency for files that fall into this category.

https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/a055996c743add7a9558839276fd1e4994d16bd3%5E%21/#F0

58146f7
AU: Fix bsdiff hang

https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/58146f74abd6b1b69693943195f37f4ac6a6acef%5E%21/#F0

426e4aa
AU: bsdiff: Expand pathological case where files differ by <8 bytes

Modify bsdiff to better handle the case where files differ by <8 bytes
in some regions, not limiting this case to linear traversal.

BUG=chromium-os:28552
TEST=Manual bsdiff of problematic files, update engine unit tests

https://chromium.googlesource.com/chromiumos/third_party/bsdiff/+/426e4aa1cbeb3c8a73002047d7a796ca8e5e17d4%5E%21/#F0

Misc Checklist

  • My change requires a documentation update on Sparkle's website repository
  • My change requires changes to generate_appcast, generate_keys, or sign_update

Testing

I tested and verified my change by using one or multiple of these methods:

  • Sparkle Test App
  • Unit Tests
  • My own app
  • Other (please specify)

macOS version tested: 15.3

@zorgiepoo
Member

zorgiepoo commented Mar 9, 2025

Just for niceness, could you add comments in the source code citing where you fetched the changes from the chromiumos URLs that you mentioned here? I don't think they added additional licensing clauses to bsdiff.c so I don't believe we have to carry over additional licensing text.

I think I will also want to test these changes somewhat. Our bsdiff implementation rarely changes and changes are hard for me to assess but it looks like we've a strong reason to land this.

By the way, another difference between these versions of bsdiff is that we use sais for the suffix sorting algorithm, while chromium uses divsufsort. Both are a departure from the original qsufsort algorithm. divsufsort may be better than sais, but I never qualified it (both are better than qsufsort I believe). (This is a separate thing from this issue).

@zorgiepoo
Member

zorgiepoo commented Mar 9, 2025

One more thing I forgot, latestMinorVersionForMajorVersion() should be updated by bumping the minor version for major versions 2, 3, and 4 (major version 1 is dead and doesn't need updating). That way, we record/document this modification of bsdiff was used when generating new patches. As far as I understand, this change should not break major compatibility for older clients.

@wfairclough
Contributor Author

Thanks for the review @zorgiepoo!

I've added comments in the source code citing the ChromiumOS URLs where I sourced these changes, as requested. I've also updated latestMinorVersionForMajorVersion() to bump the minor version for major versions 2, 3, and 4 to document this bsdiff modification.

Regarding testing, we have been using these changes in our CI infrastructure now for 2 weeks without any negative side-effects. I will post an update here if that changes.

Interesting point about sais vs. divsufsort -- good to know.

Let me know if there's anything else needed before this is ready for approval!

@zorgiepoo
Member

Thanks. I will merge this for now. Tested this locally with a few different bundles and haven't run into issues.

@zorgiepoo zorgiepoo merged commit 997778d into sparkle-project:2.x Mar 11, 2025
2 checks passed
@zorgiepoo zorgiepoo added this to the 2.8 milestone Mar 11, 2025