Skip to content

Fix crash in clearDownloadedUpdate when no downloaded update is tracked#2863

Merged
zorgiepoo merged 2 commits intosparkle-project:2.xfrom
JulianPscheid:fix/clear-downloaded-update-assertion-crash
Mar 23, 2026
Merged

Fix crash in clearDownloadedUpdate when no downloaded update is tracked#2863
zorgiepoo merged 2 commits intosparkle-project:2.xfrom
JulianPscheid:fix/clear-downloaded-update-assertion-crash

Conversation

@JulianPscheid
Copy link
Copy Markdown
Contributor

clearDownloadedUpdate asserts that either _resumableUpdate or _downloadedUpdateForRemoval is non-nil, but several code paths can reach it when both are nil. The code after the assert already handles nil correctly (the if check on line 203), so the assert is the sole cause of the crash.

We're seeing this in production as a SIGABRT — 519 occurrences across 279 users on Sparkle 2.9.0, macOS 14 through 26. The call paths that trigger it:

  • installerDidFailToApplyDeltaUpdate (delta application failure)
  • extractUpdate: completion handler (extraction error, non-auth)
  • SPUUIBasedUpdateDriver (user skips a downloaded update)

The fix replaces the assert with an early return + SULog warning, so the condition is still logged for diagnostics without taking down the app.

Checklist

  • My change is being tested and reviewed against the Sparkle 2.x branch.
  • My change is being backported to master branch (Sparkle 1.x). Not applicable, this code doesn't exist in 1.x.
  • I have reviewed and commented my code, particularly in hard-to-understand areas.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Testing

  • Sparkle Test App
  • My own app

Replace assert(downloadedUpdateObject != nil) with an early return
and SULog warning. The assertion causes a SIGABRT in production when
clearDownloadedUpdate is called but both _resumableUpdate and
_downloadedUpdateForRemoval are nil.

Multiple code paths can trigger this: delta update application failure
(installerDidFailToApplyDeltaUpdate), extraction errors, and the UI
driver's skip-update path. The code after the assertion already
handled nil gracefully via an if-check, so the assert was the sole
cause of the crash.

Observed in production: 519 occurrences across 279 users since
Sparkle 2.9.0.
@zorgiepoo
Copy link
Copy Markdown
Member

zorgiepoo commented Mar 23, 2026

I don't think this code has changed recently (probably has worked this way since 2021 maybe with no other reports?). This doesn't explain why the assert happens, nor fixes the core issue if there is a problem. The call sites you mention should only call the method when there is a downloaded/resumable method. The nil check was probably added in conjunction with the isKindOfClass check, separate from the assert. Do you have a repro case?

@zorgiepoo
Copy link
Copy Markdown
Member

zorgiepoo commented Mar 23, 2026

installerDidFailToApplyDeltaUpdate (delta application failure)

Cannot reproduce this. Modify/add a file in the original app bundle to force delta failure.

extractUpdate: completion handler (extraction error, non-auth)

"Non-auth" is not a correct description. Cannot reproduce this. Require auth by doing sudo chown root your-app-bundle.app, when asked to authorize, cancel the request.

SPUUIBasedUpdateDriver (user skips a downloaded update)

Cannot reproduce this. Turn automatic downloading on, defaults delete your-app-bundle-id SULastCheckTime, restart app, check for update and skip the update.

My change is being backported to master branch (Sparkle 1.x). Not applicable, this code doesn't exist in 1.x.

On side note I don't understand how you got this old PR template.

…er messages

sendInstallationData kicks off the installer process, but completionHandler
was deferred until the probe callback fired. Installer messages like
SPUExtractionStarted could arrive on the main queue in the gap, triggering
installerDidStartExtracting -> clearDownloadedUpdate before the caller set
_downloadedUpdateForRemoval. A subsequent clearDownloadedUpdate (e.g. from
installerDidFailToApplyDeltaUpdate) then found both ivars nil and hit the
assertion.

The probe result is only logged, so completing before the probe is safe.
@JulianPscheid
Copy link
Copy Markdown
Contributor Author

Thanks for testing those paths. I dug deeper and found the actual root cause -- it's a timing issue, not a logic issue in the callers.

The race is in launchAutoUpdateSilently:completion:. On the success path (line 486+), sendInstallationData fires off the installer process, and then completionHandler(nil) is deferred until the probe callback completes (which can take up to 7 seconds on timeout). During that window, the installer process is already running and sending messages back via dispatch_async(dispatch_get_main_queue()).

Here's the sequence that crashes for delta updates:

  1. Installer launcher succeeds, sendInstallationData sends data to installer
  2. Probe starts (async, completion dispatched to main queue later)
  3. Installer receives data, starts extraction, sends SPUExtractionStarted
  4. Main queue processes SPUExtractionStarted -> installerDidStartExtracting -> clearDownloadedUpdate
    • _resumableUpdate is found (set in downloadDriverDidDownloadUpdate), cleaned up, cleared to nil
    • _downloadedUpdateForRemoval is nil (the completionHandler(nil) that sets it hasn't fired yet)
  5. Delta extraction fails, installer sends SPUArchiveExtractionFailed
  6. Main queue processes it -> installerDidFailToApplyDeltaUpdate -> clearDownloadedUpdate
    • _resumableUpdate = nil (cleared in step 4)
    • _downloadedUpdateForRemoval = nil (probe still hasn't completed)
    • Assert fires

The probe result is only used for logging ("Error: failed to probe status service"), so calling completionHandler(nil) before the probe is safe and ensures _downloadedUpdateForRemoval is set before any installer messages arrive.

I've pushed a second commit (688d278) with this fix alongside the defensive early-return in clearDownloadedUpdate. The first commit handles the symptom, the second addresses the root cause. Happy to split/squash/rewrite however you prefer.

This explains why you couldn't reproduce: the crash only happens when installer messages arrive on the main queue before the probe completes. The probe typically finishes fast on a local dev machine, so the window is narrow. In production across hundreds of machines with varying XPC/status service latencies, it hits often enough (519 times across 279 users).

@zorgiepoo
Copy link
Copy Markdown
Member

Thanks that makes sense. That probe message was added recently to try to speed up launching that agent process. It is still quite worrying the probe was slower than after a delta update was downloaded and failed to extract.. but that's a separate issue.

@zorgiepoo zorgiepoo added this to the 2.9.1 milestone Mar 23, 2026
@zorgiepoo zorgiepoo merged commit c625894 into sparkle-project:2.x Mar 23, 2026
2 checks passed
@zorgiepoo
Copy link
Copy Markdown
Member

Secondary separate concern is the rate you may be seeing delta update failures, if you're using generate_appcast and the optional sparkle:deltaFromSparkleExecutableSize, sparkle:deltaFromSparkleLocales attributes it adds.

@JulianPscheid
Copy link
Copy Markdown
Contributor Author

Thanks for the quick merge.

On the delta failure rate -- we actually only started shipping deltas with our most recent release (2.15.3, about 10 days ago). The crash has been showing up since December 2025, so the bulk of the volume (370+ out of 519 events) was on versions that never had delta entries in the appcast. The probe race explains the delta crashes cleanly, but there's likely another path hitting clearDownloadedUpdate with both ivars nil that we haven't pinned down yet. The defensive early-return in SPUCoreBasedUpdateDriver covers it either way, but figured it's worth flagging in case it rings a bell.

We've set --maximum-deltas 0 on our end for now. Will look into the deltaFromSparkleExecutableSize / deltaFromSparkleLocales attributes before turning them back on -- good call on that.

@zorgiepoo
Copy link
Copy Markdown
Member

zorgiepoo commented Mar 23, 2026

The crash has been showing up since December 2025, so the bulk of the volume (370+ out of 519 events) was on versions that never had delta entries in the appcast

This would be before 2.9.0 was released which added the probe check, but we've not had any reports about this before then. That is not to rule out there could be an issue somewhere though.

It is still quite worrying the probe was slower than after a delta update was downloaded and failed to extract

Actually in this case the update was already downloaded, so this may be within expectations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants