build: consume Electron-generated PGO profiles in release builds #51815
@MarshallOfSound has manually backported this PR to "42-x-y", please check out #51828
@MarshallOfSound has manually backported this PR to "43-x-y", please check out #51829
Official release builds currently apply Chrome's published PGO profile. PGO profiles match functions by symbol name and control-flow hash, so every function that differs from Chrome - all of Node.js, patched Chromium files, V8 built with Node's flags, and the Electron shell - silently gets no optimization guidance and is laid out as cold. The same applies to V8's builtins profile: Chrome's published profiles reject Electron's promise/async builtins because the Node.js integration changes their codegen.
This change points release builds at Electron-generated profiles served from the build-tools storage account instead:
- build/pgo_profiles/<target>.pgo.txt state files name the profile each target consumes; updating profiles only changes these files.
- gclient hooks download the named profiles via script/pgo/download-profiles.py.
- A small Chromium patch teaches the standard chrome_pgo_phase = 2 resolution to read Electron's state files (including per-arch Linux profiles, which upstream does not have). Keeping Chromium's own PGO configuration authoritative means every compiler and linker flag they maintain - -fprofile-use, warning suppressions, extended-TSP block layout, and anything they add in the future - applies to Electron's profiles unchanged. An explicitly set pgo_data_path still takes precedence as an override.
- Chrome's published profiles are no longer downloaded (checkout_pgo_profiles is now False).
chrome_pgo_phase already defaults to 2 for official builds, so a single release.gn import works for every platform/arch with no per-platform configuration.
Measured against Chrome's profiles (Linux x64, otherwise identical builds): +9.5% on Speedometer 3.1, +16% geomean across 22 app-operation benchmarks, with contextBridge calls +44-51%.
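To make the matching rule in the commit message concrete, here is a toy model of it, not the compiler's actual implementation. It assumes a profile is just a table keyed by (symbol name, control-flow hash); the function names and hash values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str      # symbol name
    cfg_hash: int  # stand-in for the hash of the function's control flow


def classify(functions, profile):
    """Label each function "guided" if the profile has an exact
    (name, hash) entry for it, otherwise "cold"."""
    result = {}
    for fn in functions:
        key = (fn.name, fn.cfg_hash)
        result[fn.name] = "guided" if key in profile else "cold"
    return result


# Hypothetical profile recorded against an unpatched upstream build.
upstream_profile = {
    ("RAND_bytes", 0xA1): 9000,
    ("v8_Compile", 0xB2): 4000,
}

# Hypothetical downstream build.
build = [
    Function("RAND_bytes", 0xC3),  # patched: same name, different hash
    Function("v8_Compile", 0xB2),  # unchanged: still matches
    Function("node_Start", 0xD4),  # does not exist upstream at all
]

labels = classify(build, upstream_profile)
```

In this model a one-line patch is enough to drop a hot function out of the profile, which is the failure mode the commit message describes for patched Chromium files.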
deepak1556 left a comment
LGTM. Do you want to flip the default in this PR, given that PGO generation is manual? My concern is mainly that a profile generation could be missed before a release. But I am fine if it's tracked as a wg-releases duty for now.
I'm not too concerned about PGOs on main, and PGOs from release branches should be stabilized. I'll add it as a release wg action to ensure we have a recent PGO before cutting the first stable. And I'm gonna wire up the automation this week to keep building them.
Release Notes Persisted
Description of Change
Wires Electron release builds to consume Electron-generated PGO profiles instead of Chrome's published ones. This is the consumption side of the PGO pipeline (generation side: #51812 — the two PRs are independent and can merge in either order).
The problem
Release builds apply Chrome's published PGO profile (chrome_pgo_phase = 2). PGO profiles match functions by symbol name + control-flow hash, so every function that differs from Chrome (all of Node.js, patched Chromium files, V8 built with Node's flags, and the Electron shell) gets no optimization guidance and is laid out as cold. The same applies to V8's builtins profile: Chrome's profile rejects Electron's promise/async builtins (RunMicrotasks, AsyncFunctionAwait, FulfillPromise, …) because Node's integration changes their codegen.
This isn't theoretical. Shipping Electron 44 has a 2.1× regression in crypto.randomBytes vs Electron 42 (829K → 390K ops/s) caused by exactly this: a BoringSSL patch changed function hashes, and Chrome's profile silently stopped covering them. The Electron profile recovers it completely (839K ops/s).
How it works
- A small Chromium patch changes what the standard chrome_pgo_phase = 2 flow resolves; all compiler/linker flags come from upstream's config and track it automatically (this addresses the cflags-drift concern raised in review: the parallel-config approach had in fact already drifted, missing -enable-ext-tsp-block-placement)
- A single release.gn import works on every platform/arch: chrome_pgo_phase already defaults to 2 for official builds, so it only wires the V8 builtins profile
- Chrome's published profiles are no longer downloaded (checkout_pgo_profiles = False); an explicitly set pgo_data_path still takes precedence as an escape hatch back to them
- 32-bit targets (win-x86, linux-arm) get the C++ profile but keep no builtins PGO (Electron only generates a 64-bit builtins profile)
Benchmark results
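The precedence described above (an explicitly set pgo_data_path wins, otherwise the target's state file names the profile) can be sketched in a few lines. This is only an illustration in Python: the real resolution lives in Chromium's GN configuration, and the function name, the target string "linux-x64", and the profile file names below are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_profile(state_dir: Path, target: str,
                    pgo_data_path: Optional[str] = None) -> str:
    """Return the profile a build of `target` would consume."""
    if pgo_data_path:
        # An explicit override always takes precedence.
        return pgo_data_path
    # Otherwise the state file <target>.pgo.txt names the profile.
    return (state_dir / f"{target}.pgo.txt").read_text().strip()


# Demo with a throwaway directory standing in for build/pgo_profiles/.
with tempfile.TemporaryDirectory() as tmp:
    state_dir = Path(tmp)
    (state_dir / "linux-x64.pgo.txt").write_text(
        "electron-linux-x64-example.profdata\n")  # made-up profile name
    default_choice = resolve_profile(state_dir, "linux-x64")
    override_choice = resolve_profile(
        state_dir, "linux-x64", "/tmp/custom.profdata")
```

Because the state file is the only input in the default path, updating a profile is a one-file change, which matches the stated goal that updating profiles only touches the state files.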
All benchmark records
1. Headline: shipping nightly vs this work (conclusive, statistically controlled)
Methodology: official v44.0.0-nightly.20260529 vs an identical-source build with Electron PGO profiles + ThinLTO¹, Linux x64, 38 benchmarks × 5 interleaved rounds per build (A,B,A,B,… to cancel thermal/cache drift), idle machine, Welch's t-test at p < 0.05.
response.json() +36.6%, fetch 1KB +28.9%, XHR +27.7%
¹ The comparison build also includes ThinLTO --lto-O2 (#51669/#51809), since that is the configuration releases will ship with once both land. For PGO's isolated contribution see section 2.
Full 38-test table
All values ops/s unless noted. The single non-win (1MB typed array marshaling) is dominated by raw memcpy, which PGO cannot accelerate.
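For readers who want to reproduce the comparison in section 1, the following is a minimal sketch of the two ingredients named in the methodology: an interleaved A,B,A,B,… run order and Welch's t-test. It is not the harness used for this PR; the sample numbers are made up, and only the standard library is used, so it reports the t statistic and degrees of freedom rather than a p-value.

```python
import math
import statistics


def interleaved_order(rounds):
    """Run order A,B,A,B,... so drift affects both builds equally."""
    return ["A", "B"] * rounds


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    se_a = statistics.variance(a) / len(a)  # squared standard error of A
    se_b = statistics.variance(b) / len(b)  # squared standard error of B
    t = (statistics.fmean(b) - statistics.fmean(a)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


# Made-up ops/s samples for one benchmark, 5 rounds per build.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
candidate = [110.0, 112.0, 108.0, 111.0, 109.0]

order = interleaved_order(5)
t, df = welch_t(baseline, candidate)
```

For these samples t is 10 with 8 degrees of freedom, well beyond the two-sided 5% critical value of about 2.306. If SciPy is available, scipy.stats.ttest_ind(a, b, equal_var=False) performs the same test and also returns the p-value.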
2. PGO's isolated contribution (Chrome profile vs Electron profile, same build otherwise)
crypto.randomBytes
3. The training-coverage story (why profiles must cover app workloads)
A profile trained only on browser benchmarks pessimizes code those benchmarks never run (PGO marks uncovered functions cold). Measured cost on a benchmark-only profile, and recovery after adding Electron-specific training (main-process Node.js, contextBridge/IPC marshaling, networking over TLS — see #51812):
4. V8 builtins profile coverage
RunMicrotasks, AsyncFunctionAwait, FulfillPromise, PromiseConstructor, …
5. Cumulative Speedometer 3.1 progression (Linux x64, containerized; same source, same V8)
--lto-O2 (#51669 / #51809)
With the full optimization stack, Electron is as fast as, and in some cases faster than, Chrome on the same workloads.
6. Real-hardware corroboration (macOS, Apple Silicon)
ThinLTO-only numbers from #51669/#51809 testing — included to show container results translate to real hardware (PGO stacks on top of these):
Relationship to other PRs
Checklist
npm test passes
Release Notes
Notes: Improved runtime performance.