cl/phase1/forkchoice: don't maintain GLOAS indexed weight store pre-GLOAS#21698
Merged
domiwei merged 1 commit intoJun 9, 2026
Merged
Conversation
…LOAS The indexedWeightStore (new in the GLOAS merge) was instantiated unconditionally and its IndexVote/RemoveVote were called per validator index on every attestation via setLatestMessage, regardless of fork. Its results are only consumed by GLOAS get_head; pre-GLOAS get_head uses the non-indexed weightStore, so this was pure overhead — a top CPU hotspot (RemoveVote) plus heavy allocation/GC, all under the fork-choice write lock. On high-validator-count networks this stalls block import and head updates past the attestation deadline, causing stale head votes. Gate the index maintenance on the GLOAS vote path only.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR reduces pre-GLOAS fork-choice CPU/GC overhead by avoiding maintenance of the GLOAS-only indexedWeightStore on the pre-GLOAS attestation vote path, while keeping indexed vote tracking enabled for the GLOAS path.
Changes:
- Add an explicit
maintainIndexedVotesflag tosetLatestMessageand disable indexed vote maintenance inupdateLatestMessagesPreGloas. - Keep indexed vote maintenance enabled in
updateLatestMessagesGloas. - Add tests to assert the pre-GLOAS path does not populate the indexed store, while the GLOAS path does.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| cl/phase1/forkchoice/on_attestation.go | Gates indexedWeightStore maintenance behind a flag so pre-GLOAS attestations don’t update the indexed store. |
| cl/phase1/forkchoice/weight_store_indexed_test.go | Adds regression tests covering pre-GLOAS vs GLOAS indexed-store maintenance behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+155
to
+156
| // The indexed weight store is only read by GLOAS get_head; maintaining it on | ||
| // the pre-GLOAS path is wasted per-vote work under the fork-choice lock. |
yperbasis
approved these changes
Jun 9, 2026
domiwei
approved these changes
Jun 9, 2026
domiwei
pushed a commit
that referenced
this pull request
Jun 9, 2026
…e pre-GLOAS (#21699) Cherry-pick of #21698 to release/3.5. Fixes the Caplin pre-GLOAS attestation head-vote regression: the GLOAS `indexedWeightStore` was maintained on every vote regardless of fork (top CPU hotspot + GC pressure under the fork-choice lock) but never read pre-GLOAS, delaying head updates past the attestation deadline. Validated on a live Hoodi node (node CPU ~7.5%->`<2%`, head-vote misses 40%->0%). Clean cherry-pick; `on_attestation.go` is identical between `main` and `release/3.5`.
yperbasis
added a commit
that referenced
this pull request
Jun 9, 2026
Resolves the on_attestation.go conflict by taking #21698's GLOAS-gated index maintenance (the stale IndexVote comment this branch fixed was removed there). Since #21698 gates indexed-vote maintenance to the GLOAS vote path, the differential test's pre-GLOAS fixtures no longer populate the index, so buildExAnteStore now mirrors latestMessages into the index directly.
yperbasis
added a commit
that referenced
this pull request
Jun 9, 2026
RemoveVote allocated a fresh slice per call to filter out a validator's entry. On the GLOAS vote path it runs per attestation, so on high-validator-count networks it is a top CPU consumer and GC source under the fork-choice lock (the cost #21698 removed pre-GLOAS but which returns under GLOAS now that getHeadGloas reads the index). Compact the entry out in place instead.
yperbasis
added a commit
that referenced
this pull request
Jun 10, 2026
getHeadGloas reads the indexedWeightStore, but index maintenance is GLOAS-gated (#21698): pre-GLOAS attestations fill latestMessages without touching the index. At the pre-GLOAS->GLOAS boundary GetHead flips from the full-scan store (which reads all of latestMessages) to the index, still empty — so attestation weight is understated until validators re-attest over ~1 epoch, risking a wrong head if a fork straddles the transition. The spec keeps store.latest_messages as a single dict never reset across forks, and Prysm/Lighthouse keep one continuous per-validator vote store; Erigon already maintains latestMessages continuously, so make the derived index match it at first use: headWeightStore imports the full latestMessages snapshot into the index once (clear-then-rebuild, so early GLOAS votes are not double-counted). The differential test now starts from a cold index and asserts headWeightStore seeds it to equal the full-scan store; a new test covers seed idempotency. Interim fix; superseded by the delta-propagation rework tracked in #21704.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Lido Hoodi validators lost attestation score after switching to
release/3.5. They missed head votes at ~30–60% per epoch (≈2× a random network sample) while target and source votes were always correct — the attestations were produced, published, and included on-chain on time, but voted for a stale head.Root cause
The GLOAS merge (#18956) added
indexedWeightStore(cl/phase1/forkchoice/weight_store_indexed.go). It is instantiated unconditionally, and itsIndexVote/RemoveVoteare called per validator index on every attestation viasetLatestMessage, regardless of fork.However its results are only consumed by GLOAS
get_head. Pre-GLOASget_head(andtiming.go) use the non-indexedweightStore, andGetIndexedWeightStore()has no callers. So on pre-GLOAS chains the index is maintained but never read.On a high-validator-count network (Hoodi, ~1.2M active validators) this maintenance was the single largest CPU consumer (
RemoveVote, ~15% of CPU, fresh slice allocation per call) plus a large GC load — all under the fork-choice write lock. The lock contention delayedOnBlock(block import) andget_headpast the attestation deadline, so the head served to the validator client was stale → wrong head votes.Fix
Gate the indexed-store maintenance to the GLOAS vote path only, via an explicit
maintainIndexedVotesflag onsetLatestMessage(mirroring the existingupdateLatestMessagespre-GLOAS/GLOAS dispatch).Validation (live Hoodi node, before vs after rebuild)
indexedWeightStore.RemoveVote(was Pull from go-ethereum up to 26aea73 (7 Feb 2019) #1 hotspot) and the GC storm both gone.Affects all pre-GLOAS networks on this branch (mainnet/Sepolia/Gnosis), with impact scaling by validator count.
Tests
TestPreGloasDoesNotMaintainIndexedWeightStore— fails before the fix (the pre-GLOAS path populates the index), passes after.TestGloasMaintainsIndexedWeightStore— locks in that the GLOAS path still indexes votes.go test ./cl/phase1/forkchoice/...,make lint,make erigonall clean.Follow-up
indexedWeightStoreis currently unused even on GLOAS (get_headuses the non-indexed store). The Caplin team should either wire it into GLOASget_heador remove it; tracking separately.