blockstm: replace O(n²) UpdateDeps with inverted-index DepsBuilder#2094
blockstm: replace O(n²) UpdateDeps with inverted-index DepsBuilder#2094kamuikatsurgi merged 9 commits intodevelopfrom
Conversation
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (80.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## develop #2094 +/- ##
===========================================
+ Coverage 51.17% 51.23% +0.05%
===========================================
Files 878 878
Lines 152499 152602 +103
===========================================
+ Hits 78041 78179 +138
+ Misses 69350 69321 -29
+ Partials 5108 5102 -6
... and 14 files with indirect coverage changes
🚀 New features to boost your workflow:
|
|
@claude, review this PR focusing on performance and security. Also, make sure no regression or bug is introduced. |
|
Claude finished @kamuikatsurgi's task in 1m 49s —— View job PR Review: BlockSTM O(n²) → O(n·R) Performance Optimization
SummaryThis PR replaces the O(n²) Performance Analysis ✅Algorithm ImprovementThe new
Complexity: O(n·R) vs O(n²·W) where R = avg read-set size, W = avg write-set size. Bitset Implementation (
|
| Test | Coverage |
|---|---|
TestBitset_* |
All bitset operations (Set, Test, Or, AndNot, ForEach, ToSlice) |
TestDepsBuilder_NoDeps |
Independent transactions |
TestDepsBuilder_LinearChain |
Chain dependencies |
TestDepsBuilder_TransitiveReduction |
Redundant edge removal |
TestDepsBuilder_DiamondDeps |
Fan-out/fan-in topology |
TestDepsBuilder_LatestWriterWins |
Overwrite semantics |
TestDepsBuilder_DynamicGrow |
Capacity expansion beyond 512 |
TestDepsBuilder_NonSequentialReturnsError |
Error handling |
TestDepsBuilder_CorrectDAG |
20 randomized trials verifying correctness |
TestDepsBuilder_EquivalenceWithUpdateDeps |
Comparison with old algorithm |
The TestDepsBuilder_EquivalenceWithUpdateDeps test explicitly documents the behavioral difference on "blind overwrites" (a tx writes a key without reading it first) - this is acceptable because DepsBuilder is more precise in this case.
Potential Minor Improvements
1. Consider Pre-allocating Inner Maps
In GetDeps() (dag.go:402), the inner map is created without capacity hint:
inner := make(map[int]bool)Since the number of dependencies is known from the bitset's popcount, consider:
inner := make(map[int]bool, bits.OnesCount64(db.directDeps[i].words[0])) // for small blocksImpact: Minor allocation reduction; not critical.
2. Document Channel Timeout
The 1-second timeout in worker.go:1337 is reasonable but could use a comment explaining why this duration was chosen.
Security Analysis ✅
- No integer overflow risks: Bitset indices are bounded by transaction count (max ~1000 per block)
- No race conditions:
DepsBuilderruns in a single goroutine; channel communication is properly synchronized withsync.Onceandsync.WaitGroup - No panic paths: Non-sequential index returns error rather than panicking; bitset operations don't access out-of-bounds indices due to
ensureCapacity - No memory exhaustion: Bitset width grows by doubling (logarithmic memory growth)
Verdict
Approve ✅
The implementation is correct, well-tested, and achieves the stated performance goals. The bug fixes address real issues that could cause consensus failures. The behavioral difference on blind overwrites is documented and actually represents more precise dependency tracking.
|
|


The block producer computes a transaction dependency DAG during sequential execution and embeds it in the block header. Importers use it to schedule parallel BlockSTM execution without speculative re-execution overhead.
The old algorithm (
UpdateDeps) scanned every preceding transaction's full write-set for each new transaction: O(n²) per block. At 300 tx/block, this was the dominant cost in the miner, confirmed by CPU profiling.Algorithm
DepsBuilder replaces it with two data structures:
Inverted index —
lastWriter map[Key]int: maps each state key to the index of the most recent transaction that wrote it. For a new tx at index i, a single pass over its read-set resolves all direct dependencies in O(R), where R is the read-set size. Only the latest writer matters; earlier writers are already covered by transitivity.Word-parallel bitsets —
directDeps []TxBitsetandreachable []TxBitset: each bitset is []uint64 of width (N+63)/64. Transitive reduction runs in O(N/64) per tx using two bitset operations:Total complexity: O(n·R) vs O(n²·W) previously, where R = avg read-set size and W = avg write-set size. Default bitset width is 512; ensureCapacity doubles on overflow and widens all existing bitsets atomically before appending new ones — maintaining the invariant that all bitsets have identical len(words) at all times.
The output format (map[int]map[int]bool) is unchanged, full backwards compatibility with old importers.
Correctness fixes
ValidatorBytes preservation — If DepsBuilder errored, the original code skipped rlp.DecodeBytes and re-encoded a zero-value BlockExtraData, silently dropping ValidatorBytes. These are consensus-critical on Cancun sprint transitions. Fix: always decode the existing extra data before overwriting TxDependency.
delayFlag off-by-one (pre-existing) — mvReadMapList[i-1] checked the wrong transaction's reads when deciding whether to include the DAG hint. The last transaction's coinbase/burnt-contract balance reads were never checked. If only the last tx read these balances, the hint was incorrectly embedded. Fixed to mvReadMapList[i].
Backwards compatibility
Verified on a devnet with 1 new validator node and 3 older RPC nodes. After 1,352+
blocks:
execution completion