blockstm: replace O(n²) UpdateDeps with inverted-index DepsBuilder by kamuikatsurgi · Pull Request #2094 · 0xPolygon/bor

kamuikatsurgi · 2026-03-02T06:48:52Z

The block producer computes a transaction dependency DAG during sequential execution and embeds it in the block header. Importers use it to schedule parallel BlockSTM execution without speculative re-execution overhead.

The old algorithm (UpdateDeps) scanned every preceding transaction's full write-set for each new transaction: O(n²) per block. At 300 tx/block, this was the dominant cost in the miner, confirmed by CPU profiling.

Algorithm

DepsBuilder replaces it with two data structures:

Inverted index — lastWriter map[Key]int: maps each state key to the index of the most recent transaction that wrote it. For a new tx at index i, a single pass over its read-set resolves all direct dependencies in O(R), where R is the read-set size. Only the latest writer matters; earlier writers are already covered by transitivity.

Word-parallel bitsets — directDeps []TxBitset and reachable []TxBitset: each bitset is []uint64 of width (N+63)/64. Transitive reduction runs in O(N/64) per tx using two bitset operations:

reachable[i] = ∪ reachable[j] for j in directDeps[i]     // transitive closure
directDeps[i] &^= reachable[i]    // remove redundant edges

Total complexity: O(n·R) vs O(n²·W) previously, where R = avg read-set size and W = avg write-set size. Default bitset width is 512; ensureCapacity doubles on overflow and widens all existing bitsets atomically before appending new ones — maintaining the invariant that all bitsets have identical len(words) at all times.

The output format (map[int]map[int]bool) is unchanged, full backwards compatibility with old importers.

Correctness fixes

ValidatorBytes preservation — If DepsBuilder errored, the original code skipped rlp.DecodeBytes and re-encoded a zero-value BlockExtraData, silently dropping ValidatorBytes. These are consensus-critical on Cancun sprint transitions. Fix: always decode the existing extra data before overwriting TxDependency.

delayFlag off-by-one (pre-existing) — mvReadMapList[i-1] checked the wrong transaction's reads when deciding whether to include the DAG hint. The last transaction's coinbase/burnt-contract balance reads were never checked. If only the last tx read these balances, the hint was incorrectly embedded. Fixed to mvReadMapList[i].

Backwards compatibility

Verified on a devnet with 1 new validator node and 3 older RPC nodes. After 1,352+
blocks:

Zero consensus errors on all nodes
All three checkpoints logged on old nodes: decode, dependency verification, parallel
execution completion

…lder

codecov · 2026-03-02T07:55:56Z

Codecov Report

❌ Patch coverage is 80.00000% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.23%. Comparing base (82b9c74) to head (ab3e24d).
⚠️ Report is 2 commits behind head on develop.

Files with missing lines	Patch %	Lines
miner/worker.go	28.20%	28 Missing ⚠️

❌ Your patch status has failed because the patch coverage (80.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2094      +/-   ##
===========================================
+ Coverage    51.17%   51.23%   +0.05%     
===========================================
  Files          878      878              
  Lines       152499   152602     +103     
===========================================
+ Hits         78041    78179     +138     
+ Misses       69350    69321      -29     
+ Partials      5108     5102       -6

Files with missing lines	Coverage Δ
core/blockstm/dag.go	`71.03% <100.00%> (+25.93%)`	⬆️
miner/worker.go	`68.46% <28.20%> (-0.82%)`	⬇️

... and 14 files with indirect coverage changes

Files with missing lines	Coverage Δ
core/blockstm/dag.go	`71.03% <100.00%> (+25.93%)`	⬆️
miner/worker.go	`68.46% <28.20%> (-0.82%)`	⬇️

... and 14 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

kamuikatsurgi · 2026-03-02T08:15:06Z

@claude, review this PR focusing on performance and security. Also, make sure no regression or bug is introduced.

claude · 2026-03-02T08:15:26Z

Claude finished @kamuikatsurgi's task in 1m 49s —— View job

PR Review: BlockSTM O(n²) → O(n·R) Performance Optimization

Read and analyze the changed files
Review core/blockstm/dag.go for performance and correctness
Review core/blockstm/dag_test.go for test coverage
Review miner/worker.go for security and regression issues
Compile findings and provide detailed feedback

Summary

This PR replaces the O(n²) UpdateDeps algorithm with an O(n·R) DepsBuilder using an inverted index and word-parallel bitsets. The implementation is well-designed and correct. I found no security vulnerabilities or regressions. The bug fixes for ValidatorBytes preservation and the delayFlag off-by-one are both valid and important.

Performance Analysis ✅

Algorithm Improvement

The new DepsBuilder achieves significant performance gains through:

Inverted Index (lastWriter map[Key]int) - O(1) lookup for the latest writer of any key, eliminating the O(n·W) scan per transaction in the old algorithm.
Word-Parallel Bitsets - Transitive reduction runs in O(N/64) using SIMD-friendly operations (Or, AndNot), compared to O(n) map operations previously.
Memory Efficiency - TxBitset uses []uint64 (8 bytes per 64 transactions) vs nested map[int]bool (40+ bytes per dependency entry).

Complexity: O(n·R) vs O(n²·W) where R = avg read-set size, W = avg write-set size.

Bitset Implementation (`dag.go:243-292`)

The bitset operations are correct:

Set/Test use proper bit indexing with i/64 for word index and i%64 for bit position
ForEach uses bits.TrailingZeros64 for efficient iteration (hardware-accelerated on modern CPUs)
word &= word - 1 clears the lowest set bit correctly

Correctness Analysis ✅

DepsBuilder Algorithm (`dag.go:345-390`)

The core algorithm is correct:

// For each read, find the latest writer (inverted index lookup)
for _, rd := range readList {
    if writer, ok := db.lastWriter[rd.Path]; ok && writer < index {
        db.directDeps[index].Set(writer)
    }
}

// Compute transitive closure from direct dependencies
db.directDeps[index].ForEach(func(j int) {
    db.reachable[index].Or(&db.reachable[j])
})

// Transitive reduction: remove redundant edges
db.directDeps[index].AndNot(&db.reachable[index])

// Update reachability for future transactions
db.reachable[index].Or(&db.directDeps[index])

The key insight is that reachable[j] is complete for all j < index since transactions are processed sequentially. This allows the reduction in a single pass.

Dynamic Capacity Growth (`dag.go:325-343`)

The ensureCapacity function correctly handles blocks with >512 transactions:

Doubles width until it can hold the needed index
Extends all existing bitsets atomically to maintain the invariant that all bitsets have identical len(words)

Error Handling

Non-sequential indices are properly rejected (dag.go:353-356), and the error is sticky - subsequent calls return the same error, and GetDeps() returns nil.

Bug Fixes ✅

1. ValidatorBytes Preservation (`worker.go:1382-1385`)

Before: If len(env.mvReadMapList) == 0, the code skipped rlp.DecodeBytes and re-encoded a zero-value BlockExtraData, silently dropping ValidatorBytes.

After: rlp.DecodeBytes is always called before overwriting TxDependency:

// Always decode header extra data before overwriting TxDependency.
if err := rlp.DecodeBytes(env.header.Extra[...], &blockExtraData); err != nil {
    return err
}

This is a consensus-critical fix for Cancun sprint transitions where ValidatorBytes must be preserved.

2. delayFlag Off-by-One (`worker.go:1398`)

Before: reads := env.mvReadMapList[i-1] - checked the previous transaction's reads, not the current one.

After: reads := env.mvReadMapList[i] - correctly checks transaction i's reads.

This fix ensures the last transaction's coinbase/burnt-contract balance reads are properly detected.

Test Coverage ✅

The test suite is comprehensive:

Test	Coverage
`TestBitset_*`	All bitset operations (Set, Test, Or, AndNot, ForEach, ToSlice)
`TestDepsBuilder_NoDeps`	Independent transactions
`TestDepsBuilder_LinearChain`	Chain dependencies
`TestDepsBuilder_TransitiveReduction`	Redundant edge removal
`TestDepsBuilder_DiamondDeps`	Fan-out/fan-in topology
`TestDepsBuilder_LatestWriterWins`	Overwrite semantics
`TestDepsBuilder_DynamicGrow`	Capacity expansion beyond 512
`TestDepsBuilder_NonSequentialReturnsError`	Error handling
`TestDepsBuilder_CorrectDAG`	20 randomized trials verifying correctness
`TestDepsBuilder_EquivalenceWithUpdateDeps`	Comparison with old algorithm

The TestDepsBuilder_EquivalenceWithUpdateDeps test explicitly documents the behavioral difference on "blind overwrites" (a tx writes a key without reading it first) - this is acceptable because DepsBuilder is more precise in this case.

Potential Minor Improvements

1. Consider Pre-allocating Inner Maps

In GetDeps() (dag.go:402), the inner map is created without capacity hint:

inner := make(map[int]bool)

Since the number of dependencies is known from the bitset's popcount, consider:

inner := make(map[int]bool, bits.OnesCount64(db.directDeps[i].words[0])) // for small blocks

Impact: Minor allocation reduction; not critical.

2. Document Channel Timeout

The 1-second timeout in worker.go:1337 is reasonable but could use a comment explaining why this duration was chosen.

Security Analysis ✅

No integer overflow risks: Bitset indices are bounded by transaction count (max ~1000 per block)
No race conditions: DepsBuilder runs in a single goroutine; channel communication is properly synchronized with sync.Once and sync.WaitGroup
No panic paths: Non-sequential index returns error rather than panicking; bitset operations don't access out-of-bounds indices due to ensureCapacity
No memory exhaustion: Bitset width grows by doubling (logarithmic memory growth)

Verdict

Approve ✅

The implementation is correct, well-tested, and achieves the stated performance goals. The bug fixes address real issues that could cause consensus failures. The behavioral difference on blind overwrites is documented and actually represents more precise dependency tracking.
|

miner/worker.go

sonarqubecloud · 2026-03-03T05:37:19Z

Quality Gate failed

Failed conditions
11 Security Hotspots

See analysis details on SonarQube Cloud

kamuikatsurgi added 5 commits February 27, 2026 16:22

blockstm, miner: replace O(n²) UpdateDeps with inverted-index DepsBui…

6076873

…lder

fix: return err in case of non sequential index

17b87b1

miner: fix ValidatorBytes loss when DepsBuilder fails, add error logging

d235282

miner: fix DAG hint safety check and nits

0ab9986

dag: DepsBuilder and UpdateDeps test

7201fbc

pratikspatil024 approved these changes Mar 2, 2026

View reviewed changes

kamuikatsurgi added 2 commits March 2, 2026 16:40

chore: record dagBuildTimer metric

160f6b8

fix: dag metric calculation

dfab6f7

marcello33 approved these changes Mar 2, 2026

View reviewed changes

cffls reviewed Mar 2, 2026

View reviewed changes

miner/worker.go Show resolved Hide resolved

miner/worker.go Show resolved Hide resolved

kamuikatsurgi added 2 commits March 3, 2026 11:02

chore: add TestDelayFlagOffByOne test

5a5a194

Merge branch 'develop' into kamui/blockstm-perf

ab3e24d

cffls approved these changes Mar 3, 2026

View reviewed changes

kamuikatsurgi merged commit aaaa392 into develop Mar 3, 2026
20 of 22 checks passed

kamuikatsurgi deleted the kamui/blockstm-perf branch March 3, 2026 19:08

mt-polygon-technology mentioned this pull request Mar 5, 2026

Add Claude Code security review rules for Bor codebase #2110

Merged

18 tasks

kamuikatsurgi mentioned this pull request Mar 27, 2026

core/blockstm: remove old UpdateDeps DAG algorithm #2168

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

blockstm: replace O(n²) UpdateDeps with inverted-index DepsBuilder#2094

blockstm: replace O(n²) UpdateDeps with inverted-index DepsBuilder#2094
kamuikatsurgi merged 9 commits intodevelopfrom
kamui/blockstm-perf

kamuikatsurgi commented Mar 2, 2026

Uh oh!

codecov bot commented Mar 2, 2026 •

edited

Loading

Uh oh!

kamuikatsurgi commented Mar 2, 2026

Uh oh!

claude bot commented Mar 2, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

kamuikatsurgi commented Mar 2, 2026

Algorithm

Correctness fixes

Backwards compatibility

Uh oh!

codecov bot commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

kamuikatsurgi commented Mar 2, 2026

Uh oh!

claude bot commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: BlockSTM O(n²) → O(n·R) Performance Optimization

Summary

Performance Analysis ✅

Algorithm Improvement

Bitset Implementation (dag.go:243-292)

Correctness Analysis ✅

DepsBuilder Algorithm (dag.go:345-390)

Dynamic Capacity Growth (dag.go:325-343)

Error Handling

Bug Fixes ✅

1. ValidatorBytes Preservation (worker.go:1382-1385)

2. delayFlag Off-by-One (worker.go:1398)

Test Coverage ✅

Potential Minor Improvements

1. Consider Pre-allocating Inner Maps

2. Document Channel Timeout

Security Analysis ✅

Verdict

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 3, 2026

Quality Gate failed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov bot commented Mar 2, 2026 •

edited

Loading

claude bot commented Mar 2, 2026 •

edited

Loading

Bitset Implementation (`dag.go:243-292`)

DepsBuilder Algorithm (`dag.go:345-390`)

Dynamic Capacity Growth (`dag.go:325-343`)

1. ValidatorBytes Preservation (`worker.go:1382-1385`)

2. delayFlag Off-by-One (`worker.go:1398`)