Skip to content

fix(blockassembly): make DiskTxMap.bytesWritten access atomic#1029

Merged
liam merged 1 commit into
bsv-blockchain:mainfrom
liam:liam/disktxmap-byteswritten-race
Jun 3, 2026
Merged

fix(blockassembly): make DiskTxMap.bytesWritten access atomic#1029
liam merged 1 commit into
bsv-blockchain:mainfrom
liam:liam/disktxmap-byteswritten-race

Conversation

@liam

@liam liam commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Data race

diskShard.bytesWritten was a plain int64:

  • written by each disk's writerLoop goroutine (d.bytesWritten += …), and
  • read by Stats() (diskBytes += m.disks[i].bytesWritten).

Stats() is called in production via reportDiskMapStats during moveForwardBlockbefore Clear() quiesces the writers — so the read races the active increments from a different goroutine. That's a data race (go test -race flags it) and a torn read on 32-bit. The field's own comment claimed "only touched by the single writer goroutine — no atomic needed", which Stats() violates.

Fix

Access bytesWritten via sync/atomic (AddInt64 in the writer, LoadInt64 in Stats()). The field stays a plain int64 rather than atomic.Int64 on purpose: diskShard is assigned as a struct literal at construction (m.disks[i] = diskShard{…}), so keeping it copy-safe avoids tainting the struct with a noCopy and a vet copylocks failure.

Test

TestDiskTxMap_StatsRaceWithWriters drives Set (writer-loop increments) concurrently with Stats() reads. Reverting to plain access makes it fail under -race with a DATA RACE report on bytesWritten; with the fix it's clean.

Verification

  • go test -race: passes with fix; fails (DATA RACE) when reverted to plain access.
  • go build, go vet, gofmt, gci: clean.

Third fix from the concurrency/lifecycle audit (after #1027 kafka final-drain and #1028 blockassembly ErrChan deadlock).

bytesWritten was a plain int64 incremented by each disk's writerLoop
goroutine and read by Stats() from the block-processing goroutine
(reportDiskMapStats runs during moveForwardBlock, before Clear() quiesces
the writers). That is an unsynchronized concurrent read/write — a data race
the -race detector flags, and a torn read on 32-bit. The field comment
even claimed it was single-goroutine-only, which Stats() violates.

Access it via sync/atomic (AddInt64 in the writer, LoadInt64 in Stats);
the field stays a plain int64 so diskShard remains copy-safe (it is
assigned as a struct literal at construction). Add a -race regression test
that reads Stats() concurrently with active writers; reverting to plain
access trips the detector.
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete


Current Review:

No issues found. The data race fix is correct and well-tested.

Analysis:

  • The race condition on bytesWritten is properly fixed using atomic.AddInt64 (write) and atomic.LoadInt64 (read)
  • The field remains a plain int64 rather than atomic.Int64 to keep the struct copy-safe for literal initialization
  • Updated comments accurately document the concurrent access pattern
  • The new test TestDiskTxMap_StatsRaceWithWriters effectively reproduces the race with concurrent Set and Stats calls
  • Test uses existing helper functions (newTestDiskTxMap, makeInpoints) and appropriate iteration count (5000) for race detection

@sonarqubecloud

sonarqubecloud Bot commented Jun 3, 2026

Copy link
Copy Markdown

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-1029 (1985f92)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.755µ 1.749µ ~ 0.200
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 61.80n 61.96n ~ 0.200
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 61.60n 61.74n ~ 0.700
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 61.74n 61.76n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 31.36n 30.55n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 54.68n 53.53n ~ 0.100
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 122.9n 117.9n ~ 0.400
MiningCandidate_Stringify_Short-4 262.6n 261.0n ~ 0.200
MiningCandidate_Stringify_Long-4 1.909µ 1.918µ ~ 0.100
MiningSolution_Stringify-4 974.3n 971.5n ~ 0.700
BlockInfo_MarshalJSON-4 1.829µ 1.789µ ~ 0.100
NewFromBytes-4 130.5n 132.8n ~ 0.100
AddTxBatchColumnar_Validation-4 2.483µ 2.465µ ~ 0.700
OffsetValidationLoop-4 635.7n 640.5n ~ 0.400
Mine_EasyDifficulty-4 60.66µ 60.97µ ~ 0.400
Mine_WithAddress-4 6.924µ 6.939µ ~ 1.000
BlockAssembler_AddTx-4 0.02783n 0.02537n ~ 0.700
AddNode-4 11.28 10.84 ~ 0.100
AddNodeWithMap-4 12.33 12.06 ~ 0.200
DiskTxMap_SetIfNotExists-4 3.516µ 3.432µ ~ 0.200
DiskTxMap_SetIfNotExists_Parallel-4 3.313µ 3.325µ ~ 0.400
DiskTxMap_ExistenceOnly-4 311.4n 311.4n ~ 1.000
Queue-4 187.8n 189.1n ~ 0.300
AtomicPointer-4 4.188n 4.851n ~ 0.200
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 873.4µ 863.9µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 848.6µ 790.6µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 106.1µ 111.3µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 62.87µ 62.45µ ~ 0.700
ReorgOptimizations/HashSlicePool/Old/10K-4 56.38µ 68.70µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/10K-4 11.36µ 11.59µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/10K-4 4.614µ 5.630µ ~ 0.100
ReorgOptimizations/NodeFlags/New/10K-4 1.596µ 1.588µ ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.934m 9.458m ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.886m 9.998m ~ 0.700
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.158m 1.079m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 685.5µ 682.9µ ~ 1.000
ReorgOptimizations/HashSlicePool/Old/100K-4 569.0µ 607.7µ ~ 0.100
ReorgOptimizations/HashSlicePool/New/100K-4 337.9µ 304.1µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/100K-4 51.38µ 51.45µ ~ 0.700
ReorgOptimizations/NodeFlags/New/100K-4 17.91µ 17.78µ ~ 0.100
TxMapSetIfNotExists-4 52.55n 52.37n ~ 0.400
TxMapSetIfNotExistsDuplicate-4 39.64n 39.73n ~ 0.700
ChannelSendReceive-4 612.1n 610.0n ~ 0.600
DirectSubtreeAdd/4_per_subtree-4 46.64n 47.07n ~ 0.700
DirectSubtreeAdd/64_per_subtree-4 23.40n 23.25n ~ 0.100
DirectSubtreeAdd/256_per_subtree-4 22.11n 22.39n ~ 0.500
DirectSubtreeAdd/1024_per_subtree-4 21.03n 21.04n ~ 1.000
DirectSubtreeAdd/2048_per_subtree-4 20.73n 20.66n ~ 0.800
SubtreeProcessorAdd/4_per_subtree-4 232.6n 229.7n ~ 0.200
SubtreeProcessorAdd/64_per_subtree-4 230.0n 227.8n ~ 0.400
SubtreeProcessorAdd/256_per_subtree-4 228.2n 225.0n ~ 0.200
SubtreeProcessorAdd/1024_per_subtree-4 226.7n 224.1n ~ 0.400
SubtreeProcessorAdd/2048_per_subtree-4 219.8n 219.0n ~ 0.400
SubtreeProcessorRotate/4_per_subtree-4 222.5n 223.6n ~ 1.000
SubtreeProcessorRotate/64_per_subtree-4 223.9n 221.7n ~ 0.100
SubtreeProcessorRotate/256_per_subtree-4 222.6n 225.7n ~ 0.100
SubtreeProcessorRotate/1024_per_subtree-4 223.2n 225.7n ~ 0.100
SubtreeNodeAddOnly/4_per_subtree-4 44.36n 44.20n ~ 0.200
SubtreeNodeAddOnly/64_per_subtree-4 28.62n 28.53n ~ 0.700
SubtreeNodeAddOnly/256_per_subtree-4 27.78n 27.71n ~ 0.400
SubtreeNodeAddOnly/1024_per_subtree-4 27.19n 27.12n ~ 0.700
SubtreeCreationOnly/4_per_subtree-4 91.30n 90.56n ~ 0.400
SubtreeCreationOnly/64_per_subtree-4 325.2n 321.5n ~ 0.700
SubtreeCreationOnly/256_per_subtree-4 1.096µ 1.103µ ~ 0.300
SubtreeCreationOnly/1024_per_subtree-4 3.509µ 3.449µ ~ 0.200
SubtreeCreationOnly/2048_per_subtree-4 6.282µ 6.215µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 224.1n 224.2n ~ 1.000
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 221.7n 219.4n ~ 0.200
ParallelGetAndSetIfNotExists/1k_nodes-4 1.600m 1.590m ~ 0.700
ParallelGetAndSetIfNotExists/10k_nodes-4 4.069m 4.110m ~ 0.100
ParallelGetAndSetIfNotExists/50k_nodes-4 5.622m 5.670m ~ 0.100
ParallelGetAndSetIfNotExists/100k_nodes-4 7.828m 7.916m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 1.407m 1.401m ~ 0.100
SequentialGetAndSetIfNotExists/10k_nodes-4 3.537m 3.483m ~ 0.700
SequentialGetAndSetIfNotExists/50k_nodes-4 10.86m 10.71m ~ 0.700
SequentialGetAndSetIfNotExists/100k_nodes-4 19.60m 19.76m ~ 0.700
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 1.617m 1.637m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 6.603m 6.686m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 10.67m 10.59m ~ 0.700
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 1.433m 1.444m ~ 1.000
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 6.386m 6.417m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 35.12m 35.51m ~ 0.200
CalcBlockWork-4 474.8n 472.5n ~ 0.700
CalculateWork-4 639.2n 642.9n ~ 0.400
BuildBlockLocatorString_Helpers/Size_10-4 1.345µ 1.329µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_100-4 14.61µ 15.85µ ~ 1.000
BuildBlockLocatorString_Helpers/Size_1000-4 127.7µ 125.9µ ~ 0.100
CatchupWithHeaderCache-4 104.3m 104.4m ~ 0.200
_BufferPoolAllocation/16KB-4 4.100µ 4.059µ ~ 0.400
_BufferPoolAllocation/32KB-4 11.12µ 10.14µ ~ 0.100
_BufferPoolAllocation/64KB-4 16.77µ 22.26µ ~ 0.100
_BufferPoolAllocation/128KB-4 32.67µ 34.95µ ~ 0.200
_BufferPoolAllocation/512KB-4 118.1µ 144.3µ ~ 0.100
_BufferPoolConcurrent/32KB-4 21.57µ 22.28µ ~ 0.400
_BufferPoolConcurrent/64KB-4 32.09µ 33.97µ ~ 0.400
_BufferPoolConcurrent/512KB-4 160.3µ 152.9µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/16KB-4 682.1µ 670.8µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/32KB-4 678.8µ 686.7µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/64KB-4 701.7µ 737.8µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 677.2µ 739.4µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 679.1µ 648.5µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 37.65m 37.23m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/32KB-4 37.37m 37.64m ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/64KB-4 37.56m 37.43m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/128KB-4 37.34m 37.06m ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/512KB-4 37.02m 37.14m ~ 1.000
_PooledVsNonPooled/Pooled-4 742.8n 740.2n ~ 0.200
_PooledVsNonPooled/NonPooled-4 8.135µ 8.069µ ~ 0.400
_MemoryFootprint/Current_512KB_32concurrent-4 7.208µ 6.920µ ~ 0.100
_MemoryFootprint/Proposed_32KB_32concurrent-4 11.037µ 9.891µ ~ 0.100
_MemoryFootprint/Alternative_64KB_32concurrent-4 10.515µ 9.666µ ~ 0.100
_prepareTxsPerLevel-4 413.7m 414.9m ~ 1.000
_prepareTxsPerLevelOrdered-4 3.445m 3.637m ~ 0.100
_prepareTxsPerLevel_Comparison/Original-4 414.1m 410.8m ~ 0.200
_prepareTxsPerLevel_Comparison/Optimized-4 3.560m 3.679m ~ 0.700
SubtreeSizes/10k_tx_4_per_subtree-4 1.266m 1.263m ~ 1.000
SubtreeSizes/10k_tx_16_per_subtree-4 299.9µ 305.9µ ~ 0.400
SubtreeSizes/10k_tx_64_per_subtree-4 74.19µ 73.66µ ~ 0.400
SubtreeSizes/10k_tx_256_per_subtree-4 18.04µ 18.14µ ~ 0.700
SubtreeSizes/10k_tx_512_per_subtree-4 8.992µ 9.262µ ~ 0.700
SubtreeSizes/10k_tx_1024_per_subtree-4 4.515µ 4.446µ ~ 0.400
SubtreeSizes/10k_tx_2k_per_subtree-4 2.206µ 2.236µ ~ 0.300
BlockSizeScaling/10k_tx_64_per_subtree-4 70.95µ 70.38µ ~ 0.400
BlockSizeScaling/10k_tx_256_per_subtree-4 17.86µ 17.93µ ~ 1.000
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.469µ 4.411µ ~ 0.400
BlockSizeScaling/50k_tx_64_per_subtree-4 373.0µ 376.4µ ~ 0.700
BlockSizeScaling/50k_tx_256_per_subtree-4 90.02µ 89.18µ ~ 1.000
BlockSizeScaling/50k_tx_1024_per_subtree-4 21.95µ 21.84µ ~ 0.700
SubtreeAllocations/small_subtrees_exists_check-4 151.8µ 152.0µ ~ 0.400
SubtreeAllocations/small_subtrees_data_fetch-4 161.7µ 161.3µ ~ 0.700
SubtreeAllocations/small_subtrees_full_validation-4 314.6µ 310.2µ ~ 0.400
SubtreeAllocations/medium_subtrees_exists_check-4 8.962µ 8.972µ ~ 1.000
SubtreeAllocations/medium_subtrees_data_fetch-4 9.474µ 9.481µ ~ 1.000
SubtreeAllocations/medium_subtrees_full_validation-4 17.81µ 17.60µ ~ 0.700
SubtreeAllocations/large_subtrees_exists_check-4 2.126µ 2.104µ ~ 0.700
SubtreeAllocations/large_subtrees_data_fetch-4 2.269µ 2.248µ ~ 0.700
SubtreeAllocations/large_subtrees_full_validation-4 4.389µ 4.374µ ~ 0.400
StoreBlock_Sequential/BelowCSVHeight-4 347.4µ 335.2µ ~ 0.100
StoreBlock_Sequential/AboveCSVHeight-4 339.5µ 336.4µ ~ 0.700
GetUtxoHashes-4 263.5n 271.7n ~ 0.200
GetUtxoHashes_ManyOutputs-4 42.94µ 44.79µ ~ 0.100
_NewMetaDataFromBytes-4 230.4n 228.7n ~ 0.400
_Bytes-4 395.5n 399.4n ~ 0.100
_MetaBytes-4 137.2n 138.0n ~ 0.700

Threshold: >10% with p < 0.05 | Generated: 2026-06-03 11:08 UTC

@liam liam requested review from freemans13, ordishs and sugh01 June 3, 2026 15:06

@ordishs ordishs left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, minimal fix for a genuine data race on diskShard.bytesWritten.

  • Both access sites (writer AddInt64, Stats() LoadInt64) now atomic; no stragglers on plain access.
  • Atomics target the canonical array-element address — correctly paired.
  • Keeping the field a plain int64 (rather than atomic.Int64) is the right call: diskShard is value-assigned at construction, so noCopy would trip vet copylocks. Matches the existing m.count pattern.
  • TestDiskTxMap_StatsRaceWithWriters is a real -race regression — verified passing locally under -race.
  • Misleading "no atomic needed" comment corrected; Stats() now documents its concurrent-call contract.

Approve.

@liam liam merged commit cc4dc5c into bsv-blockchain:main Jun 3, 2026
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants