perf(ci): fix degenerate unit-test coverage (469MB→merged) via covdata #996

Closed
oskarszoon wants to merge 1 commit into bsv-blockchain:main from oskarszoon:ci-speed/unit-coverage-covdata

@oskarszoon

Contributor

The test job's coverage output is a 469 MB / 5.5M-line coverage.out that's ~115× duplicated: -coverpkg=./... instruments every test binary for the whole codebase, and go test concatenates (not merges) all ~105 binaries' full-codebase profiles. Only 47,667 blocks are real; coverage is 58.71%. Writing/handling that file is ~313s of the ~511s Run Go tests step (the 10,173 tests themselves run in ~94s), and SonarQube then ingests the full 469 MB.
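
To make the duplication figure concrete, here is a minimal Go sketch that counts block lines versus distinct blocks in a text coverage profile. It assumes the standard `file:start,end numStmts count` line format; the function name `duplication` and the sample profile are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// duplication reports how many block lines a text coverage profile contains
// and how many of those lines describe distinct blocks.
func duplication(profile string) (total, unique int, err error) {
	seen := make(map[string]struct{})
	for _, line := range strings.Split(profile, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "mode:") {
			continue // skip blank lines and the "mode: ..." header
		}
		// The block identity is everything before the trailing hit count.
		i := strings.LastIndexByte(line, ' ')
		if i < 0 {
			return 0, 0, fmt.Errorf("malformed profile line: %q", line)
		}
		total++
		seen[line[:i]] = struct{}{}
	}
	return total, len(seen), nil
}

func main() {
	// Hypothetical profile in which one block appears twice.
	profile := "mode: atomic\n" +
		"example.com/m/a.go:3.10,5.2 2 1\n" +
		"example.com/m/b.go:7.1,9.2 1 0\n" +
		"example.com/m/a.go:3.10,5.2 2 0\n"
	total, unique, err := duplication(profile)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d lines, %d unique blocks, %.1fx duplication\n",
		total, unique, float64(total)/float64(unique))
}
```

With the figures quoted above, 5.5M lines over 47,667 distinct blocks works out to roughly the ~115× stated. A real 469 MB file would be streamed line by line rather than held in a string.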

Fix

Switch make test to Go binary coverage: -cover -coverpkg=./... … -args -test.gocoverdir=<dir>, then merge once with go tool covdata textfmt. Same instrumentation, same coverage — but the output is properly merged into a few-MB profile instead of a 469 MB concatenation.

Verified the mechanism locally: covdata output is deduplicated (1,334 blocks → 1,335 lines, i.e. one mode: header plus one line per block, no duplication), the coverage % is identical, the format is standard Go coverage (mode: atomic), gotestsum forwards the flags, and it is -race compatible.
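
As an illustration of what merging (rather than concatenating) means for a text profile, the sketch below collapses repeated block lines into one line per block and sums their hit counts. This is not how `go tool covdata` is implemented; `mergeProfile` is a hypothetical helper, and summing counts is an assumption that suits the `count` and `atomic` modes.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// mergeProfile collapses repeated block lines of a text coverage profile into
// one line per block, summing the hit counts of the repeats.
func mergeProfile(profile string) (string, error) {
	counts := make(map[string]uint64)
	mode := "mode: atomic" // assumed when the input carries no header
	for _, line := range strings.Split(profile, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		if strings.HasPrefix(line, "mode:") {
			mode = line
			continue
		}
		i := strings.LastIndexByte(line, ' ')
		if i < 0 {
			return "", fmt.Errorf("malformed profile line: %q", line)
		}
		n, err := strconv.ParseUint(line[i+1:], 10, 64)
		if err != nil {
			return "", fmt.Errorf("bad count in %q: %w", line, err)
		}
		counts[line[:i]] += n
	}
	// Sort block keys so the output is deterministic.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(mode + "\n")
	for _, k := range keys {
		fmt.Fprintf(&b, "%s %d\n", k, counts[k])
	}
	return b.String(), nil
}

func main() {
	// Hypothetical concatenated input: a.go's block is reported twice.
	in := "mode: atomic\na.go:1.1,2.2 1 1\nb.go:1.1,2.2 1 0\na.go:1.1,2.2 1 2\n"
	out, err := mergeProfile(in)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The merged output has one header line plus one line per distinct block, which is the shape behind the 1,334 blocks → 1,335 lines check above.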

Expected

  • Run Go tests step ~511s → ~230s (the ~313s coverage-write tail goes away); test job ~9.2m → ~4.5m.
  • coverage.out ~469 MB → a few MB; faster artifact upload + SonarQube ingest.
  • SonarQube coverage unchanged (~58.71%) — same filename/format, just merged.

This speeds the unit job and cuts runner-minutes + SonarQube load. It does not lower the overall PR gate (set by pr-smoketests ~10.6m; pr-tests sits just under it) — it's an efficiency + faster-unit-feedback fix.

Verify on this PR

  • test job duration drops.
  • SonarQube coverage % stays ~58.71% (not lower).
  • coverage.out artifact is small (few MB, not ~469 MB).

make test used -coverprofile + -coverpkg=./..., which made go test concatenate
(not merge) every test binary's full-codebase profile -> a 469MB/5.5M-line
coverage.out (~115x duplicated; 47,667 real blocks, 58.71% coverage). Writing it
was ~313s of the ~511s test step; SonarQube ingested all 469MB.

Switch to binary coverage (-cover -coverpkg=./... -args -test.gocoverdir=DIR)
merged once with 'go tool covdata textfmt'. Same coverage, merged into a few-MB
profile. Expected: test step ~511s->~230s, coverage.out 469MB->few MB.

@github-actions

github-actions Bot commented Jun 1, 2026

Contributor

🤖 Claude Code Review

Status: Complete


Current Review:

No issues found. The changes correctly implement binary coverage collection via GOCOVERDIR, replacing the degenerate concatenated -coverprofile approach. The implementation follows Go's standard binary coverage pattern and properly cleans up the temporary directory.

The .gitignore addition appropriately excludes the coverage directory from version control.

@github-actions

github-actions Bot commented Jun 1, 2026

Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-996 (e4996a6)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 144
  • Significance level: p < 0.05
All benchmark results (sec/op)

| Benchmark | Baseline | Current | Change | p-value |
|---|---|---|---|---|
| `_NewBlockFromBytes-4` | 1.655µ | 1.638µ | ~ | 0.200 |
| `SplitSyncedParentMap_SetIfNotExists/256_buckets-4` | 71.12n | 71.26n | ~ | 0.400 |
| `SplitSyncedParentMap_SetIfNotExists/16_buckets-4` | 71.42n | 71.78n | ~ | 1.000 |
| `SplitSyncedParentMap_SetIfNotExists/1_bucket-4` | 71.32n | 71.46n | ~ | 0.700 |
| `SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets...` | 36.05n | 33.16n | ~ | 0.700 |
| `SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_...` | 58.88n | 57.82n | ~ | 0.700 |
| `SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa...` | 198.8n | 162.1n | ~ | 0.600 |
| `MiningCandidate_Stringify_Short-4` | 221.4n | 219.9n | ~ | 0.200 |
| `MiningCandidate_Stringify_Long-4` | 1.661µ | 1.649µ | ~ | 0.100 |
| `MiningSolution_Stringify-4` | 847.3n | 851.0n | ~ | 0.400 |
| `BlockInfo_MarshalJSON-4` | 1.744µ | 1.788µ | ~ | 0.400 |
| `NewFromBytes-4` | 127.5n | 128.5n | ~ | 0.100 |
| `AddTxBatchColumnar_Validation-4` | 2.441µ | 2.465µ | ~ | 0.200 |
| `OffsetValidationLoop-4` | 641.7n | 640.0n | ~ | 0.700 |
| `Mine_EasyDifficulty-4` | 67.28µ | 66.81µ | ~ | 0.100 |
| `Mine_WithAddress-4` | 6.963µ | 7.736µ | ~ | 0.700 |
| `BlockAssembler_AddTx-4` | 0.02560n | 0.02590n | ~ | 0.700 |
| `AddNode-4` | 10.39 | 10.98 | ~ | 0.200 |
| `AddNodeWithMap-4` | 11.56 | 10.68 | ~ | 0.100 |
| `DirectSubtreeAdd/4_per_subtree-4` | 61.12n | 60.03n | ~ | 1.000 |
| `DirectSubtreeAdd/64_per_subtree-4` | 29.02n | 28.82n | ~ | 0.700 |
| `DirectSubtreeAdd/256_per_subtree-4` | 27.73n | 27.79n | ~ | 0.700 |
| `DirectSubtreeAdd/1024_per_subtree-4` | 26.46n | 26.48n | ~ | 0.400 |
| `DirectSubtreeAdd/2048_per_subtree-4` | 26.01n | 26.09n | ~ | 0.200 |
| `SubtreeProcessorAdd/4_per_subtree-4` | 295.2n | 296.5n | ~ | 1.000 |
| `SubtreeProcessorAdd/64_per_subtree-4` | 286.6n | 285.1n | ~ | 0.700 |
| `SubtreeProcessorAdd/256_per_subtree-4` | 287.6n | 286.2n | ~ | 1.000 |
| `SubtreeProcessorAdd/1024_per_subtree-4` | 278.2n | 286.9n | ~ | 0.100 |
| `SubtreeProcessorAdd/2048_per_subtree-4` | 278.5n | 287.1n | ~ | 0.100 |
| `SubtreeProcessorRotate/4_per_subtree-4` | 283.8n | 281.2n | ~ | 0.700 |
| `SubtreeProcessorRotate/64_per_subtree-4` | 278.2n | 282.1n | ~ | 0.700 |
| `SubtreeProcessorRotate/256_per_subtree-4` | 283.5n | 281.8n | ~ | 0.400 |
| `SubtreeProcessorRotate/1024_per_subtree-4` | 289.0n | 279.3n | ~ | 0.100 |
| `SubtreeNodeAddOnly/4_per_subtree-4` | 55.19n | 55.16n | ~ | 0.400 |
| `SubtreeNodeAddOnly/64_per_subtree-4` | 36.21n | 36.13n | ~ | 0.700 |
| `SubtreeNodeAddOnly/256_per_subtree-4` | 35.19n | 35.15n | ~ | 0.400 |
| `SubtreeNodeAddOnly/1024_per_subtree-4` | 34.54n | 34.59n | ~ | 0.400 |
| `SubtreeCreationOnly/4_per_subtree-4` | 110.9n | 112.5n | ~ | 0.100 |
| `SubtreeCreationOnly/64_per_subtree-4` | 348.7n | 354.0n | ~ | 0.100 |
| `SubtreeCreationOnly/256_per_subtree-4` | 1.230µ | 1.239µ | ~ | 0.100 |
| `SubtreeCreationOnly/1024_per_subtree-4` | 3.879µ | 3.814µ | ~ | 0.100 |
| `SubtreeCreationOnly/2048_per_subtree-4` | 6.800µ | 6.762µ | ~ | 0.700 |
| `SubtreeProcessorOverheadBreakdown/64_per_subtree-4` | 286.6n | 278.9n | ~ | 0.700 |
| `SubtreeProcessorOverheadBreakdown/1024_per_subtree-4` | 278.9n | 283.2n | ~ | 0.100 |
| `ParallelGetAndSetIfNotExists/1k_nodes-4` | 2.012m | 2.001m | ~ | 0.700 |
| `ParallelGetAndSetIfNotExists/10k_nodes-4` | 5.224m | 5.257m | ~ | 1.000 |
| `ParallelGetAndSetIfNotExists/50k_nodes-4` | 7.178m | 7.381m | ~ | 0.200 |
| `ParallelGetAndSetIfNotExists/100k_nodes-4` | 9.724m | 10.120m | ~ | 0.100 |
| `SequentialGetAndSetIfNotExists/1k_nodes-4` | 1.774m | 1.786m | ~ | 0.700 |
| `SequentialGetAndSetIfNotExists/10k_nodes-4` | 4.525m | 4.501m | ~ | 1.000 |
| `SequentialGetAndSetIfNotExists/50k_nodes-4` | 14.56m | 13.91m | ~ | 0.200 |
| `SequentialGetAndSetIfNotExists/100k_nodes-4` | 26.76m | 26.04m | ~ | 1.000 |
| `ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4` | 2.054m | 2.080m | ~ | 0.700 |
| `ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4` | 8.446m | 8.528m | ~ | 0.200 |
| `ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4` | 13.83m | 13.89m | ~ | 0.700 |
| `ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4` | 1.815m | 1.804m | ~ | 0.700 |
| `ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4` | 8.766m | 8.495m | ~ | 0.100 |
| `ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4` | 43.48m | 47.66m | ~ | 0.100 |
| `DiskTxMap_SetIfNotExists-4` | 3.862µ | 3.652µ | ~ | 0.200 |
| `DiskTxMap_SetIfNotExists_Parallel-4` | 3.560µ | 3.514µ | ~ | 1.000 |
| `DiskTxMap_ExistenceOnly-4` | 342.5n | 334.9n | ~ | 1.000 |
| `Queue-4` | 191.3n | 186.8n | ~ | 0.100 |
| `AtomicPointer-4` | 3.686n | 3.634n | ~ | 0.200 |
| `ReorgOptimizations/DedupFilterPipeline/Old/10K-4` | 838.9µ | 854.5µ | ~ | 0.400 |
| `ReorgOptimizations/DedupFilterPipeline/New/10K-4` | 771.7µ | 766.1µ | ~ | 0.700 |
| `ReorgOptimizations/AllMarkFalse/Old/10K-4` | 116.8µ | 106.6µ | ~ | 0.100 |
| `ReorgOptimizations/AllMarkFalse/New/10K-4` | 64.20µ | 64.54µ | ~ | 0.700 |
| `ReorgOptimizations/HashSlicePool/Old/10K-4` | 66.61µ | 56.15µ | ~ | 0.200 |
| `ReorgOptimizations/HashSlicePool/New/10K-4` | 11.10µ | 11.20µ | ~ | 0.700 |
| `ReorgOptimizations/NodeFlags/Old/10K-4` | 4.566µ | 4.429µ | ~ | 0.100 |
| `ReorgOptimizations/NodeFlags/New/10K-4` | 1.584µ | 1.529µ | ~ | 0.100 |
| `ReorgOptimizations/DedupFilterPipeline/Old/100K-4` | 9.910m | 9.340m | ~ | 0.400 |
| `ReorgOptimizations/DedupFilterPipeline/New/100K-4` | 10.50m | 10.27m | ~ | 0.400 |
| `ReorgOptimizations/AllMarkFalse/Old/100K-4` | 1.156m | 1.159m | ~ | 1.000 |
| `ReorgOptimizations/AllMarkFalse/New/100K-4` | 704.8µ | 704.5µ | ~ | 1.000 |
| `ReorgOptimizations/HashSlicePool/Old/100K-4` | 564.7µ | 507.9µ | ~ | 0.700 |
| `ReorgOptimizations/HashSlicePool/New/100K-4` | 208.0µ | 207.4µ | ~ | 1.000 |
| `ReorgOptimizations/NodeFlags/Old/100K-4` | 49.18µ | 51.88µ | ~ | 0.200 |
| `ReorgOptimizations/NodeFlags/New/100K-4` | 17.29µ | 18.04µ | ~ | 0.100 |
| `TxMapSetIfNotExists-4` | 49.27n | 50.09n | ~ | 0.200 |
| `TxMapSetIfNotExistsDuplicate-4` | 41.16n | 41.65n | ~ | 0.100 |
| `ChannelSendReceive-4` | 590.7n | 580.7n | ~ | 0.400 |
| `CalcBlockWork-4` | 509.9n | 517.7n | ~ | 0.100 |
| `CalculateWork-4` | 697.1n | 716.9n | ~ | 0.700 |
| `BuildBlockLocatorString_Helpers/Size_10-4` | 1.509µ | 1.482µ | ~ | 1.000 |
| `BuildBlockLocatorString_Helpers/Size_100-4` | 13.04µ | 12.98µ | ~ | 1.000 |
| `BuildBlockLocatorString_Helpers/Size_1000-4` | 128.1µ | 127.9µ | ~ | 0.400 |
| `CatchupWithHeaderCache-4` | 104.5m | 104.4m | ~ | 0.400 |
| `SubtreeSizes/10k_tx_4_per_subtree-4` | 1.338m | 1.334m | ~ | 0.700 |
| `SubtreeSizes/10k_tx_16_per_subtree-4` | 322.3µ | 312.9µ | ~ | 0.700 |
| `SubtreeSizes/10k_tx_64_per_subtree-4` | 76.55µ | 75.72µ | ~ | 0.700 |
| `SubtreeSizes/10k_tx_256_per_subtree-4` | 19.27µ | 18.85µ | ~ | 0.100 |
| `SubtreeSizes/10k_tx_512_per_subtree-4` | 9.599µ | 9.348µ | ~ | 0.100 |
| `SubtreeSizes/10k_tx_1024_per_subtree-4` | 4.761µ | 4.689µ | ~ | 0.100 |
| `SubtreeSizes/10k_tx_2k_per_subtree-4` | 2.369µ | 2.351µ | ~ | 0.200 |
| `BlockSizeScaling/10k_tx_64_per_subtree-4` | 75.52µ | 74.89µ | ~ | 0.400 |
| `BlockSizeScaling/10k_tx_256_per_subtree-4` | 19.08µ | 19.07µ | ~ | 1.000 |
| `BlockSizeScaling/10k_tx_1024_per_subtree-4` | 4.783µ | 4.720µ | ~ | 0.100 |
| `BlockSizeScaling/50k_tx_64_per_subtree-4` | 400.8µ | 398.9µ | ~ | 1.000 |
| `BlockSizeScaling/50k_tx_256_per_subtree-4` | 96.53µ | 94.15µ | ~ | 0.100 |
| `BlockSizeScaling/50k_tx_1024_per_subtree-4` | 23.70µ | 23.48µ | ~ | 0.200 |
| `SubtreeAllocations/small_subtrees_exists_check-4` | 163.7µ | 160.2µ | ~ | 0.200 |
| `SubtreeAllocations/small_subtrees_data_fetch-4` | 170.2µ | 167.8µ | ~ | 0.100 |
| `SubtreeAllocations/small_subtrees_full_validation-4` | 337.4µ | 329.9µ | ~ | 0.200 |
| `SubtreeAllocations/medium_subtrees_exists_check-4` | 9.801µ | 9.592µ | ~ | 0.100 |
| `SubtreeAllocations/medium_subtrees_data_fetch-4` | 10.054µ | 9.982µ | ~ | 0.700 |
| `SubtreeAllocations/medium_subtrees_full_validation-4` | 19.54µ | 19.24µ | ~ | 0.200 |
| `SubtreeAllocations/large_subtrees_exists_check-4` | 2.283µ | 2.258µ | ~ | 0.200 |
| `SubtreeAllocations/large_subtrees_data_fetch-4` | 2.458µ | 2.380µ | ~ | 0.100 |
| `SubtreeAllocations/large_subtrees_full_validation-4` | 4.850µ | 4.767µ | ~ | 0.100 |
| `_BufferPoolAllocation/16KB-4` | 3.964µ | 3.885µ | ~ | 0.700 |
| `_BufferPoolAllocation/32KB-4` | 10.228µ | 8.046µ | ~ | 0.200 |
| `_BufferPoolAllocation/64KB-4` | 16.63µ | 18.41µ | ~ | 0.400 |
| `_BufferPoolAllocation/128KB-4` | 34.78µ | 34.92µ | ~ | 0.400 |
| `_BufferPoolAllocation/512KB-4` | 106.2µ | 117.3µ | ~ | 0.100 |
| `_BufferPoolConcurrent/32KB-4` | 19.26µ | 21.14µ | ~ | 0.400 |
| `_BufferPoolConcurrent/64KB-4` | 30.15µ | 29.28µ | ~ | 0.700 |
| `_BufferPoolConcurrent/512KB-4` | 146.5µ | 143.9µ | ~ | 0.100 |
| `_SubtreeDeserializationWithBufferSizes/16KB-4` | 677.5µ | 640.3µ | ~ | 0.100 |
| `_SubtreeDeserializationWithBufferSizes/32KB-4` | 700.1µ | 628.4µ | ~ | 0.100 |
| `_SubtreeDeserializationWithBufferSizes/64KB-4` | 679.5µ | 626.4µ | ~ | 0.100 |
| `_SubtreeDeserializationWithBufferSizes/128KB-4` | 672.0µ | 635.1µ | ~ | 0.100 |
| `_SubtreeDeserializationWithBufferSizes/512KB-4` | 615.4µ | 587.7µ | ~ | 0.100 |
| `_SubtreeDataDeserializationWithBufferSizes/16KB-4` | 36.76m | 37.08m | ~ | 0.100 |
| `_SubtreeDataDeserializationWithBufferSizes/32KB-4` | 36.67m | 36.83m | ~ | 0.200 |
| `_SubtreeDataDeserializationWithBufferSizes/64KB-4` | 36.59m | 36.93m | ~ | 0.200 |
| `_SubtreeDataDeserializationWithBufferSizes/128KB-4` | 36.61m | 36.71m | ~ | 0.400 |
| `_SubtreeDataDeserializationWithBufferSizes/512KB-4` | 36.31m | 36.56m | ~ | 0.200 |
| `_PooledVsNonPooled/Pooled-4` | 831.8n | 672.3n | ~ | 0.100 |
| `_PooledVsNonPooled/NonPooled-4` | 7.670µ | 7.703µ | ~ | 1.000 |
| `_MemoryFootprint/Current_512KB_32concurrent-4` | 6.728µ | 6.689µ | ~ | 0.400 |
| `_MemoryFootprint/Proposed_32KB_32concurrent-4` | 9.556µ | 9.618µ | ~ | 0.400 |
| `_MemoryFootprint/Alternative_64KB_32concurrent-4` | 9.092µ | 9.111µ | ~ | 1.000 |
| `_prepareTxsPerLevel-4` | 309.6m | 357.7m | ~ | 0.100 |
| `_prepareTxsPerLevelOrdered-4` | 2.969m | 4.010m | ~ | 0.100 |
| `_prepareTxsPerLevel_Comparison/Original-4` | 308.3m | 321.9m | ~ | 0.100 |
| `_prepareTxsPerLevel_Comparison/Optimized-4` | 2.998m | 3.121m | ~ | 0.200 |
| `StoreBlock_Sequential/BelowCSVHeight-4` | 333.3µ | 332.3µ | ~ | 0.700 |
| `StoreBlock_Sequential/AboveCSVHeight-4` | 336.6µ | 346.2µ | ~ | 0.100 |
| `GetUtxoHashes-4` | 258.8n | 257.2n | ~ | 1.000 |
| `GetUtxoHashes_ManyOutputs-4` | 43.23µ | 43.06µ | ~ | 0.400 |
| `_NewMetaDataFromBytes-4` | 228.9n | 229.0n | ~ | 1.000 |
| `_Bytes-4` | 402.4n | 402.3n | ~ | 1.000 |
| `_MetaBytes-4` | 138.2n | 139.0n | ~ | 0.700 |

Threshold: >10% with p < 0.05 | Generated: 2026-06-01 11:53 UTC
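
The threshold line explains why every row shows `~` even where the raw numbers move noticeably. A minimal Go sketch of that rule follows, assuming a change is flagged only when it exceeds 10% and has p < 0.05; the `classify` function is hypothetical and the report generator's actual logic is not shown in this thread.

```go
package main

import (
	"fmt"
	"math"
)

// classify applies the stated rule to one benchmark row. Values are sec/op,
// so a higher current value is slower.
func classify(baseline, current, p float64) string {
	change := (current - baseline) / baseline
	// Both conditions must hold for a row to be flagged.
	if math.Abs(change) <= 0.10 || p >= 0.05 {
		return "unchanged"
	}
	if change > 0 {
		return "regression"
	}
	return "improvement"
}

func main() {
	// _prepareTxsPerLevelOrdered-4 from the table: 2.969m -> 4.010m, p = 0.100.
	// The raw delta is about +35%, but the p-value misses the 0.05 cutoff.
	fmt.Println(classify(2.969, 4.010, 0.100))
}
```

Under this reading, a large swing with p = 0.100 is treated as noise, which matches the report's 0 regressions and 0 improvements.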

@sonarqubecloud

sonarqubecloud Bot commented Jun 1, 2026

@oskarszoon

Contributor Author

Closing — the change doesn't deliver what I expected.

Measured on the green re-run (commit 33e17d5):

  • No speedup: test job 547s vs 551s baseline. The ~326s tail inside go test is -coverpkg=./... finalizing full-codebase coverage counters per binary — present in both the text-profile and binary-coverage paths, so changing the output format doesn't help. I'd wrongly attributed that tail to the profile write.
  • coverage.out does shrink 469 MB → 3.9 MB (the file was ~115× duplicated), which only helps SonarQube ingest / artifact size — cosmetic.
  • It shifts reported coverage 58.71% → 62.87% (covdata drops no-test packages from the denominator), which flatters the number without more being covered.

Not worth changing the coverage figure for a cosmetic file-size win, and it doesn't move the PR gate (pr-smoketests, ~10.6m) regardless. The flaky services/blockassembly system tests that reddened the first run are tracked in #997.
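
The coverage shift can be checked with simple arithmetic. If the covered count stays fixed, as the closing comment states, then the ratio of the new denominator to the old one equals old % / new %. The Go sketch below works that out and then illustrates the effect with hypothetical statement counts, which are not measurements from this repository.

```go
package main

import "fmt"

// coverage returns covered/total as a percentage.
func coverage(covered, total float64) float64 {
	return 100 * covered / total
}

// denominatorRatio returns newTotal/oldTotal, assuming the covered count is
// the same before and after: covered = oldPct*oldTotal = newPct*newTotal.
func denominatorRatio(oldPct, newPct float64) float64 {
	return oldPct / newPct
}

func main() {
	// Figures from the closing comment: 58.71% before, 62.87% after.
	r := denominatorRatio(58.71, 62.87)
	fmt.Printf("new denominator is %.1f%% of the old one\n", 100*r)

	// Hypothetical counts, chosen only to illustrate the effect: dropping
	// 100 untested statements raises the percentage with nothing new covered.
	const covered, total, untested = 600.0, 1000.0, 100.0
	fmt.Printf("%.1f%% -> %.1f%%\n",
		coverage(covered, total), coverage(covered, total-untested))
}
```

On the reported percentages this implies that roughly 6 to 7% of the original denominator belonged to packages without tests, under the fixed-numerator assumption.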

@oskarszoon oskarszoon closed this Jun 1, 2026
@oskarszoon oskarszoon deleted the ci-speed/unit-coverage-covdata branch June 1, 2026 12:24