Skip to content

fix(ci): prevent zombie sticky comments on CCR timeout or race#860

Merged
oskarszoon merged 1 commit into
bsv-blockchain:mainfrom
oskarszoon:fix/ccr-zombie-sticky-comments
May 13, 2026
Merged

fix(ci): prevent zombie sticky comments on CCR timeout or race#860
oskarszoon merged 1 commit into
bsv-blockchain:mainfrom
oskarszoon:fix/ccr-zombie-sticky-comments

Conversation

@oskarszoon

Copy link
Copy Markdown
Contributor

Problem

On PR #845, three stale "🤖 Claude Code Review — In Progress" comments accumulated over multiple CCR runs. Each had only the early checkboxes ticked; none reached "Complete". Two root causes in .github/workflows/claude-code-review.yml:

  1. No workflow concurrency. Overlapping CCR runs each independently ran the "find existing sticky" search at the LLM step, and within the race window neither saw the other's just-created comment — so both created new ones.
  2. No timeout cleanup. When the Claude job hit a timeout or runtime error after creating the sticky but before marking it Complete, the "In Progress" stub stayed forever. Pure prompt-level enforcement of "exactly one sticky" can't recover from this.

Fix

A. Workflow-level concurrency group

concurrency:
  group: claude-code-review-${{ github.event.workflow_run.pull_requests[0].number || github.event.workflow_run.head_branch }}
  cancel-in-progress: true

Ensures only one CCR run per PR is alive at a time. The replacement run finds the existing sticky and updates it in-place, so concurrency-cancellations don't leave zombies. head_branch fallback covers fork PRs where workflow_run.pull_requests is empty.

C. Post-step finalizer on failure
A failure()-gated step that queries for any "🤖 Claude Code Review" comments containing "In Progress" and PATCHes them to an "Aborted" body with a link to the failed run URL. Only true failures (timeout, API error, runner death) trigger it — concurrency-cancellations are deliberately excluded because the queued replacement run handles those.

Why not also B (pre-step ID resolution) and D (cleanup-at-start)?

A + C addresses the two observed failure modes. B and D were considered but skipped for now:

  • B (resolve sticky ID in bash before LLM runs) reduces prompt-drift risk but adds complexity and a second source of truth for the ID. Punt unless A+C proves insufficient.
  • D (delete duplicate stickies at start) is defensive against the same race A solves at the source. Redundant once A is in place.

Risks

  • head_branch fallback collisions on fork PRs. Two fork PRs from different forks but with the same branch name would share a concurrency group. Worst case: one run cancels an unrelated PR's review, which then retriggers on the next push. Acceptable for a first iteration; the alternative (collapsing fork PRs into the same group) is no worse than the status quo.
  • Finalizer race during concurrency cancellation. When run A is cancelled by run B, A's post-step could in theory run while B is updating the sticky. Mitigated by gating on failure() only — cancellation does not trigger the finalizer, so this race doesn't occur in practice.
  • gh api -f body= multi-line handling. gh converts -f key=value to JSON for PATCH; multi-line shell variables embed real newlines that gh serialises as \n. Confirmed working in similar workflows; if a future GH CLI release changes this, the worst case is an ugly-looking aborted comment, not a workflow failure (the || true guards against errors).

Verification

  • actionlint .github/workflows/claude-code-review.yml — clean.
  • YAML parses (ruby YAML.load_file).
  • Behaviour test (manual, post-merge): trigger a CCR run, push another commit before it completes. Expect: first run cancelled, second runs to completion, exactly one sticky comment.

Not in scope

  • .github/workflows/sonar-pr-analyze.yaml does not post sticky comments, so it doesn't share this bug. No change needed.

Two failure modes left "In Progress" Claude Code Review comments stuck on
PRs (e.g. PR bsv-blockchain#845 had 3 stale stubs):

1. No workflow concurrency — overlapping CCR runs each created their own
   sticky comment because each LLM independently ran the "find existing"
   search and missed the other's just-created comment.
2. No timeout cleanup — when the Claude job hit a timeout or runtime error
   after creating the sticky but before marking Complete, the stub stayed
   forever.

Fix:
- Add a workflow-level concurrency group keyed on PR number (with
  head_branch fallback for fork PRs that get an empty pull_requests array
  on workflow_run). cancel-in-progress: true ensures only one CCR run per
  PR is alive at a time; the replacement run finds the existing sticky
  and updates it in-place.
- Add a post-step that runs on failure() and PATCHes any remaining
  "In Progress" Claude sticky comments to an "Aborted" state, with a link
  to the failed workflow run. Concurrency-cancellations are intentionally
  excluded — the replacement run handles those.
@github-actions

github-actions Bot commented May 13, 2026

Copy link
Copy Markdown
Contributor

🤖 Claude Code Review

Status: Complete


Current Review:

No blocking issues found. The PR correctly addresses the zombie comment problem described in #845.

Minor observations (already acknowledged in PR description):

  1. Fork PR concurrency collision (line 30): Two fork PRs with identical branch names from different forks will share a concurrency group. The PR author correctly identifies this as low-risk — fork PRs typically use unique branch names, and worst case is a cancelled review that retriggers on next push.

  2. Cleanup scope (line 230): The failure cleanup marks ALL in-progress Claude comments as aborted, not just ones from the current run. This is acceptable given the concurrency control prevents overlapping runs in normal operation.

Code quality: The implementation is sound. The set -euo pipefail and || true guards handle errors appropriately. The printf multiline formatting will work correctly with gh api -f.

Verification: The PR description notes actionlint passes and YAML parses correctly. Manual testing post-merge is appropriately planned.

@sonarqubecloud

Copy link
Copy Markdown

@oskarszoon oskarszoon enabled auto-merge (squash) May 13, 2026 13:43
@github-actions

Copy link
Copy Markdown
Contributor

Benchmark Comparison Report

Baseline: main (unknown)

Current: PR-860 (ff1b74e)

Summary

  • Regressions: 0
  • Improvements: 0
  • Unchanged: 142
  • Significance level: p < 0.05
All benchmark results (sec/op)
Benchmark Baseline Current Change p-value
_NewBlockFromBytes-4 1.564µ 1.576µ ~ 0.400
SplitSyncedParentMap_SetIfNotExists/256_buckets-4 71.32n 71.95n ~ 0.800
SplitSyncedParentMap_SetIfNotExists/16_buckets-4 71.59n 71.28n ~ 0.400
SplitSyncedParentMap_SetIfNotExists/1_bucket-4 71.10n 71.09n ~ 0.400
SplitSyncedParentMap_ConcurrentSetIfNotExists/256_buckets... 35.86n 33.24n ~ 0.200
SplitSyncedParentMap_ConcurrentSetIfNotExists/16_buckets_... 58.58n 57.90n ~ 1.000
SplitSyncedParentMap_ConcurrentSetIfNotExists/1_bucket_pa... 138.7n 139.5n ~ 1.000
MiningCandidate_Stringify_Short-4 221.0n 222.9n ~ 0.400
MiningCandidate_Stringify_Long-4 1.683µ 1.667µ ~ 0.700
MiningSolution_Stringify-4 877.4n 862.0n ~ 0.100
BlockInfo_MarshalJSON-4 1.835µ 1.808µ ~ 0.400
NewFromBytes-4 128.7n 128.5n ~ 0.700
Mine_EasyDifficulty-4 62.05µ 62.28µ ~ 0.700
Mine_WithAddress-4 7.345µ 7.209µ ~ 0.600
BlockAssembler_AddTx-4 0.02758n 0.03115n ~ 0.400
AddNode-4 10.91 11.09 ~ 0.400
AddNodeWithMap-4 10.80 11.18 ~ 0.400
DiskTxMap_SetIfNotExists-4 3.628µ 3.669µ ~ 0.400
DiskTxMap_SetIfNotExists_Parallel-4 3.268µ 3.417µ ~ 0.100
DiskTxMap_ExistenceOnly-4 293.8n 304.5n ~ 0.100
Queue-4 198.8n 196.3n ~ 0.200
AtomicPointer-4 4.697n 4.729n ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/10K-4 853.1µ 882.4µ ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/10K-4 844.5µ 864.7µ ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/10K-4 113.0µ 117.7µ ~ 0.100
ReorgOptimizations/AllMarkFalse/New/10K-4 62.49µ 61.88µ ~ 0.100
ReorgOptimizations/HashSlicePool/Old/10K-4 63.18µ 69.59µ ~ 0.400
ReorgOptimizations/HashSlicePool/New/10K-4 11.70µ 10.74µ ~ 0.100
ReorgOptimizations/NodeFlags/Old/10K-4 5.327µ 5.514µ ~ 0.700
ReorgOptimizations/NodeFlags/New/10K-4 1.841µ 1.820µ ~ 1.000
ReorgOptimizations/DedupFilterPipeline/Old/100K-4 9.338m 10.994m ~ 0.100
ReorgOptimizations/DedupFilterPipeline/New/100K-4 9.278m 9.938m ~ 0.100
ReorgOptimizations/AllMarkFalse/Old/100K-4 1.082m 1.095m ~ 0.100
ReorgOptimizations/AllMarkFalse/New/100K-4 682.0µ 677.0µ ~ 0.400
ReorgOptimizations/HashSlicePool/Old/100K-4 683.4µ 686.4µ ~ 1.000
ReorgOptimizations/HashSlicePool/New/100K-4 337.7µ 303.0µ ~ 0.700
ReorgOptimizations/NodeFlags/Old/100K-4 54.50µ 56.53µ ~ 0.200
ReorgOptimizations/NodeFlags/New/100K-4 20.03µ 19.69µ ~ 0.700
TxMapSetIfNotExists-4 51.37n 51.34n ~ 1.000
TxMapSetIfNotExistsDuplicate-4 38.31n 38.29n ~ 0.800
ChannelSendReceive-4 600.6n 602.2n ~ 0.700
DirectSubtreeAdd/4_per_subtree-4 60.66n 60.20n ~ 0.700
DirectSubtreeAdd/64_per_subtree-4 29.31n 29.47n ~ 1.000
DirectSubtreeAdd/256_per_subtree-4 28.00n 27.95n ~ 0.300
DirectSubtreeAdd/1024_per_subtree-4 26.77n 26.76n ~ 0.700
DirectSubtreeAdd/2048_per_subtree-4 26.28n 26.34n ~ 0.700
SubtreeProcessorAdd/4_per_subtree-4 280.2n 282.5n ~ 1.000
SubtreeProcessorAdd/64_per_subtree-4 273.1n 280.5n ~ 0.200
SubtreeProcessorAdd/256_per_subtree-4 275.2n 274.6n ~ 0.700
SubtreeProcessorAdd/1024_per_subtree-4 268.5n 266.8n ~ 0.100
SubtreeProcessorAdd/2048_per_subtree-4 273.4n 267.1n ~ 0.100
SubtreeProcessorRotate/4_per_subtree-4 273.8n 272.0n ~ 0.700
SubtreeProcessorRotate/64_per_subtree-4 271.1n 272.0n ~ 1.000
SubtreeProcessorRotate/256_per_subtree-4 271.4n 270.6n ~ 0.800
SubtreeProcessorRotate/1024_per_subtree-4 271.6n 269.1n ~ 0.500
SubtreeNodeAddOnly/4_per_subtree-4 55.63n 56.20n ~ 0.700
SubtreeNodeAddOnly/64_per_subtree-4 36.41n 36.41n ~ 1.000
SubtreeNodeAddOnly/256_per_subtree-4 35.32n 35.28n ~ 1.000
SubtreeNodeAddOnly/1024_per_subtree-4 34.70n 34.59n ~ 0.200
SubtreeCreationOnly/4_per_subtree-4 112.6n 112.9n ~ 0.500
SubtreeCreationOnly/64_per_subtree-4 357.0n 362.1n ~ 0.400
SubtreeCreationOnly/256_per_subtree-4 1.238µ 1.251µ ~ 0.300
SubtreeCreationOnly/1024_per_subtree-4 3.836µ 3.898µ ~ 0.400
SubtreeCreationOnly/2048_per_subtree-4 7.033µ 7.259µ ~ 0.100
SubtreeProcessorOverheadBreakdown/64_per_subtree-4 271.4n 272.4n ~ 0.100
SubtreeProcessorOverheadBreakdown/1024_per_subtree-4 274.4n 272.9n ~ 1.000
ParallelGetAndSetIfNotExists/1k_nodes-4 546.5µ 556.7µ ~ 0.100
ParallelGetAndSetIfNotExists/10k_nodes-4 1.363m 1.366m ~ 0.700
ParallelGetAndSetIfNotExists/50k_nodes-4 6.630m 6.720m ~ 0.700
ParallelGetAndSetIfNotExists/100k_nodes-4 13.42m 13.16m ~ 0.100
SequentialGetAndSetIfNotExists/1k_nodes-4 630.8µ 618.7µ ~ 0.400
SequentialGetAndSetIfNotExists/10k_nodes-4 2.891m 2.829m ~ 0.400
SequentialGetAndSetIfNotExists/50k_nodes-4 11.07m 11.18m ~ 0.400
SequentialGetAndSetIfNotExists/100k_nodes-4 21.41m 21.63m ~ 0.100
ProcessOwnBlockSubtreeNodesParallel/1k_nodes-4 603.4µ 598.1µ ~ 1.000
ProcessOwnBlockSubtreeNodesParallel/10k_nodes-4 4.606m 4.631m ~ 0.400
ProcessOwnBlockSubtreeNodesParallel/100k_nodes-4 17.17m 17.01m ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/1k_nodes-4 666.7µ 688.3µ ~ 0.100
ProcessOwnBlockSubtreeNodesSequential/10k_nodes-4 6.244m 6.226m ~ 0.400
ProcessOwnBlockSubtreeNodesSequential/100k_nodes-4 39.71m 39.36m ~ 0.200
CalcBlockWork-4 498.2n 499.3n ~ 0.400
CalculateWork-4 676.5n 682.3n ~ 0.400
BuildBlockLocatorString_Helpers/Size_10-4 1.294µ 1.300µ ~ 0.300
BuildBlockLocatorString_Helpers/Size_100-4 12.35µ 12.50µ ~ 0.100
BuildBlockLocatorString_Helpers/Size_1000-4 154.2µ 146.4µ ~ 1.000
CatchupWithHeaderCache-4 104.2m 104.3m ~ 1.000
_BufferPoolAllocation/16KB-4 3.483µ 5.074µ ~ 0.100
_BufferPoolAllocation/32KB-4 9.000µ 7.754µ ~ 0.400
_BufferPoolAllocation/64KB-4 15.39µ 15.60µ ~ 1.000
_BufferPoolAllocation/128KB-4 32.33µ 27.49µ ~ 0.100
_BufferPoolAllocation/512KB-4 111.6µ 104.7µ ~ 0.100
_BufferPoolConcurrent/32KB-4 20.35µ 18.13µ ~ 0.100
_BufferPoolConcurrent/64KB-4 31.80µ 27.02µ ~ 0.100
_BufferPoolConcurrent/512KB-4 149.1µ 145.7µ ~ 1.000
_SubtreeDeserializationWithBufferSizes/16KB-4 604.4µ 622.5µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/32KB-4 602.9µ 625.1µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/64KB-4 600.4µ 619.0µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/128KB-4 600.7µ 622.2µ ~ 0.100
_SubtreeDeserializationWithBufferSizes/512KB-4 620.3µ 635.8µ ~ 0.100
_SubtreeDataDeserializationWithBufferSizes/16KB-4 35.56m 35.96m ~ 0.700
_SubtreeDataDeserializationWithBufferSizes/32KB-4 35.47m 36.07m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/64KB-4 35.41m 35.80m ~ 0.400
_SubtreeDataDeserializationWithBufferSizes/128KB-4 35.61m 35.54m ~ 1.000
_SubtreeDataDeserializationWithBufferSizes/512KB-4 35.08m 35.26m ~ 0.400
_PooledVsNonPooled/Pooled-4 742.2n 740.9n ~ 0.700
_PooledVsNonPooled/NonPooled-4 7.429µ 7.288µ ~ 0.100
_MemoryFootprint/Current_512KB_32concurrent-4 6.720µ 6.734µ ~ 0.400
_MemoryFootprint/Proposed_32KB_32concurrent-4 9.827µ 9.858µ ~ 0.200
_MemoryFootprint/Alternative_64KB_32concurrent-4 9.763µ 10.042µ ~ 0.200
_prepareTxsPerLevel-4 415.1m 407.0m ~ 0.200
_prepareTxsPerLevelOrdered-4 3.611m 3.458m ~ 0.700
_prepareTxsPerLevel_Comparison/Original-4 410.6m 418.6m ~ 0.700
_prepareTxsPerLevel_Comparison/Optimized-4 3.518m 3.597m ~ 0.400
SubtreeSizes/10k_tx_4_per_subtree-4 1.390m 1.400m ~ 1.000
SubtreeSizes/10k_tx_16_per_subtree-4 337.6µ 322.6µ ~ 0.200
SubtreeSizes/10k_tx_64_per_subtree-4 78.34µ 77.46µ ~ 0.100
SubtreeSizes/10k_tx_256_per_subtree-4 19.58µ 19.34µ ~ 0.700
SubtreeSizes/10k_tx_512_per_subtree-4 9.884µ 9.689µ ~ 0.100
SubtreeSizes/10k_tx_1024_per_subtree-4 4.901µ 4.816µ ~ 0.100
SubtreeSizes/10k_tx_2k_per_subtree-4 2.477µ 2.410µ ~ 0.100
BlockSizeScaling/10k_tx_64_per_subtree-4 77.90µ 75.86µ ~ 0.100
BlockSizeScaling/10k_tx_256_per_subtree-4 19.62µ 19.13µ ~ 0.100
BlockSizeScaling/10k_tx_1024_per_subtree-4 4.899µ 4.843µ ~ 0.100
BlockSizeScaling/50k_tx_64_per_subtree-4 412.0µ 399.2µ ~ 0.200
BlockSizeScaling/50k_tx_256_per_subtree-4 97.63µ 96.53µ ~ 0.200
BlockSizeScaling/50k_tx_1024_per_subtree-4 24.41µ 23.76µ ~ 0.100
SubtreeAllocations/small_subtrees_exists_check-4 167.6µ 161.7µ ~ 0.100
SubtreeAllocations/small_subtrees_data_fetch-4 168.6µ 172.7µ ~ 0.100
SubtreeAllocations/small_subtrees_full_validation-4 341.7µ 333.9µ ~ 0.100
SubtreeAllocations/medium_subtrees_exists_check-4 10.009µ 9.737µ ~ 0.100
SubtreeAllocations/medium_subtrees_data_fetch-4 10.03µ 10.14µ ~ 0.100
SubtreeAllocations/medium_subtrees_full_validation-4 19.95µ 19.42µ ~ 0.100
SubtreeAllocations/large_subtrees_exists_check-4 2.405µ 2.314µ ~ 0.100
SubtreeAllocations/large_subtrees_data_fetch-4 2.453µ 2.501µ ~ 0.100
SubtreeAllocations/large_subtrees_full_validation-4 5.032µ 4.916µ ~ 0.200
StoreBlock_Sequential/BelowCSVHeight-4 318.9µ 306.2µ ~ 0.100
StoreBlock_Sequential/AboveCSVHeight-4 318.6µ 315.2µ ~ 0.700
GetUtxoHashes-4 274.4n 280.1n ~ 0.400
GetUtxoHashes_ManyOutputs-4 46.99µ 49.11µ ~ 0.100
_NewMetaDataFromBytes-4 231.7n 231.6n ~ 1.000
_Bytes-4 606.0n 611.1n ~ 0.700
_MetaBytes-4 565.9n 571.3n ~ 1.000

Threshold: >10% with p < 0.05 | Generated: 2026-05-13 13:43 UTC

@oskarszoon oskarszoon merged commit c852197 into bsv-blockchain:main May 13, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants