Skip to content

ci: move SonarCloud branch scans out of the merge queue, serialize per branch#21669

Merged
taratorio merged 2 commits into
mainfrom
taratorio/sonar-serialize-merge-queue-scans
Jun 8, 2026
Merged

ci: move SonarCloud branch scans out of the merge queue, serialize per branch#21669
taratorio merged 2 commits into
mainfrom
taratorio/sonar-serialize-merge-queue-scans

Conversation

@taratorio

Copy link
Copy Markdown
Member

Problem

SonarCloud rejects a branch analysis dated older than the branch's latest processed one. Every merge-queue entry analyzes the target branch, so when queue entry N's report uploads after entry N+1's, SonarCloud rejects N's with "Date of analysis cannot be older than the date of the last known analysis" and flips that commit's SonarCloud Code Analysis check to cancelled.

Seen on 5027d259c2: three stacked queue entries (#21648, #21584, #21640) scanned concurrently; #21584 started 6s later but uploaded 21s earlier, so #21648's analysis was rejected as out of order. No impact on the gate (the scan step is continue-on-error and the merge succeeded), but it leaves a spurious failed check on a merged commit.

Fix

Make it structurally impossible to submit branch analyses out of order:

  • merge_group runs keep make test-sonar-coverage as a gate check but no longer scan. They upload coverage-test-all.out as a 7-day artifact.
  • A new SonarCloud Branch Scan workflow (push on main + release/**, matching the branches SonarCloud tracks) downloads that artifact and runs only the scanner (~7 min vs ~25). The merge queue fast-forwards the target branch to the already-tested merge commit, so the pushed SHA equals the merge-queue run's head SHA and the artifact maps to the commit exactly. If no artifact exists (direct push, or expired retention), it falls back to running the tests itself.
  • Branch scans are serialized per branch FIFO via concurrency.queue: max (GitHub Actions, 2026-05), so every merged commit is analyzed, in order, with no runs cancelled during merge bursts.
  • PR scans (Sonar PR analyses — separate from main-branch date ordering) and cache-warming runs are unchanged.

A failed analysis upload no longer touches the gate at all: scanner failures now surface on a post-merge run, while PR runs keep exercising the scanner so config/scanner breakage is still caught before merge.

Notes

  • actionlint (≤ v1.7.12, the latest release) does not know the queue concurrency key yet, so lint.yml gets a narrow --ignore alongside the existing workspace one. Removable once actionlint learns the key.
  • concurrency.queue: max is ~1 month old — first real merge-burst after this lands is where FIFO ordering gets exercised live.

Testing

  • make lint, actionlint (CI flags), and zizmor (repo config) all clean; the new workflow is finding-free.
  • The artifact-lookup script was shellchecked and run against live API data: correctly resolves the merge-queue run by head SHA, filters expired artifacts, and degrades to the test-running fallback on a missing artifact or total API failure (never fails the job).

…r branch

SonarCloud rejects a branch analysis dated older than the branch's
latest processed one. All merge-queue entries analyze the target branch,
so when entry N's report uploaded after entry N+1's, SonarCloud rejected
it and flipped the commit's SonarCloud check to cancelled ("Date of
analysis cannot be older than the date of the last known analysis").
Seen on 5027d25, where the scans of three stacked queue entries
(#21648, #21584, #21640) raced and #21648's analysis lost.

Restructure so branch analyses can never be submitted out of order:

- merge_group runs keep `make test-sonar-coverage` as a gate check but
  no longer scan; they upload coverage-test-all.out as a short-retention
  artifact instead.
- A new push-triggered SonarCloud Branch Scan workflow (main and
  release/**, matching the branches SonarCloud tracks) downloads that
  artifact — the merge queue fast-forwards the target branch to the
  already-tested merge commit, so the pushed SHA equals the merge-queue
  run's head SHA and the artifact maps to the commit exactly — and runs
  only the scanner (~7 min instead of ~25). If no artifact exists
  (direct push, expired retention), it falls back to running the tests.
- Scans are serialized per branch FIFO via `concurrency.queue: max`
  (GitHub Actions, 2026-05), so every merged commit is analyzed, in
  order, with no runs cancelled in merge bursts.
- PR scans (Sonar PR analyses, no main-branch date ordering) and
  cache-warming runs are unchanged.

actionlint (≤ v1.7.12) does not know the `queue` key yet, so lint.yml
gains a narrow --ignore alongside the existing workspace one.

A failed analysis upload also no longer involves the gate: scanner
failures now surface on a post-merge run instead, while PR runs keep
exercising the scanner so breakage is still caught before merge.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restructures SonarCloud branch analysis so merge-queue runs generate and upload coverage data but do not submit branch scans, while a dedicated post-merge workflow performs the branch scan in a per-branch FIFO-serialized manner to prevent SonarCloud’s “out-of-order analysis date” rejections.

Changes:

  • Add a scan-only mode to the reusable Sonar workflow to reuse a merge-queue coverage artifact (with a fallback to running coverage tests if missing).
  • Skip SonarCloud scanning during merge_group runs and instead upload coverage-test-all.out as a short-retention artifact.
  • Introduce a SonarCloud Branch Scan workflow triggered on pushes to main/release/**, serialized via concurrency.queue: max, and adjust actionlint ignores accordingly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
.github/workflows/sonar.yml Adds scan-only artifact reuse logic; skips scanning on merge-queue runs; uploads coverage artifact for post-merge scans.
.github/workflows/sonar-branch-scan.yml New push-triggered workflow to run branch scans serially per branch using the reusable sonar workflow.
.github/workflows/lint.yml Extends actionlint ignores to tolerate the new concurrency.queue key.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +63 to +74
{
echo "found=$found"
echo "run-id=$run_id"
} >> "$GITHUB_OUTPUT"

- name: Download merge-queue coverage artifact
if: inputs.scan-only && steps.coverage-artifact.outputs.found == 'true'
uses: actions/download-artifact@v8
with:
name: sonar-coverage
run-id: ${{ steps.coverage-artifact.outputs.run-id }}
github-token: ${{ github.token }}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a false positive — GitHub Actions' expression parser allows hyphens in property names accessed via dot notation. The proof is in this same file: the pre-existing, CI-proven steps.engine-cache.outputs.cache-hit != 'true' (a hyphenated step id and a hyphenated output), and the canonical actions/cache steps.<id>.outputs.cache-hit pattern used by millions of workflows. So outputs.run-id resolved correctly.

That said, I've renamed the output to run_id (cbe966a) — underscore is more idiomatic for user-defined outputs and removes any tooling/reader doubt.

@taratorio

Copy link
Copy Markdown
Member Author

@claude review

@claude

claude Bot commented Jun 8, 2026

Copy link
Copy Markdown

Claude finished @taratorio's task in 2m 18s —— View job


Code Review

  • Gather context and understand the PR
  • Read changed files
  • Analyze logic and correctness
  • Post review feedback

The design is sound. Moving branch scans out of the merge queue and serializing them via queue: max is the right fix for the out-of-order SonarCloud rejection problem. The fallback path (run tests if no artifact) makes the workflow resilient to direct pushes and expired artifacts. A few findings:


sonar.yml — The run-id output key

Copilot flagged steps.coverage-artifact.outputs.run-id as potentially being parsed as subtraction (run - id). This concern is a false positive: GitHub Actions' expression parser treats dot-separated segments as property names and does allow hyphens — step IDs like coverage-artifact are themselves hyphenated and work fine in dot notation. The issue at line 73 resolves correctly.

That said, using run_id instead of run-id is more conventional for GitHub Actions outputs (underscores are the idiomatic choice) and eliminates any tooling confusion. Worth changing for clarity.


actions/upload-artifact@v7 vs actions/download-artifact@v8 mismatch

.github/workflows/sonar.yml line 70 uses actions/download-artifact@v8 but line 87 uses actions/upload-artifact@v7. Upload and download need to agree on the artifact format. A major version mismatch here could silently corrupt the artifact download. Verify that v7 upload → v8 download is a tested/documented compatible path, or pin both to the same version.

Fix this →


sonar.ymlif-no-files-found: error on the coverage upload

if-no-files-found: error

This is the right choice: if make test-sonar-coverage somehow produces no output file, the merge-queue run should fail hard rather than silently uploading nothing. The subsequent branch scan would then fall back to running tests, which is the correct degradation path. ✓


Key assumption: merge queue fast-forward

The artifact lookup pivots on GITHUB_SHA from the push event equalling the merge-queue run's head SHA. This holds when GitHub's merge queue fast-forwards the target branch (which it does by default for this repo). If a future config change introduces merge commits, the SHA lookup will miss and the workflow degrades gracefully to running tests — acceptable. Worth a comment in the PR description or the workflow if not already there.


Shell script robustness in Locate merge-queue coverage artifact

run_id=$(...) || run_id=""
count=$(...) || count=0
[ "${count:-0}" -gt 0 ]

The error handling is solid: API failures default to safe values and never fail the job. The 2>/dev/null suppression plus || fallback pattern correctly distinguishes "API failed" from "API returned empty result". ✓


sonar-branch-scan.yml permissions

actions: read is correctly added to allow cross-run artifact download via download-artifact's run-id input. ✓


lint.yml — actionlint --ignore for queue key

The added ignore pattern 'unexpected key "queue" for "concurrency" section' is appropriately scoped and documented with a comment indicating it's removable once actionlint learns the key. ✓


Overall: One concrete issue to address (upload/download artifact version mismatch), one style preference (run-idrun_id), rest looks correct.

Underscore is the idiomatic form for user-defined step outputs. No
behavior change (hyphenated outputs work with dot notation, as the
pre-existing steps.engine-cache.outputs.cache-hit reference shows); the
rename just preempts reviewer/tooling confusion.
@taratorio

Copy link
Copy Markdown
Member Author

Addressed the review feedback (thanks @claude / Copilot):

1. run-idrun_id output (both reviews). Renamed in cbe966a. The correctness concern Copilot raised was a false positive — hyphenated outputs work with dot notation (see the pre-existing steps.engine-cache.outputs.cache-hit in this same file, and the canonical actions/cache pattern) — but underscore is the idiomatic form for user-defined outputs, so the rename is a clarity win and preempts repeat flags.

2. upload-artifact@v7 vs download-artifact@v8 "mismatch". Not a mismatch — these are each action's respective latest major (upload-artifact has no v8; download-artifact has no v9), and they interoperate on the shared post-v4 artifact backend; the divergent numbers are independent release cadences. They're also the repo's standard (63× upload@v7, 12× download@v8) and are already round-tripped together in CI by release.yml, test-kurtosis-assertoor.yml, and backups-dashboards.yml. Left as-is to match repo convention.

3. Merge-queue fast-forward assumption. Already documented — see the comment at the Locate merge-queue coverage artifact step ("The merge queue fast-forwards the target branch … so the pushed SHA equals a merge-queue run's head SHA") and the PR description. If a future config introduces merge commits, the SHA lookup misses and the workflow degrades to running tests, which is the intended fallback.

make lint + actionlint clean after the change.

@claude

claude Bot commented Jun 8, 2026

Copy link
Copy Markdown

Claude finished @taratorio's task in 0s —— View job


I'll analyze this and get back to you.

@taratorio taratorio requested a review from yperbasis June 8, 2026 05:33
@taratorio taratorio added this pull request to the merge queue Jun 8, 2026
@taratorio taratorio removed this pull request from the merge queue due to a manual request Jun 8, 2026
@taratorio taratorio added this pull request to the merge queue Jun 8, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 8, 2026
@taratorio taratorio added this pull request to the merge queue Jun 8, 2026
@yperbasis yperbasis added the QA label Jun 8, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 8, 2026
@taratorio taratorio added this pull request to the merge queue Jun 8, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 8, 2026
@taratorio taratorio added this pull request to the merge queue Jun 8, 2026
Merged via the queue into main with commit 32e1afe Jun 8, 2026
92 checks passed
@taratorio taratorio deleted the taratorio/sonar-serialize-merge-queue-scans branch June 8, 2026 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants