fix(ci): supply-chain scanner uses asymmetric two-dot diff, false-positives on stale-base PRs#23592
Closed
li0near wants to merge 1 commit into
Closed
Conversation
The content scan and the two file-name regex checks all use `git diff "$BASE".."$HEAD"` (two dots, asymmetric). When a PR's base SHA falls behind upstream main between push and CI run, GitHub re-resolves `pull_request.base.sha` to current upstream HEAD — the asymmetric diff then includes every commit upstream landed in between, and the install-hook regex matches files like `hermes_cli/setup.py` that upstream added, not the PR. Switch all three uses to three-dot (merge-base..HEAD) so we only see what THIS PR contributed. This is the standard "PR-only diff" form and what GitHub itself uses for the diff view in the PR UI. Verified against PR NousResearch#21929 (head 0f816e8, base re-resolved to a63a2b7 at CI trigger time): two-dot: 71 files diffed, 1 install-hook match (false positive) three-dot: 2 files diffed, 0 install-hook matches (correct) Threat-model coverage is preserved: - adding a payload in this PR: still on path from merge-base to HEAD, still caught - force-pushing malicious code: head SHA changes, CI re-runs, merge-base recomputed, caught - merging upstream into the branch: merge commits on the branch path, caught
Collaborator
|
Duplicate of #13411 — same fix (two-dot → three-dot diff in supply-chain-audit.yml to prevent false positives from upstream drift). |
Closed
19 tasks
12 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The supply-chain audit workflow scans diffs with
git diff "$BASE".."$HEAD"(asymmetric two-dot) in three places. When a PR's base SHA is older than
upstream main at CI run time — which is the common case, since GitHub
re-resolves
pull_request.base.shato current upstream HEAD on everytrigger — the asymmetric diff includes every commit upstream landed
between the contributor's last push and CI. The install-hook regex then
matches files like
hermes_cli/setup.pythat upstream added, not the PR,and the scan exits 1 as a CRITICAL finding.
Evidence
PR #21929 (a 2-file change to
hermes_constants.py+ its tests) wasrebased onto upstream
d62808c373. By the time CI ran, upstream hadadvanced 48 commits to
a63a2b7c78. The webhook recordedBASE = a63a2b7c,HEAD = 0f816e8a(the rebased branch tip).Reproduced locally against those exact SHAs:
The same shape applies to PR #21747 and #21428 (both currently showing the
false-positive on their CI). Any external contributor whose PR sits open
long enough for upstream to advance will hit it.
Root cause
git diff A..B(two dots) shows changes between the trees at A and B —including any commits A has that B doesn't.
git diff A...B(three dots)shows changes from the merge-base of A and B to B — i.e. only what the
PR's branch path actually introduced. For a security scanner whose stated
job is to flag what the contributor added, three-dot is the correct
semantic. It's also what the GitHub PR UI uses for its diff view.
Fix
Single character on each of the three
git diffinvocations:..→....Added an inline comment explaining the choice so the next reader doesn't
"simplify" it back.
Why three-dot is strictly safer for this scanner
.pth/setup.py/ base64+exec payload in this PR —still on the branch path from merge-base to HEAD. Still caught.
re-runs, merge-base recomputed, new commits caught.
from merge-base to HEAD. Still caught.
There is no scenario where two-dot catches something three-dot misses
under this threat model. Two-dot's only "extra" coverage is upstream
commits the PR doesn't include — the noise this PR removes.
Workflow consistency
The workflow's own header comment notes that low-signal heuristics were
deliberately removed because "they fired on nearly every PR and trained
reviewers to ignore the scanner." The current two-dot bug actively works
against that goal — every external contributor whose PR isn't merged
within minutes hits it.
Test plan
Workflow shell logic isn't easily unit-testable in isolation. Manual
reproduction:
that touches a
setup.py/.pth/ install-hook file.Verified for PR #21929 SHAs (head
0f816e8a, base re-resolved toa63a2b7cat CI trigger time on 2026-05-10).Platforms tested
N/A — workflow runs on
ubuntu-latestonly.Related
Fixes the false-positive currently red on PRs #21428, #21747, #21929 (and
likely most other open external PRs with stale bases).