Skip to content

ci(supply-chain): use merge-base diff (A...B) so upstream drift doesn't false-positive#13411

Closed
briandevans wants to merge 1 commit into
NousResearch:mainfrom
briandevans:fix/supply-chain-audit-use-merge-base-diff
Closed

ci(supply-chain): use merge-base diff (A...B) so upstream drift doesn't false-positive#13411
briandevans wants to merge 1 commit into
NousResearch:mainfrom
briandevans:fix/supply-chain-audit-use-merge-base-diff

Conversation

@briandevans

@briandevans briandevans commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

The supply-chain scan has been failing on every fork PR since 19db7fa3 (Apr 19) — not because of real attack patterns, but because the diff computation includes all of main's upstream drift since the PR branch was forked.

BASE..HEAD (two-dot) shows the full diff between the two trees in both directions. When main moves forward between when the PR was opened and when CI runs, files changed upstream appear in the scan as though the PR had touched them. Concrete measurement on one of my open fork PRs: two-dot returns 551 files; three-dot returns 2 — the PR's actual changes. The install-hook regex then matches hermes_cli/setup.py (modified upstream) and the workflow fails.

Why this only just surfaced: before 19db7fa3 the fail condition was critical == 'true'. WARNING-tier checks bumped found=true but not critical=true, so main-drift false positives surfaced only as a PR comment. 19db7fa3 simplified the fail to found == 'true', unmasking the pre-existing BASE..HEAD bug.

The fix: three-character edit — change "$BASE".."$HEAD" to "$BASE"..."$HEAD" in 3 places. A...B = diff(merge-base(A, B), B), which matches what GitHub's "Files changed" tab shows.

Related Issue

Fixes #13411 / surfaces from the failing supply-chain scan on every open fork PR.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • .github/workflows/supply-chain-audit.yml — change git diff "$BASE".."$HEAD" to "$BASE"..."$HEAD" in the DIFF=, PTH_FILES=, and SETUP_HITS= lines. The other two scan checks (B64_EXEC_HITS, PROC_HITS) build on the already-corrected $DIFF variable so they inherit the fix without their own edit. Detection regexes unchanged. Warning-vs-critical split unchanged. Fork-PR comment fallback unchanged. BASE/HEAD env derivation unchanged.

How to Test

  1. Re-run supply-chain-audit.yml on any currently-red fork PR — scan now reports zero matches (e.g. fix(mcp): auto-reconnect + retry once when the transport session expires (#13383) #13406, fix(telegram): log document/video send failures instead of printing to stdout (#13356) #13387, test(interrupt-propagation): seed provider attr on bare agent to match __init__ #12908, fix(slack): key thread-context cache by team_id to prevent cross-workspace leak (#12421) #12889 all match nothing under A...B).
  2. Replay the four scan regexes against "$BASE"..."$HEAD" on a branch with synthetic .pth addition — matches as expected (no detection regression).
  3. Behaviour matrix (before → after):
    • PR adds real .pth file: fail → fail (correct)
    • PR adds setup.py: fail → fail (correct)
    • PR adds base64.b64decode() + exec(): fail → fail (correct)
    • PR is 2-file fix, main drifted, upstream touched hermes_cli/setup.py: FAIL (false positive) → pass
    • Arbitrary unrelated change: pass → pass

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run focused tests for the touched code and all pass — N/A, workflow shell; correctness verifiable from git diff documented semantics
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features) — N/A; no existing harness for the workflow itself
  • I've tested on my platform: macOS 15.x (replayed against open fork PRs)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A, behavior of the workflow narrows false-positives only
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — runs on ubuntu-latest
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Related / Positioning

  • Same issue was briefly raised on other GitHub Actions that compute PR diffs across the ecosystem; the standard fix is three-dot. Git docs: git diff A..B vs git diff A...Bhttps://git-scm.com/docs/gitrevisions#_dotted_range_notations
  • Pre-empted review questions: A...B does not skip attack-shaped commits introduced on the PR branch; it only excludes main's own commits, which are main's responsibility. Behaviour is identical to A..B when the PR branch is up-to-date with main (merge-base equals base).

Audited siblings: confirmed the other two scan checks (B64_EXEC_HITS, PROC_HITS) consume $DIFF and inherit the three-dot fix without separate edits. No widening needed.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the supply-chain audit workflow’s diff computation to use merge-base (A...B) semantics so fork PR scans only evaluate changes introduced by the PR branch, avoiding false positives caused by upstream drift on main.

Changes:

  • Switch git diff "$BASE".."$HEAD" to git diff "$BASE"..."$HEAD" for the main diff used by pattern scanners.
  • Switch .pth and install-hook filename detection to also use the merge-base diff range.
  • Preserve existing detection regexes and scan/fail behavior (critical-only).
Comments suppressed due to low confidence (2)

.github/workflows/supply-chain-audit.yml:60

  • git diff --name-only will list files that were deleted as well as added/modified. Currently this means a PR that removes a .pth file would still trigger the “added or modified” finding. Consider either filtering the diff to additions/modifications only (e.g., via a diff-filter) or adjusting the finding text to explicitly include deletions.
          PTH_FILES=$(git diff --name-only "$BASE"..."$HEAD" | grep '\.pth$' || true)
          if [ -n "$PTH_FILES" ]; then
            FINDINGS="${FINDINGS}
          ### 🚨 CRITICAL: .pth file added or modified
          Python \`.pth\` files in \`site-packages/\` execute automatically when the interpreter starts — no import required.

.github/workflows/supply-chain-audit.yml:103

  • Same as the .pth check: git diff --name-only will include deleted files. If a PR deletes setup.py/setup.cfg/sitecustomize.py/etc., this will still trigger the “added or modified” critical finding. Either filter out deletions or update the finding wording so it matches what’s being detected.
          SETUP_HITS=$(git diff --name-only "$BASE"..."$HEAD" | grep -E '(^|/)(setup\.py|setup\.cfg|sitecustomize\.py|usercustomize\.py|__init__\.pth)$' || true)
          if [ -n "$SETUP_HITS" ]; then
            FINDINGS="${FINDINGS}
          ### 🚨 CRITICAL: Install-hook file added or modified
          These files can execute code during package installation or interpreter startup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@briandevans

Copy link
Copy Markdown
Contributor Author

Thanks for the review @copilot-pull-request-reviewer — the main overview verdict is clean, which is the important signal.

For the low-confidence / suppressed comment about git diff --name-only listing deletions: good observation, and you're right that a PR removing a .pth file would still trigger the "added or modified" finding. But that's a pre-existing behavior (the same was true with A..B before this PR) — adding a --diff-filter=AM to skip deletions is a separate semantic decision about whether .pth removals should also be flagged (arguably yes, because an attacker could replace sitecustomize.py with a .pth and the removal of the old .pth would be a signal). Out of scope for this PR, which is purely about the A..BA...B false-positive fix.

Happy to file a follow-up if maintainers want stricter diff-filter semantics.

@briandevans

Copy link
Copy Markdown
Contributor Author

Rebased onto current origin/main (commit f49de849e) — the change is now a clean 3-line surgical edit against the active workflow.

Fresh evidence this still bites: PR #24582 just got its Scan PR for critical supply chain risks step set to FAILURE on the 2026-05-15 19:14Z re-run, even though the PR only touches plugins/platforms/google_chat/adapter.py and tests/gateway/test_google_chat.py. Reproducing the scan locally with the workflow's exact BASE..HEAD query:

$ git diff --name-only "$BASE".."$HEAD" | grep -E '(^|/)(setup\.py|setup\.cfg|sitecustomize\.py|usercustomize\.py|__init__\.pth)$'
hermes_cli/setup.py

The match is upstream's edit to hermes_cli/setup.py (touched ~20 times in the last 14 days), not the PR's diff. With this PR's BASE...HEAD form the same query returns empty — i.e. the only true-positive class for this PR is 0, which is what the scanner should report. Same fix as @li0near's #23592 (alt-glitch flagged it as a duplicate of this one).

…'t false-positive

GitHub re-resolves `pull_request.base.sha` to the current upstream HEAD
at workflow-trigger time, not at branch-off time. The supply-chain
scanner's `git diff "$BASE".."$HEAD"` (two-dot) therefore includes
every commit upstream merged after the PR contributor's last push.

Concrete failure mode the scanner is currently hitting: if upstream
modifies `hermes_cli/setup.py` (which happens often — ~20 commits in
the last 14 days alone), every open PR re-scanned after that point
will be flagged with `### 🚨 CRITICAL: Install-hook file added or
modified`, even though the PR itself doesn't touch `setup.py`. The
"Fail on critical findings" step then blocks the PR until the next
scanner run.

Switching to three-dot (`A...B`, which git rewrites to
`git diff $(merge-base A B)..B`) restricts the diff to changes the
PR branch actually introduced. The two file-list scans (`.pth` and
install-hook) and the line-level DIFF all get the same fix, so
B64_EXEC_HITS / PROC_HITS see consistent input.

Evidence on a live PR (NousResearch#24582):
  2-dot: hermes_cli/setup.py  (false positive — touched by upstream)
  3-dot: (empty)              (correct — PR doesn't touch any install-hook)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/config Config system, migrations, profiles P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants