Skip to content

CI: harden release-check script execution paths#67920

Open
hxy91819 wants to merge 6 commits intomainfrom
codex/actions-matrix-lane-code-injection
Open

CI: harden release-check script execution paths#67920
hxy91819 wants to merge 6 commits intomainfrom
codex/actions-matrix-lane-code-injection

Conversation

@hxy91819
Copy link
Copy Markdown
Member

@hxy91819 hxy91819 commented Apr 17, 2026

Summary

  • This PR now keeps only the remaining script/test hardening after the workflow overlap was absorbed by fix(ci): harden release checks workflow inputs #66884.
  • scripts/openclaw-cross-os-release-checks.ts now derives execution mode from the selected suite, uses resolveNpmRunner for the npm install path, and adds typed command invocation options for the Windows-safe npm execution flow.
  • test/scripts/openclaw-cross-os-release-checks.test.ts now covers resolveExecutionModeForSuite().

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • The original workflow fix overlapped with fix(ci): harden release checks workflow inputs #66884.
  • The remaining independent hardening is in the script layer: keep suite-to-mode derivation inside the script, and avoid the unsafe Windows npm.cmd execution path by reusing resolveNpmRunner.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: test/scripts/openclaw-cross-os-release-checks.test.ts
  • Scenario the test should lock in: resolveExecutionModeForSuite() keeps the expected fresh/upgrade mapping and throws for unknown suites.
  • Existing test that already covers this (if any): the existing runner-matrix test continues to cover the matrix shape.

User-visible / Behavior Changes

None.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) Yes
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: this narrows the command execution surface by deriving mode inside the script and using the repo's Windows-safe npm runner helper instead of shelling through npm.cmd.

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: local repo worktree
  • Model/provider: N/A
  • Integration/channel (if any): GitHub Actions reusable workflow / release-check script
  • Relevant config (redacted): none

Steps

  1. Run pnpm test test/scripts/openclaw-cross-os-release-checks.test.ts.
  2. Inspect scripts/openclaw-cross-os-release-checks.ts for the resolveExecutionModeForSuite() helper and resolveNpmRunner wiring.

Expected

  • resolveExecutionModeForSuite() maps suites to the expected execution mode and rejects unsupported suites.
  • Windows package installation uses resolveNpmRunner rather than relying on the default .cmd shell path.

Actual

  • The script now derives mode from suite when needed.
  • The npm install path now goes through resolveNpmRunner.
  • pnpm test test/scripts/openclaw-cross-os-release-checks.test.ts passed locally.

Human Verification (required)

  • Verified scenarios: narrowed the worktree to remove the workflow overlap, compared the workflow file to current main, and ran pnpm test test/scripts/openclaw-cross-os-release-checks.test.ts.
  • Edge cases checked: unknown suite throws, Windows npm execution path uses the typed resolveNpmRunner invocation.
  • What you did not verify: I did not run a live GitHub Actions workflow from this branch.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: suite-to-mode derivation could drift from release-check suite semantics.
    • Mitigation: keep the mapping in one helper and cover it with the targeted test file.

hxy91819 and others added 2 commits April 17, 2026 10:23
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@openclaw-barnacle openclaw-barnacle Bot added scripts Repository scripts size: XL maintainer Maintainer-authored PR labels Apr 17, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

This PR hardens the release-check script by deriving execution mode from the selected suite via resolveExecutionModeForSuite, routing Windows npm installs through resolveNpmRunner instead of the unsafe .cmd shell path, expanding agent-output OK detection to fall back to the log file, and moving workflow inline template expressions into env: variables to prevent shell-injection. One unmentioned behavior change in isPackagedDistPath (removal of PACKAGED_QA_RUNTIME_PATHS) silently drops dist/extensions/qa-channel/runtime-api.js and dist/extensions/qa-lab/runtime-api.js from the packaged-dist inventory — worth confirming whether this is deliberate.

Confidence Score: 5/5

Safe to merge; the one flagged item is a P2 clarification question about an unmentioned filter change, not a blocking defect.

All core changes (npm runner wiring, suite-to-mode derivation, OK marker fallback, workflow injection hardening) are correct. The only finding is a P2 query about whether the PACKAGED_QA_RUNTIME_PATHS removal was intentional — it does not affect script correctness if those files are no longer shipped.

scripts/openclaw-cross-os-release-checks.ts — confirm the isPackagedDistPath change at lines 508-510 is intentional.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: scripts/openclaw-cross-os-release-checks.ts
Line: 508-510

Comment:
**Unmentioned behavior change removes QA runtime paths from inventory**

`PACKAGED_QA_RUNTIME_PATHS` previously allowed `dist/extensions/qa-channel/runtime-api.js` and `dist/extensions/qa-lab/runtime-api.js` to pass through the QA extension omission filter and be included in the packaged-dist inventory snapshot. Now every path under `OMITTED_QA_EXTENSION_PREFIXES` unconditionally returns `false`, silently dropping those two files from the inventory. The PR description doesn't mention this as an intentional change — was this a deliberate cleanup (confirming those files are no longer shipped), or did it slip in unintentionally?

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "CI: narrow release-check hardening diff" | Re-trigger Greptile

Comment thread scripts/openclaw-cross-os-release-checks.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f085306031

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread skills/mason-workflow/scripts/bootstrap_mason_worktree.py Outdated
hxy91819 and others added 2 commits April 17, 2026 10:32
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot Bot commented Apr 17, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium Integrity check bypass: OK marker accepted via regex scraping of untrusted stdout/log content
1. 🟡 Integrity check bypass: OK marker accepted via regex scraping of untrusted stdout/log content
Property Value
Severity Medium
CWE CWE-345
Location scripts/openclaw-cross-os-release-checks.ts:2445-2490

Description

The cross-OS release checks validate agent behavior by looking for an OK marker in the agent output. The new implementation broadens where OK can be sourced from and introduces a regex-based fallback that can be spoofed.

Key issues:

  • Non-JSON output is now accepted: when JSON.parse(stdout) fails, parseAgentPayloadTexts() scrapes any stdout text with a regex and extracts the value of any "finalAssistantVisibleText", "finalAssistantRawText", or "text" fields.
  • Marker can come from unrelated log lines: agentOutputHasExpectedOkMarker() falls back to parsing the entire captured log file when stdout lacks the marker.
  • Spoofing is trivial: a compromised/malicious agent (or any component that can write to stdout/stderr or the log) can emit a substring such as "text":"OK" inside error messages, stack traces, or arbitrary logging without producing the expected structured JSON payload, causing the check to pass.

Vulnerable code:

// fallback when stdout is not JSON
...stdout.matchAll(/"(?:finalAssistantVisibleText|finalAssistantRawText|text)"\s*:\s*"([^"]*)"/gu)// accepts OK from stdout or from reading logPath
const logTexts = parseAgentPayloadTexts(readFileSync(options.logPath, "utf8"));
return logTexts.some((text) => text.trim() === "OK");

Recommendation

Treat the OK marker as a strongly-typed, structured signal and only accept it from a single canonical location in a successfully parsed JSON payload.

Suggested fix:

  1. Remove regex scraping fallback entirely.
  2. Do not parse the full log file for the marker; only validate the child process stdout captured for that command invocation (or require a separate machine-readable side-channel, e.g., an exit code or a dedicated JSON line).
  3. Validate the JSON schema and require exact output.

Example hardened approach:

function agentOutputHasExpectedOkMarker(stdout: string): boolean {
  let payload: any;
  try {
    payload = JSON.parse(stdout);
  } catch {
    return false; // no fallbacks
  }// Require a canonical field and exact match.
  const text = payload?.finalAssistantVisibleText;
  return typeof text === "string" && text.trim() === "OK";
}

If log-based recovery is required for platform reasons, write a dedicated JSON artifact (single-purpose file) that is not mixed with other logs and validate its full JSON content and schema before accepting OK.


Analyzed PR: #67920 at commit 4f96fb0

Last updated on: 2026-04-25T14:03:14Z

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@visionik visionik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Overlap with #66884 means only part of this PR is still needed once #66884 merges (which is imminent). Could you:

  1. Rebase on latest main after #66884 lands, and
  2. Drop the .github/workflows/openclaw-cross-os-release-checks-reusable.yml changes from this PR — they will already be resolved by #66884 using the same env-var-substitution pattern, so keeping them here would create a conflict.

What remains then is your scripts/openclaw-cross-os-release-checks.ts refactor (the resolveExecutionModeForSuite helper, the typed CommandInvocation / RunCommandOptions, and the resolveNpmRunner wiring in installPackageSpec) plus the matching tests. That is genuinely separate work and useful on its own — happy to review once narrowed. See #68428 for the full template-injection tracking context.

@hxy91819
Copy link
Copy Markdown
Member Author

@visionik Thanks, makes sense.

I checked #66884 and it is still open right now, so I'll wait for that PR to land first.

Once #66884 merges, I'll rebase this branch onto the latest main and drop the .github/workflows/openclaw-cross-os-release-checks-reusable.yml diff from this PR so it is narrowed to the scripts/openclaw-cross-os-release-checks.ts changes and the matching tests only.

@hxy91819 hxy91819 marked this pull request as draft April 24, 2026 15:05
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@hxy91819 hxy91819 changed the title CI: fix matrix lane shell interpolation in release checks CI: harden release-check script execution paths Apr 25, 2026
@hxy91819
Copy link
Copy Markdown
Member Author

@visionik #66884 is merged now, and I narrowed this branch accordingly.

The workflow file in this PR now matches current main, so the remaining diff is only the script/test hardening:

  • resolveExecutionModeForSuite()
  • typed CommandInvocation / RunCommandOptions
  • resolveNpmRunner wiring in installPackageSpec()
  • matching test coverage

@hxy91819 hxy91819 requested a review from visionik April 25, 2026 14:00
@hxy91819 hxy91819 marked this pull request as ready for review April 25, 2026 14:01
@hxy91819
Copy link
Copy Markdown
Member Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f96fb09f7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +2453 to +2455
try {
const logTexts = parseAgentPayloadTexts(readFileSync(options.logPath, "utf8"));
return logTexts.some((text) => text.trim() === "OK");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict OK fallback check to current command output

This fallback reads and scans the entire log file, so it can return a false green if that file already contains an older OK marker. Because command logs are opened in append mode and the output directory is not cleared before each run, reused runner temp paths (common on self-hosted runners) can leave stale *-agent.log content; a current failed/empty agent response can still pass this check by matching a prior run's OK.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintainer Maintainer-authored PR scripts Repository scripts size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants