feat: ci send nightly-e2e scorecard to slack#4308
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughUpdates the nightly-e2e workflow to emit structured scorecard JSON, adds a Slack Block Kit builder module and tests, and posts rich Slack messages on schedule or for manual dispatch when ChangesNightly scorecard Slack Block Kit integration
Sequence Diagram(s)sequenceDiagram
participant Workflow as nightly-e2e
participant ScoreJob as scorecard job
participant Builder as build-slack-blocks.js
participant SlackAPI as Slack Webhook
Workflow->>ScoreJob: trigger (schedule or dispatch)
ScoreJob->>ScoreJob: generate scorecard JSON (scorecardData)
ScoreJob->>Builder: require() and pass SCORECARD_DATA
Builder->>ScoreJob: return blocks + fallback + color + channel
ScoreJob->>SlackAPI: POST block kit JSON payload
SlackAPI-->>ScoreJob: 200 OK
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
E2E Advisor RecommendationRequired E2E: None Full advisor summaryE2E Recommendation AdvisorBase: Required E2E
Optional E2E
New E2E recommendations
|
E2E Scenario Advisor RecommendationRequired scenario E2E: None Full scenario advisor summaryE2E Scenario AdvisorBase: Required scenario E2E
Optional scenario E2E
Relevant changed files
|
PR Review AdvisorFindings: 3 needs attention, 11 worth checking, 1 nice ideas Review findings🛠️ Needs attention
🔎 Worth checking
🌱 Nice ideas
Since last review detailsCurrent findings:
This is an automated advisory review. A human maintainer must make the final merge decision. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 2343-2345: The inline comment above the SLACK_WEBHOOK_URL
environment line is stale and misleading; update or remove it so it accurately
reflects current configuration: either delete the three-line promotion note or
reword it to state that the step already uses secrets.SLACK_WEBHOOK_URL
(referring to the SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }} line) to
avoid operator confusion.
- Around line 2159-2164: Add with: persist-credentials: false to the "Checkout
scorecard builder" step (the step using
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd) so the checkout does
not persist Git credentials; keep the existing sparse-checkout and
sparse-checkout-cone-mode keys and only add the persist-credentials: false entry
under with to make the checkout read-only and reduce token exposure.
In `@test/scorecard-blocks.test.ts`:
- Around line 6-11: Replace the top-level direct require with Node's
createRequire pattern: import { createRequire } from "module" at the top of the
test, call const requireFrom = createRequire(import.meta.url), then load the
CommonJS module with const { buildBlocks, buildFallbackText, getStatusColor } =
requireFrom("../scripts/scorecard/build-slack-blocks.js"); and remove the
eslint-disable comment and the direct require(...) call so the test uses
createRequire(import.meta.url) to import those symbols.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4389e4fb-0835-48ca-9d73-27c499342578
📒 Files selected for processing (3)
.github/workflows/nightly-e2e.yamlscripts/scorecard/build-slack-blocks.jstest/scorecard-blocks.test.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26507252539
|
Selective E2E Results — ✅ All requested jobs passedRun: 26566729263
|
Selective E2E Results — ✅ All requested jobs passedRun: 26566787955
|
Selective E2E Results — ✅ All requested jobs passedRun: 26567721732
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.github/workflows/nightly-e2e.yaml (1)
2415-2424: 💤 Low valueConsider wrapping fetch in try/catch for network resilience.
If
fetchthrows due to a network error (timeout, DNS failure), the step fails and marks the job as failed—even though the core scorecard summary was already written. Adding a try/catch ensures transient network issues don't obscure the actual E2E results.Suggested change
- const resp = await fetch(webhookUrl, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify(payload), - }); - - if (!resp.ok) { - core.warning(`Slack webhook (${channel}) returned ${resp.status}: ${await resp.text()}`); - } else { - core.info(`Scorecard posted to Slack (${channel})`); - } + try { + const resp = await fetch(webhookUrl, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(payload), + }); + + if (!resp.ok) { + core.warning(`Slack webhook (${channel}) returned ${resp.status}: ${await resp.text()}`); + } else { + core.info(`Scorecard posted to Slack (${channel})`); + } + } catch (err) { + core.warning(`Slack webhook (${channel}) failed: ${err.message}`); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/nightly-e2e.yaml around lines 2415 - 2424, The fetch to webhookUrl that posts payload (resp = await fetch(...)) should be wrapped in a try/catch to prevent network errors from failing the job; catch any error from fetch, call core.warning with a clear message including channel and the error.message (and optionally stack), and then continue so the step does not hide the actual E2E results; keep the existing resp.ok check inside the try block and on error fall back to core.warning instead of letting the exception bubble.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 2415-2424: The fetch to webhookUrl that posts payload (resp =
await fetch(...)) should be wrapped in a try/catch to prevent network errors
from failing the job; catch any error from fetch, call core.warning with a clear
message including channel and the error.message (and optionally stack), and then
continue so the step does not hide the actual E2E results; keep the existing
resp.ok check inside the try block and on error fall back to core.warning
instead of letting the exception bubble.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 815ee20e-dfca-4485-adeb-5f40386a6013
📒 Files selected for processing (3)
.github/workflows/nightly-e2e.yamlscripts/scorecard/build-slack-blocks.jstest/scorecard-blocks.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- scripts/scorecard/build-slack-blocks.js
- test/scorecard-blocks.test.ts
cv
left a comment
There was a problem hiding this comment.
This could spam the Slack channel too much. Let's discuss it first!
<!-- markdownlint-disable MD041 --> ## Summary Polish the nightly E2E scorecard Slack message: rename routing secrets, inject mode + actor into the title and run-mode line, and switch failed-job names + counts to the GitHub Actions API for reliable hyperlinks and accurate listings on re-runs. ## Related Issue (none — incremental polish on top of #4308) ## Changes - Rename Slack channel tags + secrets to event-type names (`SITUATION_ROOM` / `CI` → `DAILY` / `FULLRUN`) (due to just pushing notification to a unified channel as requirement) - Append `(by *<actor>*)` to the run-mode line for `Manual full run` / `Selective dispatch`. Schedule runs skip it (actor reflects file committer, not trigger). - Inject mode + actor into the title (top-level fallback text). - Use `listJobsForWorkflowRun` (paginated) as source of truth for failed jobs and counts. Strips ` / <suffix>` so reusable-workflow names match caller IDs and hyperlinks render. Last-write-wins dedup so re-runs reflect final conclusion. Falls back to `needs` context on API failure. ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification <!-- Check each item you ran and confirmed. Leave unchecked items you skipped. Doc-only changes do not require npm test unless you ran it. --> - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [ ] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes - [ ] `npm run docs` builds without warnings (doc changes only) - [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) --- <!-- DCO sign-off required by CI. Run: git config user.name && git config user.email --> Signed-off-by: Hung Le <hple@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Improved nightly scorecard reliability by deriving a canonical job set (with API-based fallback) and improving failed-job reporting; scorecard output now includes the workflow actor. * Reorganized Slack routing to use separate webhooks for daily vs full runs while preserving preview gating. * **New Features** * Slack notifications can show the workflow initiator for clearer run context; fallback titles are richer. * **Tests** * Updated tests for new routing and actor-suffix behavior. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Implements the Slack notification piece of the nightly E2E scorecard
(issue #2614). Run-mode-aware routing posts the Block Kit scorecard to
different Slack channels:
#nemoclaw-situation-room(operational alerts)workflow_dispatchwith emptyjobs) →#nemoclaw-ci(team-wide visibility)Related Issue
Refs #2612.
Closes #2614.
Changes
scripts/scorecard/build-slack-blocks.js— pure CommonJS builder for the Slack payload (buildBlocks,buildFallbackText,getStatusColor,getSlackChannel); noactions/github-scriptdependencies so the module stays unit-testable.test/scorecard-blocks.test.ts— 22 unit tests across perfect / failure / selective-dispatch scenarios, fallback text, color mapping, channel routing, and total-ran prefix..github/workflows/nightly-e2e.yaml:scorecardjob sparse-checks outscripts/scorecardso the Slack step canrequire()the builder.Generate nightly scorecardstep emits a structuredscorecardDataJSON output alongside the existing Markdown summary, including per-jobhtml_urlfor failed-jobs hyperlinks (best-effort vialistJobsForWorkflowRun).Post scorecard to Slackstep wraps blocks in a legacy attachment for a coloured left-edge bar (green / red / yellow) signalling run health at a glance.runModecomputed by the generate step — single source of truth, aligns with the trim+filter classification Yaunches already established.<url|name>) pointing directly to each failed job's logs.Total ran: <ran>/<total>for quick eyeball of skip-vs-run ratio.post_to_slackworkflow_dispatchinput — only relevant for selective dispatches (production runs ignore it).An instance
Type of Change
Verification
npx prek run --all-filespassesnpm testpassesnpm run docsbuilds without warnings (doc changes only)Signed-off-by: Hung Le hple@nvidia.com
Summary by CodeRabbit
New Features
Tests