Scale Test CI #575

Merged
shayasoolin merged 24 commits into ai-dynamo:main from shayasoolin:scale-test-ci
May 13, 2026

@shayasoolin shayasoolin commented May 3, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Adds the first version of Grove scale-test CI history tracking.

This PR adds:

  • operator/hack/scale-history.py for storing scale-test results in a history branch.
  • A compact index/metrics.ndjson metric index plus raw scale-test-results.json storage.
  • A minimal static dashboard under operator/hack/scale-dashboard/.
  • A scheduled/manual GitHub Actions workflow that runs the scale test and stores successful results.
  • README docs for local ingestion and dashboard validation.
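
To illustrate the storage layout listed above, here is a minimal sketch of how a compact NDJSON index might sit next to the raw result files. It is not taken from `scale-history.py`: the `runs/<run_id>/` directory, the `timings` key, the record fields (`run_id`, `metric`, `seconds`), and the helper names `append_metrics` and `read_index` are all assumptions made for illustration.

```python
import json
from pathlib import Path


def append_metrics(history_dir: Path, run_id: str, raw_result: dict) -> int:
    """Store the raw result and append one index line per timing metric.

    Hypothetical layout: raw JSON under runs/<run_id>/ and the compact
    index in index/metrics.ndjson. Returns the number of lines appended.
    """
    raw_path = history_dir / "runs" / run_id / "scale-test-results.json"
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    raw_path.write_text(json.dumps(raw_result, indent=2), encoding="utf-8")

    index_path = history_dir / "index" / "metrics.ndjson"
    index_path.parent.mkdir(parents=True, exist_ok=True)

    written = 0
    with index_path.open("a", encoding="utf-8") as index:
        # "timings" is an assumed key; the real result schema may differ.
        for metric, seconds in sorted(raw_result.get("timings", {}).items()):
            record = {"run_id": run_id, "metric": metric, "seconds": seconds}
            index.write(json.dumps(record) + "\n")
            written += 1
    return written


def read_index(history_dir: Path) -> list[dict]:
    """Load every non-empty line of the NDJSON index as a dict."""
    index_path = history_dir / "index" / "metrics.ndjson"
    with index_path.open(encoding="utf-8") as index:
        return [json.loads(line) for line in index if line.strip()]
```

One reason to prefer an append-only, line-per-record index is that a static dashboard can load the small index file without parsing every raw result.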

The workflow disables routine profiling/pprof collection, compiles the scale test before creating the KWOK cluster, and uploads short-retention diagnostics on failure.

Which issue(s) this PR fixes:

Fixes #550

Special notes for your reviewer:

The workflow is intentionally manual/scheduled only for v1. It does not run on PR labels.

The history branch is expected to be public and served by GitHub Pages. Routine runs store only timing history and raw scale-test JSON, not profiling artifacts.

Validation done:

  • Parsed workflow YAML.
  • Ran git diff --check.
  • Verified scale-history.py --help.
  • Ran scale-history.py local on a sample result and confirmed duplicate ingestion is skipped.
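
The duplicate check mentioned in the last validation step could work along these lines. This is a sketch under assumptions, not the logic in `scale-history.py`: it keys each result on a SHA-256 digest of its canonical JSON form, and the names `result_digest` and `ingest` are hypothetical.

```python
import hashlib
import json


def result_digest(result: dict) -> str:
    """Digest of the canonical JSON form, so key order does not matter."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(seen: set[str], result: dict) -> bool:
    """Return True if the result is new, False if it is skipped as a duplicate."""
    digest = result_digest(result)
    if digest in seen:
        return False
    seen.add(digest)
    return True
```

With a check of this kind, re-running ingestion on the same sample result leaves the history unchanged, which is the behavior the validation step confirms.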

Does this PR introduce an API change?

NONE

Additional documentation e.g., enhancement proposals, usage docs, etc.:

operator/hack/README.md documents scale-history.py local/branch usage and local dashboard validation.

copy-pr-bot Bot commented May 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@shayasoolin shayasoolin changed the title from "Scale test ci" to "Scale Test CI" May 3, 2026
Comment thread operator/hack/README.md
Comment thread operator/hack/scale-dashboard/app.js Outdated
danbar2 previously approved these changes May 6, 2026

@gflarity gflarity left a comment

Nice!

Comment thread operator/hack/scale-history.py
Comment thread .github/workflows/scale-test-ci.yaml
Comment thread operator/hack/scale-history.py Outdated
Comment thread operator/hack/scale-history.py Outdated
Comment thread operator/hack/scale-history.py Outdated
@shayasoolin shayasoolin merged commit cd742f0 into ai-dynamo:main May 13, 2026
15 checks passed
Development

Successfully merging this pull request may close these issues.

Scheduled CI: scale-test trend tracking and comparison

5 participants