feat(e2e): add scale test measurement infrastructure

**Is your feature request related to a problem? Please describe.**
Grove needs repeatable, structured measurement of operator reconciliation speed at scale.
Without a baseline, there is no way to detect performance regressions or compare the
impact of different operator configurations (e.g. controller concurrency, K8s client rate limits).

**Describe the solution you'd like**
A reusable measurement framework for Grove e2e scale tests, along with a first 5000-pod
scale test (`Test_ScaleTest_5000_MoE`).

The framework tracks phases and milestones (e.g. "pods-ready", "pcs-available") with
wall-clock timestamps, exports structured JSON artifacts for archiving, and captures
operator configuration alongside timing data for full context.

**Describe alternatives you've considered**
Ad-hoc logging and manual timestamp comparison, which is neither repeatable nor archivable.

**Additional context**
Design doc: `docs/designs/scale-test.md`

---
By submitting this issue, I agree to follow Grove's [Code of Conduct](https://github.com/ai-dynamo/grove/blob/main/CODE_OF_CONDUCT.md) and [Contributing Guidelines](https://github.com/ai-dynamo/grove/blob/main/CONTRIBUTING.md).

feat(e2e): add scale test measurement infrastructure #483
