Is your feature request related to a problem? Please describe.
Grove needs repeatable, structured measurement of operator reconciliation speed at scale.
Without a baseline, there is no way to detect performance regressions or compare the
impact of different operator configurations (e.g. controller concurrency, K8s client rate limits).
Describe the solution you'd like
A reusable measurement framework for Grove e2e scale tests, along with a first 5000-pod
scale test (Test_ScaleTest_5000_MoE).
The framework tracks phases and milestones (e.g. "pods-ready", "pcs-available") with
wall-clock timestamps, exports structured JSON artifacts for archiving, and captures
operator configuration alongside timing data for full context.
Describe alternatives you've considered
Ad-hoc logging and manual timestamp comparison, which is neither repeatable nor archivable.
Additional context
Design doc: docs/designs/scale-test.md
By submitting this issue, I agree to follow Grove's Code of Conduct and Contributing Guidelines.