Pruner e2e test#3797
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Dependency ReviewThe following issues were found:
|
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive end-to-end testing infrastructure for the Rooch pruner functionality. The implementation includes TypeScript test utilities for collecting and validating Prometheus metrics, a Move test contract for simulating object lifecycle operations, and detailed documentation.
Key Changes
- Added Prometheus metrics client and test reporter utilities for E2E validation of pruner behavior
- Created a Move contract (
pruner_test::object_lifecycle) to test object creation, updates, and deletion scenarios that trigger pruning - Integrated pruner metrics into the RPC server and refactored metrics from
StateDBMetricsto a dedicatedPrunerMetricsmodule
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/typescript/test-suite/src/utils/prometheus-client.ts | New Prometheus client for parsing and extracting pruner metrics with histogram, gauge, and counter support |
| sdk/typescript/test-suite/src/utils/test-reporter.ts | Report generator and formatter for pruner E2E test results |
| sdk/typescript/test-suite/src/testbox.ts | Extended to support server arguments and metrics port configuration for pruner testing |
| sdk/typescript/test-suite/src/e2e/pruner-e2e.spec.ts | Complete E2E test suite exercising counter churn and object lifecycle scenarios |
| examples/pruner_test/sources/object_lifecycle.move | Move module for testing object CRUD operations to generate prunable state |
| examples/pruner_test/Move.toml | Package configuration for the pruner test module |
| moveos/moveos-store/src/state_store/metrics.rs | Removed pruner metrics (moved to dedicated module) |
| crates/rooch-rpc-server/src/lib.rs | Initialized and wired PrunerMetrics to the state pruner |
| crates/rooch-pruner/src/pruner.rs | Updated bloom filter metric to use proper label value |
| package.json | Downgraded pnpm version (needs verification) |
| docs/dev-guide/pruner_e2e_testing_guide.md | Comprehensive 730-line testing guide with examples and architecture diagrams |
Comments suppressed due to low confidence (1)
sdk/typescript/test-suite/src/testbox.ts:149
- [nitpick] The
metricsPortis generated but never passed as a command-line argument to the Rooch server. The metrics port is only set via an environment variableMETRICS_HOST_PORT=${metricsPort}at line 148. While this may work if the Rooch server reads this environment variable, it would be more explicit and maintainable to also pass it as a command-line argument like--metrics-port ${metricsPort}if the server supports it. Verify that the environment variable approach is the intended configuration method.
const metricsPort = await getUnusedPort()
const cmds = ['server', 'start', '-n', 'local', '-d', 'TMP', '--port', port.toString()]
if (serverArgs.length > 0) {
cmds.push(...serverArgs)
}
if (this.bitcoinContainer) {
cmds.push(
...[
'--btc-rpc-url',
this.bitcoinContainer.getRpcUrl(),
'--btc-rpc-username',
this.bitcoinContainer.getRpcUser(),
'--btc-rpc-password',
this.bitcoinContainer.getRpcPass(),
'--btc-sync-block-interval',
'1',
],
)
}
cmds.push('--traffic-per-second', '1')
cmds.push('--traffic-burst-size', '5000')
const result: string = await this.roochAsyncCommand(
cmds,
`JSON-RPC HTTP Server start listening 0.0.0.0:${port}`,
[`METRICS_HOST_PORT=${metricsPort}`],
)
| for (const part of labelStr.split(',')) { | ||
| const [k, v] = part.split('=') | ||
| if (k && v) { | ||
| labels[k.trim()] = v.replace(/^"|"$/g, '').trim() | ||
| } |
There was a problem hiding this comment.
The label matching in findSample should use === for strict equality, but it currently does. However, there's a potential bug: the Prometheus format typically uses double quotes around label values (e.g., phase="SweepExpired"), but the regex at line 137 removes quotes. If a metric has a label value that naturally contains quotes or special characters, this simple replacement may not work correctly. Consider a more robust label parsing approach.
| for (const part of labelStr.split(',')) { | |
| const [k, v] = part.split('=') | |
| if (k && v) { | |
| labels[k.trim()] = v.replace(/^"|"$/g, '').trim() | |
| } | |
| // Robustly parse label key="value" pairs, handling escaped quotes | |
| const labelRegex = /(\w+)\s*=\s*"((?:[^"\\]|\\.)*)"/g; | |
| let match; | |
| while ((match = labelRegex.exec(labelStr)) !== null) { | |
| const key = match[1]; | |
| // Unescape any escaped characters in the value | |
| const rawValue = match[2]; | |
| const value = rawValue.replace(/\\"/g, '"').replace(/\\\\/g, '\\'); | |
| labels[key] = value; |
Core Changes: - Add atomic snapshot manager to ensure consistency across prune phases - Implement error recovery with exponential backoff - Add comprehensive validation and testing framework - Integrate atomic snapshot into StatePruner lifecycle - Add E2E tests with aggressive configuration Technical Details: - AtomicSnapshotManager coordinates snapshot creation, locking, and validation - Three-phase pruning (BuildReach, SweepExpired, IncrementalSweep) now use consistent state - ErrorRecoveryManager handles failures with retry logic and system health checks - ValidationTests provide comprehensive test suite for snapshot integrity - E2E tests verify 0-day protection window with atomic safety Performance Improvements: - 4x faster cleanup intervals (60s → 15s) - Zero "Missing node" errors with aggressive configuration - Atomic snapshots protect 163+ reachable nodes during cleanup - Eliminates time window race conditions Testing: - Comprehensive unit tests for atomic snapshot functionality - E2E validation with real contract deployment and execution - Performance benchmarks showing consistency guarantees - Error recovery and system health verification 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Fix metrics port handling in PrometheusClient (accept dynamic port) - Improve Prometheus regex pattern to handle escaped quotes properly - Add comprehensive error handling for metrics fetch with HTTP status validation - Simplify gauge metric fallback to avoid wrong metric matches - Fix package version downgrade: update pnpm@9.15.4 consistently - Fix TypeScript/E2E test dependencies: use npx wait-on for cross-platform compatibility - Resolve all clippy warnings and compilation errors across Rust codebase - Restore and fix atomic_snapshot_test.rs: remove unsafe code, create practical unit tests - Add proper error messages and debugging information 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Summary about this PR