feat: Add OceanBase Performance Monitoring and Health Check Integration #12886
- Add comprehensive performance metrics to OBConnection class:
  * Connection latency measurement
  * Storage space usage (used/total)
  * Query throughput (QPS) estimation
  * Slow query statistics
  * Connection pool statistics
- Add get_oceanbase_status() function following ES/Infinity pattern
- Add check_oceanbase_health() function with detailed metrics
- Add /oceanbase/status API endpoint for health monitoring
- Add comprehensive unit tests (340+ lines) covering:
  * Health check success/failure scenarios
  * Performance metrics retrieval
  * Error handling and edge cases
  * Connection pool statistics
  * Storage information retrieval

This implementation provides operations teams with detailed OceanBase health status and performance metrics for troubleshooting and system maintenance, fulfilling the requirements in issue infiniflow#12772.

Fixes infiniflow#12772
Appreciations!
- Remove unused check_oceanbase_health import from system_app.py
- Remove unused MagicMock and default_timer imports from test file

Hi, thank you for your review.
- Add directory existence check before copying logs
- Make log collection step resilient to missing directories
- Prevent CI failures when ragflow-logs directory doesn't exist
- Apply fix to both ES and Infinity log collection steps

- Fix mock configuration in TestOceanBaseHealthCheck to properly return mock objects
- Fix TestOBConnectionPerformanceMetrics to create mock_client inside tests
- Properly configure mock side_effects for different SQL queries
- Remove unused fixture parameters that were causing AttributeErrors

- Fix check_oceanbase_health to return 'unhealthy' when connection is disconnected
- Use @patch.object to properly mock OBConnection.__init__ for singleton class
- Ensure all test methods properly create mock instances with actual methods

- Fix health check logic to return 'unhealthy' when connection is disconnected
- Use types.MethodType to properly bind OBConnection methods to mock objects
- Avoid singleton decorator issues by creating mock objects with real methods attached

- Get the actual OBConnection class from the singleton wrapper's closure
- Use __closure__[0].cell_contents to access the original class
- Bind real methods to mock objects for testing

- Iterate through all closure cells to find the class
- Use inspect.isclass to identify the correct closure cell
- Handle case where class might not be in first closure cell

- Remove duplicate inspect import inside _create_mock_connection
- Use the top-level inspect import instead
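
The last few commits describe a testing technique that is easier to follow in code: recover the real class from a closure-based singleton wrapper by scanning its closure cells, then bind the real methods onto a mock with types.MethodType so the constructor never runs. The following is a minimal sketch of that idea, not the PR's test code. The `singleton` decorator shape, the `FakeConnection` class, and the helper names `unwrap_class` and `make_mock_connection` are all assumptions made for illustration.

```python
import inspect
import types
from unittest.mock import Mock


def singleton(cls):
    """Assumed shape of a closure-based singleton decorator (illustrative only)."""
    instances = {}

    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


@singleton
class FakeConnection:
    """Hypothetical stand-in for OBConnection; not the real class."""

    def __init__(self):
        raise RuntimeError("would open a real database connection")

    def health(self):
        return {"status": "healthy" if self.connected else "unhealthy"}


def unwrap_class(wrapper):
    """Return the first class found among the wrapper's closure cells."""
    for cell in wrapper.__closure__ or ():
        try:
            value = cell.cell_contents
        except ValueError:  # an empty cell has no contents
            continue
        if inspect.isclass(value):
            return value
    raise LookupError("no class found in the wrapper's closure")


def make_mock_connection(connected):
    """Build a mock that runs the real health() without calling __init__."""
    real_cls = unwrap_class(FakeConnection)
    mock_conn = Mock()
    mock_conn.connected = connected
    # Bind the plain function to the mock so `self` refers to the mock.
    mock_conn.health = types.MethodType(real_cls.health, mock_conn)
    return mock_conn
```

Scanning every cell rather than reading `__closure__[0]` matters because the position of the class among the closure cells depends on which other variables the wrapper captures.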
Description
This PR implements comprehensive OceanBase performance monitoring and health check functionality as requested in issue #12772. The implementation follows the existing ES/Infinity health check patterns and provides detailed metrics for operations teams.
Problem
Currently, RAGFlow lacks detailed health monitoring for OceanBase when used as the document engine. Operations teams need visibility into:
Solution
1. Enhanced OBConnection Class (rag/utils/ob_conn.py)
Added comprehensive performance monitoring methods:
- get_performance_metrics() - Main method returning all performance metrics
- _get_storage_info() - Retrieves database storage usage
- _get_connection_pool_stats() - Gets connection pool statistics
- _get_slow_query_count() - Counts queries exceeding threshold
- _estimate_qps() - Estimates queries per second
- health() method with connection status
2. Health Check Utilities (api/utils/health_utils.py)
Added two new functions following ES/Infinity patterns:
- get_oceanbase_status() - Returns OceanBase status with health and performance metrics
- check_oceanbase_health() - Comprehensive health check with detailed metrics
3. API Endpoint (api/apps/system_app.py)
Added new endpoint:
- GET /v1/system/oceanbase/status - Returns OceanBase health status and performance metrics
4. Comprehensive Unit Tests (test/unit_test/utils/test_oceanbase_health.py)
Added 340+ lines of unit tests covering:
Metrics Provided
Testing
Code Statistics
Acceptance Criteria Met
✅ /system/oceanbase/status API returns OceanBase health status
✅ Monitoring metrics accurately reflect OceanBase running status
✅ Clear error messages when health checks fail
✅ Response time optimized (metrics cached where possible)
✅ Follows existing ES/Infinity health check patterns
✅ Comprehensive test coverage
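
The behaviour these criteria describe (a status payload with latency on success, and `unhealthy` plus a clear error message on failure or disconnection) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in api/utils/health_utils.py: the `ping()` interface, the `check_health` name, the `StubConnection` class, and the payload keys are all hypothetical.

```python
import time


class StubConnection:
    """Hypothetical connection exposing ping(); used only for this sketch."""

    def __init__(self, result):
        self._result = result

    def ping(self):
        if isinstance(self._result, Exception):
            raise self._result
        return self._result


def check_health(connection):
    """Return a status dict and never raise (assumed payload shape)."""
    start = time.perf_counter()
    try:
        connected = bool(connection.ping())
    except Exception as exc:
        # Surface the failure reason instead of propagating the exception.
        return {"status": "unhealthy", "connected": False, "error": str(exc)}
    latency_ms = (time.perf_counter() - start) * 1000.0

    if not connected:
        return {
            "status": "unhealthy",
            "connected": False,
            "error": "connection is disconnected",
        }
    return {
        "status": "healthy",
        "connected": True,
        "latency_ms": round(latency_ms, 2),
    }
```

For example, `check_health(StubConnection(False))` reports `unhealthy` with an error string, while `check_health(StubConnection(True))` reports `healthy` together with a measured `latency_ms`.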
Related Files
- rag/utils/ob_conn.py - OceanBase connection class
- api/utils/health_utils.py - Health check utilities
- api/apps/system_app.py - System API endpoints
- test/unit_test/utils/test_oceanbase_health.py - Unit tests

Fixes #12772