[BUG][PERFORMANCE]: RBAC middleware holds DB sessions for entire request lifecycle causing pool exhaustion #2318
Closed
Labels: bug, database, performance, python, rbac
Description
Summary
The get_current_user_with_permissions() function acquires a database session via Depends(get_db) at request start and stores it in the returned user context dict. This session is held for the entire request lifecycle, including during slow async operations (MCP backend calls, template rendering, plugin hooks). Under high load, this causes "idle in transaction" connections to accumulate, leading to connection pool exhaustion and cascading failures.
Symptoms Under Load (4000 concurrent users)
| Metric | Before Failure | During Failure |
|---|---|---|
| RPS | 2000+ | 280 |
| Failure rate | 0.1% | 10.71% |
| Gateway health | 3/3 healthy | 1/3 healthy |
| Idle-in-transaction | 40-50 | 433 |
| Max transaction age | 1-2s | 243 seconds |
Error Messages
QueuePool limit of size 20 overflow 10 reached, connection timed out, timeout 60.00
OperationalError: (psycopg.errors.ProtocolViolation) idle transaction timeout
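As a rough sanity check on the first error, Little's law (average connections in use = request rate × average time each request holds a connection) shows why hold time matters more than request rate. The sketch below is illustrative arithmetic only: the 2000 RPS figure and the pool limits come from this issue, while the hold times and the assumption that every request needs exactly one connection are assumptions. The limit applies per pool, and the number of pools (workers, gateway instances) is not stated here.

```python
def connections_in_use(requests_per_second: float, hold_seconds: float) -> float:
    """Little's law: average number of connections held at the same time."""
    return requests_per_second * hold_seconds


POOL_LIMIT = 20 + 10  # pool size + overflow, taken from the QueuePool error

# Hold times below are illustrative assumptions, not measurements.
for hold in (0.01, 1.0, 30.0):
    needed = connections_in_use(2000, hold)
    verdict = "fits in" if needed <= POOL_LIMIT else "exceeds"
    print(f"hold={hold}s: ~{needed:,.0f} connections, {verdict} one pool of {POOL_LIMIT}")
```

Holding a connection only for a short permission check keeps demand near the size of a single pool, while holding it across a slow call multiplies demand by the call duration.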
Root Cause
The RBAC middleware holds database sessions for entire request lifetimes:
# mcpgateway/middleware/rbac.py
async def get_current_user_with_permissions(..., db: Session = Depends(get_db)):
    return {
        "db": db,  # Session stored in user context, held until request completes
    }

Session lifecycle:
- Request arrives → session acquired via Depends(get_db)
- Permission check runs (fast, ~10ms)
- Session stays open during: slow backend calls (1-30s), template rendering
- Session released only when request completes
- Under load: sessions accumulate in "idle in transaction" state
- Pool exhaustion: new requests timeout waiting for connections
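The lifecycle above can be reproduced without a database. The following is a minimal simulation, not the gateway's code: an asyncio.Semaphore stands in for the connection pool, and the pool size, request count, and sleep durations are all made-up values chosen so the script finishes quickly.

```python
import asyncio
import time

POOL_SIZE = 5            # hypothetical, far smaller than a real pool
REQUESTS = 50            # hypothetical number of concurrent requests
CHECK_SECONDS = 0.001    # stands in for the fast permission check
BACKEND_SECONDS = 0.05   # stands in for a slow backend call


async def simulate(hold_for_whole_request: bool) -> float:
    """Run REQUESTS concurrent requests and return the elapsed wall time."""
    pool = asyncio.Semaphore(POOL_SIZE)  # each slot models one DB connection

    async def handle_request() -> None:
        async with pool:
            await asyncio.sleep(CHECK_SECONDS)
            if hold_for_whole_request:
                # Current behaviour: connection stays checked out while waiting.
                await asyncio.sleep(BACKEND_SECONDS)
        if not hold_for_whole_request:
            # Proposed behaviour: connection already returned to the pool.
            await asyncio.sleep(BACKEND_SECONDS)

    start = time.perf_counter()
    await asyncio.gather(*(handle_request() for _ in range(REQUESTS)))
    return time.perf_counter() - start


if __name__ == "__main__":
    held = asyncio.run(simulate(hold_for_whole_request=True))
    released = asyncio.run(simulate(hold_for_whole_request=False))
    print(f"held for whole request: {held:.2f}s, released after check: {released:.2f}s")
```

When the slot is held across the slow call, requests queue behind the pool in batches; when it is released after the check, the slow calls overlap and total time drops accordingly.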
Proposed Solution
- Remove "db" from the get_current_user_with_permissions() return value
- Update permission decorators to use fresh_db_session() for short-lived permission checks
- Sessions released immediately after permission check, before slow operations
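A minimal sketch of what the proposed pattern could look like. Only the name fresh_db_session() comes from this issue; its signature, the FakeSession stand-in (used so the example runs without SQLAlchemy or a database), the require_permission decorator, and the permission rule are all hypothetical.

```python
import asyncio
import functools
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session so the sketch runs without a database."""

    open_count = 0  # number of sessions currently open

    def __init__(self) -> None:
        FakeSession.open_count += 1

    def has_permission(self, email: str, permission: str) -> bool:
        return permission == "tools.read"  # hypothetical rule for the demo

    def commit(self) -> None:
        pass

    def rollback(self) -> None:
        pass

    def close(self) -> None:
        FakeSession.open_count -= 1


@contextmanager
def fresh_db_session():
    """Hypothetical helper: open a session, commit or roll back, always close."""
    session = FakeSession()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def require_permission(permission: str):
    """Hypothetical decorator: check the permission with a short-lived session."""

    def decorator(handler):
        @functools.wraps(handler)
        async def wrapper(user: dict, *args, **kwargs):
            with fresh_db_session() as db:
                allowed = db.has_permission(user["email"], permission)
            # The session is closed here, before any slow work starts.
            if not allowed:
                raise PermissionError(permission)
            return await handler(user, *args, **kwargs)

        return wrapper

    return decorator


@require_permission("tools.read")
async def list_tools(user: dict) -> int:
    await asyncio.sleep(0.01)      # stands in for a slow backend call
    return FakeSession.open_count  # sessions still open during the slow part
```

Calling `asyncio.run(list_tools({"email": "user@example.com"}))` reports how many sessions were open while the slow part ran, which should be zero under this pattern because the user context no longer carries a session.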
Files Affected
- mcpgateway/middleware/rbac.py - Core fix
- mcpgateway/admin.py - Remove user["db"] reference
- Test fixtures - Remove "db" from mock user contexts
Testing
- Unit tests: Verify permission decorators use fresh sessions
- Integration tests: Verify permission checks still work
- Load tests: Verify stable idle-in-transaction count under 4000 users
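For the load-test check, one possible approach is to poll PostgreSQL's pg_stat_activity view during the run and assert that the idle-in-transaction count and transaction age stay bounded. The helper names and thresholds below are hypothetical and would need tuning; the sample rows are synthetic, and executing the query against a real database is left out.

```python
from datetime import datetime, timedelta, timezone

# Query a load test could poll; pg_stat_activity is a standard PostgreSQL view.
IDLE_IN_TX_SQL = """
SELECT state, xact_start
FROM pg_stat_activity
WHERE state = 'idle in transaction'
"""


def summarize(rows, now):
    """Return (count, max transaction age in seconds) for idle-in-transaction rows."""
    ages = [
        (now - xact_start).total_seconds()
        for state, xact_start in rows
        if state == "idle in transaction" and xact_start is not None
    ]
    return len(ages), max(ages, default=0.0)


def is_stable(samples, max_count, max_age_seconds):
    """True if every (count, age) sample stays within the chosen thresholds."""
    return all(c <= max_count and a <= max_age_seconds for c, a in samples)


# Synthetic rows shaped like the query result, for illustration only.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
rows = [
    ("idle in transaction", now - timedelta(seconds=2)),
    ("idle in transaction", now - timedelta(seconds=1)),
]
count, max_age = summarize(rows, now)
print(count, max_age, is_stable([(count, max_age)], max_count=60, max_age_seconds=5))
```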