-
Notifications
You must be signed in to change notification settings - Fork 614
[PERFORMANCE]: Fix PostgreSQL 'Idle in Transaction' Connection Issue #1862
Copy link
Copy link
Open
Labels
SHOULDP2: Important but not vital; high-value items that are not crucial for the immediate releaseP2: Important but not vital; high-value items that are not crucial for the immediate releasedatabaseperformancePerformance related itemsPerformance related items
Milestone
Description
Summary
Under load, 50-60+ database connections are stuck in "idle in transaction" state, holding transactions open unnecessarily. This wastes connection pool capacity and can cause lock contention.
Problem
Observed during high-concurrency load testing:
SELECT state, count(*) FROM pg_stat_activity WHERE datname = 'mcp' GROUP BY state;
state | count
---------------------+-------
idle in transaction | 62 -- ⚠️ Holding transactions open
idle | 44
active | 4Wait event: ClientRead - Waiting for Python to send query or close transaction
Affected queries (all SELECTs):
SELECT email_team_members...SELECT a2a_agents...SELECT gateways...SELECT prompts...
Root Cause
SQLAlchemy is configured with autocommit=False:
# mcpgateway/db.py:415
SessionLocal = sessionmaker(
autocommit=False, # ← Every query starts a transaction
autoflush=False,
expire_on_commit=False,
bind=engine
)When a request handler executes a SELECT query, an implicit BEGIN starts. The transaction stays open until the request completes.
Impact
| Issue | Impact |
|---|---|
| Connection pool waste | 62 of 100 connections tied up |
| Lock contention | Long transactions hold row locks |
| Replication lag | Open transactions prevent WAL cleanup |
Proposed Solutions
Option 1: Use AUTOCOMMIT for Read-Only Operations (Recommended)
read_engine = create_engine(
settings.database_url,
isolation_level="AUTOCOMMIT", # No transactions for SELECTs
)
def get_read_db() -> Generator[Session, Any, None]:
"""Read-only session with AUTOCOMMIT."""
db = ReadOnlySessionLocal()
try:
yield db
finally:
db.close()Option 2: Set PostgreSQL idle_in_transaction_session_timeout
postgres:
command: >
postgres
-c idle_in_transaction_session_timeout=30000Recommended Implementation
- Immediate: Add
idle_in_transaction_session_timeout=30000to PostgreSQL - Short-term: Create
get_read_db()dependency for read-only endpoints - Audit: Migrate read-only endpoints to use
get_read_db()
Acceptance Criteria
-
idle in transactionconnections reduced from 60+ to <5 - No increase in query errors
- PostgreSQL
idle_in_transaction_session_timeoutconfigured -
get_read_db()dependency created for read-only operations - Major read endpoints migrated to use
get_read_db()
Files to Modify
| File | Change |
|---|---|
docker-compose.yml |
Add PostgreSQL timeout setting |
mcpgateway/db.py |
Add read-only session factory |
mcpgateway/routers/*.py |
Migrate GET endpoints to get_read_db() |
Performance Impact
| Metric | Before | After (Expected) |
|---|---|---|
| Idle in transaction | 60+ | <5 |
| Available connections | ~40 | ~95 |
| Query throughput | Limited | +20-30% |
Related
- [PERFORMANCE]: Connection Pool Health Monitoring and Readiness Integration #1856 - Connection Pool Health Monitoring
- [PERFORMANCE]: Async Database Logging to Prevent Feedback Loop Under Load #1857 - Async Database Logging
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
SHOULDP2: Important but not vital; high-value items that are not crucial for the immediate releaseP2: Important but not vital; high-value items that are not crucial for the immediate releasedatabaseperformancePerformance related itemsPerformance related items