Skip to content

[PERFORMANCE]: Reduce CPU cost of detailed request logging #1865

@crivetimihai

Description

@crivetimihai

Summary

Detailed request logging (log_detailed_requests / settings.log_requests) performs expensive operations on every request:

  • Recursive payload masking via mask_sensitive_data()
  • Pretty-printed JSON serialization with json.dumps(..., indent=2)
  • Full request body reads and JSON parsing with stdlib json

Additionally, user identity resolution happens before skip checks and before the detailed logging gate, so it can run even when a request should be skipped (health/static) or when gateway logging is disabled.

Critical Issues Identified

Issue 1: User Identity Resolution Before Skip/Gating (CRITICAL)

Location: mcpgateway/middleware/request_logging_middleware.py:206

The _resolve_user_identity method (lines 146-182) can trigger a database query via get_current_user() if there's no cached user in request.state. It currently runs before:

  • should_skip_request_logging(path) filtering
  • the log_detailed_requests check
  • and even when gateway logging is disabled

Impact: Potential database query for every request (except when request.state.user is already set), plus unnecessary work for skipped paths.

Issue 2: Wrong Configuration Used for Body Size

Location: mcpgateway/main.py:1346

log_max_size_mb (config.py:816) is meant for log file rotation, not for request body truncation.

Setting Default Intended Use Actual Use
log_max_size_mb 1 MB Log file rotation size Body truncation limit
Middleware default 4096 bytes Body truncation Overridden to 1MB

When detailed logging is enabled, up to 1MB of request body can be parsed, masked recursively, and pretty-printed per request.

Issue 3: Uses stdlib json Instead of orjson

Location: mcpgateway/middleware/request_logging_middleware.py:16, 282-285

  • json.loads: ~1.5-2x slower than orjson
  • json.dumps(indent=2): ~5-6x slower than orjson

Issue 4: Recursive Masking Without Depth Limit

Location: mcpgateway/middleware/request_logging_middleware.py:44-57

  • O(n) traversal where n = total nodes in nested structure
  • No depth limit protection (could stack overflow on deeply nested payloads)

Impact Analysis

Factor Impact Severity
User identity resolution DB query on most requests; avoidable for skipped paths CRITICAL
Body size (1MB default) Parsing/masking up to 1MB per request HIGH
JSON parsing 1.5-2x slower than orjson MEDIUM
JSON serialization (indent=2) 5-6x slower than orjson HIGH
Recursive masking O(n) per request MEDIUM

Proposed Solution

Phase 1: Critical Fixes (High Priority)

  1. Move user identity resolution after checks - Only resolve when actually logging
  2. Add dedicated body size config - log_detailed_max_body_size (default 16KB)
  3. Switch to orjson - Replace stdlib json with orjson

Phase 2: Additional Optimizations (Medium Priority)

  1. Size-based fast path - Check content-length header before reading large bodies
  2. Configurable endpoint exclusions - log_detailed_skip_endpoints
  3. Fix db_query_logging.py - Also uses stdlib json

Phase 3: Masking Optimization (Lower Priority)

  1. Add depth limit to mask_sensitive_data()
  2. Use frozenset for SENSITIVE_KEYS

New Configuration Options

log_detailed_max_body_size: int = Field(default=16384)  # 16KB
log_detailed_skip_endpoints: List[str] = Field(default=[])
log_resolve_user_identity: bool = Field(default=False)  # Opt-in DB fallback
log_detailed_sample_rate: float = Field(default=1.0)  # Optional sampling

Files Requiring Changes

File Priority
mcpgateway/middleware/request_logging_middleware.py CRITICAL/HIGH
mcpgateway/config.py HIGH
mcpgateway/main.py HIGH
mcpgateway/middleware/db_query_logging.py MEDIUM
tests/unit/mcpgateway/middleware/test_request_logging_middleware.py HIGH

Expected Performance Improvement

Metric Before After Improvement
DB queries (logging disabled) 1 per request 0 100% reduction
Serialization (large) 100ms 16-18ms 5-6x faster
Deserialization 50ms 23-33ms 1.5-2x faster
Max body processed 1MB 16KB 64x less data

Overall: 50-80% reduction in logging overhead depending on configuration.

Acceptance Criteria

  • User identity resolution moved after logging checks
  • Add log_detailed_max_body_size config option (default 16KB)
  • Fix main.py to use new config instead of log_max_size_mb
  • Replace json with orjson in request_logging_middleware.py
  • Add size-based fast path for very large bodies
  • Add depth limit to mask_sensitive_data()
  • Replace json.dumps with orjson.dumps in db_query_logging.py
  • Update unit tests
  • Document new config options in .env.example, docs, and Helm chart

References

  • ADR-017: docs/docs/architecture/adr/017-adopt-orjson-json-serialization.md
  • Benchmark: scripts/benchmark_json_serialization.py
  • Detailed analysis: todo/1808-reduce-detailed-logging.md

Metadata

Metadata

Labels

databaseperformancePerformance related itemspythonPython / backend development (FastAPI)

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions