Query and code performance optimizations in services#1527

Merged
madhav165 merged 24 commits into main from query_performance_optimizations
Dec 2, 2025

Conversation

@kevalmahajan kevalmahajan commented Dec 1, 2025

🐛 Bug-fix PR

Closes #1522
Closes #1523

📌 Summary

This PR resolves major performance degradation in gateway_service.py, tool_service.py, and server_service.py.
It also makes gateway health checks concurrent, running them in batches with a configurable batch size.

🐞 Root Cause

  • N+1 DB query patterns when resolving team names and related entities.
  • One query per item for tools, resources, prompts, and servers.
  • Sequential health checks for gateways: total time grows as O(n * t) for n gateways, where t is the latency of a single check.
  • Multiple redundant aggregation queries for metrics (7–8 per request).
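
The N+1 pattern named above can be illustrated with a minimal, self-contained sketch. It uses the standard-library sqlite3 module rather than the project's actual ORM layer, and the teams/tools schema, the sample rows, and both function names are assumptions made for illustration only. It contrasts one lookup per item with a single IN-clause lookup and counts the queries each approach issues.

```python
import sqlite3

# Hypothetical schema and data, not the project's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teams (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE tools (id INTEGER PRIMARY KEY, team_id TEXT);
    INSERT INTO teams VALUES ('t1', 'Platform'), ('t2', 'Data');
    INSERT INTO tools (team_id) VALUES ('t1'), ('t2'), ('t1'), (NULL);
    """
)


def team_names_n_plus_one(conn):
    """One query for the tools, then one extra query per tool (N+1)."""
    tools = conn.execute("SELECT id, team_id FROM tools ORDER BY id").fetchall()
    queries = 1
    names = []
    for _tool_id, team_id in tools:
        row = conn.execute(
            "SELECT name FROM teams WHERE id = ?", (team_id,)
        ).fetchone()
        queries += 1
        names.append(row[0] if row else None)
    return names, queries


def team_names_batched(conn):
    """One query for the tools, then one IN-clause query for all teams."""
    tools = conn.execute("SELECT id, team_id FROM tools ORDER BY id").fetchall()
    queries = 1
    team_ids = sorted({team_id for _, team_id in tools if team_id is not None})
    name_by_id = {}
    if team_ids:
        placeholders = ",".join("?" * len(team_ids))
        rows = conn.execute(
            f"SELECT id, name FROM teams WHERE id IN ({placeholders})", team_ids
        ).fetchall()
        queries += 1
        name_by_id = dict(rows)
    return [name_by_id.get(team_id) for _, team_id in tools], queries


print(team_names_n_plus_one(conn))
print(team_names_batched(conn))
```

Both functions return the same names; only the query count differs, and in the batched version it stays at two regardless of how many tools are listed.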

💡 Fix Description

  1. gateway_service.py
  • Eliminated N+1 queries: batch-fetched team names in list_gateways() and list_gateways_for_user(), reducing the number of lookups from O(n) to O(1).
  • Refactored _update_or_create_tools(), _update_or_create_resources(), and _update_or_create_prompts() to fetch existing entities in bulk (IN clause), reducing queries from O(n) to O(1) for entity creation and updates.
  • Refactored check_health_of_gateways() to use asyncio.gather() for parallel execution:
    check_health_of_gateways() now runs health checks concurrently through asyncio.gather().
    A dynamic concurrency_limit is computed as the minimum of the configured MAX_CONCURRENT_HEALTH_CHECKS and an adaptive value derived from the system's CPU count.
    This keeps the host from being overloaded when the configured number of concurrent checks exceeds what the system can handle.
concurrency_limit = min(settings.max_concurrent_health_checks, max(10, os.cpu_count() * 5))  # adaptive concurrency
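A rough sketch of how batched, concurrent health checks with such a limit might look is given below. It is not the merged implementation: check_one() is a hypothetical stand-in for a real probe, the gateway names are made up, and the `or 1` fallback is an added assumption to cover os.cpu_count() returning None, which the one-liner above does not guard against.

```python
import asyncio
import os


def adaptive_concurrency_limit(configured_max: int) -> int:
    """Minimum of the configured cap and a CPU-based adaptive value."""
    cpu_count = os.cpu_count() or 1  # assumption: fall back to 1 if unknown
    return min(configured_max, max(10, cpu_count * 5))


async def check_one(gateway: str) -> bool:
    """Hypothetical probe: names ending in '-down' are reported unhealthy."""
    await asyncio.sleep(0.01)  # simulate network latency
    return not gateway.endswith("-down")


async def check_all(gateways: list, configured_max: int) -> dict:
    """Run probes in batches of at most `limit`, each batch concurrently."""
    limit = adaptive_concurrency_limit(configured_max)
    results = {}
    for start in range(0, len(gateways), limit):
        batch = gateways[start:start + limit]
        # return_exceptions=True keeps one failing probe from aborting the batch.
        outcomes = await asyncio.gather(
            *(check_one(g) for g in batch), return_exceptions=True
        )
        for gateway, outcome in zip(batch, outcomes):
            results[gateway] = False if isinstance(outcome, BaseException) else outcome
    return results


if __name__ == "__main__":
    print(asyncio.run(check_all(["gw-a", "gw-b-down", "gw-c"], configured_max=2)))
```

With this shape, total wall-clock time is roughly the number of batches times the slowest probe in each batch, instead of the sum of all probe latencies.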
  2. tool_service.py
  • aggregate_metrics(): reduced from 8 separate queries to a single aggregated SQL query (87.5% fewer network round-trips).
  • Batch-fetched team names in list_tools() and list_tools_for_user(), reducing queries from ~101 to ~2 for 100 tools (98% reduction).
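To make the aggregate_metrics() change concrete, here is a minimal sketch of folding several aggregate queries into one SELECT. It uses sqlite3 with a hypothetical tool_metrics table; the column names and the exact set of aggregates are assumptions, since the PR description does not list them.

```python
import sqlite3

# Hypothetical metrics table; column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tool_metrics (
        is_success INTEGER, response_time REAL, timestamp TEXT
    );
    INSERT INTO tool_metrics VALUES
        (1, 0.25, '2025-12-01T10:00:00'),
        (0, 0.50, '2025-12-01T11:00:00'),
        (1, 0.75, '2025-12-01T12:00:00');
    """
)


def aggregate_metrics(conn):
    """Compute every aggregate in one round-trip instead of one per value."""
    row = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN is_success = 1 THEN 1 ELSE 0 END),
               SUM(CASE WHEN is_success = 0 THEN 1 ELSE 0 END),
               MIN(response_time),
               MAX(response_time),
               AVG(response_time),
               MAX(timestamp)
        FROM tool_metrics
        """
    ).fetchone()
    total, ok, failed, min_rt, max_rt, avg_rt, last_ts = row
    ok, failed = ok or 0, failed or 0  # SUM yields NULL on an empty table
    return {
        "total_executions": total,
        "successful_executions": ok,
        "failed_executions": failed,
        "failure_rate": failed / total if total else 0.0,
        "min_response_time": min_rt,
        "max_response_time": max_rt,
        "avg_response_time": avg_rt,
        "last_execution_time": last_ts,
    }


print(aggregate_metrics(conn))
```

Derived values such as the failure rate are computed in Python from the returned row, so they do not need a query of their own.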
  3. server_service.py
  • Batch-fetched team names in list_servers() and list_servers_for_user(), reducing queries from O(n) to O(1) (up to ~100x faster).
  • aggregate_metrics(): reduced from 7 queries to a single query (85.7% reduction, 7x faster).
  • Implemented bulk validation/update queries in register_server() and update_server(), resulting in a ~4.5x speedup for typical cases.
  • Optimized _convert_server_to_read() to calculate metrics in a single pass, reducing iterations over the metrics from 8 to 1 (~8x faster).
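
The single-pass idea behind the _convert_server_to_read() change can be sketched as follows. The Metric dataclass, its fields, and summarize() are hypothetical stand-ins rather than the project's models; the point is only that all statistics are accumulated in one loop instead of one traversal per statistic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Metric:
    """Hypothetical metric record, for illustration only."""
    is_success: bool
    response_time: float


def summarize(metrics: Iterable[Metric]) -> dict:
    """Accumulate counts, min, max, and sum in a single traversal."""
    total = successes = 0
    sum_rt = 0.0
    min_rt: Optional[float] = None
    max_rt: Optional[float] = None
    for m in metrics:
        total += 1
        if m.is_success:
            successes += 1
        sum_rt += m.response_time
        min_rt = m.response_time if min_rt is None else min(min_rt, m.response_time)
        max_rt = m.response_time if max_rt is None else max(max_rt, m.response_time)
    failed = total - successes
    return {
        "total_executions": total,
        "successful_executions": successes,
        "failed_executions": failed,
        "failure_rate": failed / total if total else 0.0,
        "min_response_time": min_rt,
        "max_response_time": max_rt,
        "avg_response_time": sum_rt / total if total else None,
    }


print(summarize([Metric(True, 0.25), Metric(False, 0.5), Metric(True, 0.75)]))
```

Because the loop consumes its input only once, it also works on generators or lazily loaded collections that cannot be iterated repeatedly.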

🧪 Verification

Check | Command | Status
Lint suite | make lint |
Unit tests | make test |
Coverage ≥ 90 % | make coverage |
Manual regression no longer fails | steps / screenshots |

📐 MCP Compliance (if relevant)

  • Matches current MCP spec
  • No breaking change to MCP clients

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • No secrets/credentials committed

@kevalmahajan kevalmahajan marked this pull request as draft December 1, 2025 15:09
@kevalmahajan kevalmahajan marked this pull request as ready for review December 1, 2025 16:53
@madhav165 madhav165 left a comment

Normal functionality works as expected

Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com>
@kevalmahajan kevalmahajan force-pushed the query_performance_optimizations branch from d15df51 to e6b44fa Compare December 2, 2025 11:46
@madhav165 madhav165 merged commit cd6d98d into main Dec 2, 2025
45 checks passed
@madhav165 madhav165 deleted the query_performance_optimizations branch December 2, 2025 11:57
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026
Query and code performance optimizations in services

Development

Successfully merging this pull request may close these issues.

  • [BUG][PERFORMANCE]: Severe performance degradation due to N+1 queries
  • [BUG][PERFORMANCE]: Implement concurrent health checks for gateways

2 participants