[TESTING]: Locust load test reports false failures for 409 Conflict on state change endpoints #2566

@crivetimihai

✅ Test Summary

The Locust load test (tests/loadtest/locustfile.py) incorrectly reports failures when state change endpoints return 409 Conflict. This is expected behavior under high concurrency due to optimistic locking, not an actual failure.


🧪 Test Type

  • Integration / end-to-end tests
  • Other: Load testing (Locust)

🧬 Scope & Affected Components

  • mcpgateway core (API logic, handlers)
  • Other: Load testing infrastructure

🐞 Problem

When running make load-test-ui with 4000 concurrent users, the test reports more than 100 failures such as:

CatchResponseError('Expected [200, 403, 404], got 409')

Affected endpoints:

Endpoint                  Occurrences
─────────────────────────────────────
/servers/[id]/state       ~91
/tools/[id]/state         ~25
/resources/[id]/state     ~3

🔍 Root Cause

The 409 errors occur due to race conditions when multiple users try to toggle the same entity's state simultaneously:

Time    User A                      User B                      Result
────────────────────────────────────────────────────────────────────────
T1      POST /servers/abc/state     POST /servers/abc/state     
T2      Read: enabled=true          Read: enabled=true          
T3      Write: enabled=false ✅     (waiting for lock)          
T4                                  Write conflict → 409 ❌      

The server correctly returns 409 Conflict to prevent lost updates (optimistic locking). This is correct behavior, not a bug.
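
The timeline above can be reproduced with a small, standalone sketch of a version-checked write. This is not the gateway's implementation: `Entity`, `ConflictError`, `set_state`, and `toggle_status` are hypothetical names, and the version counter is an assumption about how the conflict is detected.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when a write is based on a stale version (maps to HTTP 409)."""


@dataclass
class Entity:
    """Hypothetical stand-in for a server/tool/resource record."""
    enabled: bool = True
    version: int = 1


def set_state(entity: Entity, enabled: bool, expected_version: int) -> None:
    """Apply the write only if nobody else has written since we read."""
    if entity.version != expected_version:
        raise ConflictError(
            f"stale version {expected_version}, current is {entity.version}"
        )
    entity.enabled = enabled
    entity.version += 1


def toggle_status(entity: Entity, seen_version: int, seen_enabled: bool) -> int:
    """Toggle based on a previously read snapshot; return an HTTP-style status."""
    try:
        set_state(entity, not seen_enabled, seen_version)
        return 200
    except ConflictError:
        return 409


server = Entity()

# T2: both users read the same snapshot (enabled=True, version=1).
snapshot_a = (server.version, server.enabled)
snapshot_b = (server.version, server.enabled)

# T3: user A writes first and succeeds.
status_a = toggle_status(server, *snapshot_a)
# T4: user B writes from the stale snapshot and is rejected.
status_b = toggle_status(server, *snapshot_b)

print(status_a, status_b)
```

In this sketch the second write is rejected rather than silently overwriting the first, which is the lost-update protection the 409 signals.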

🔧 Fix

The state change functions in locustfile.py need to include 409 in their allowed response codes:

Current code (lines 1261, 1276, 1291, 1306, 1321):

self._validate_json_response(response, allowed_codes=[200, 403, 404])

Fixed code:

self._validate_json_response(response, allowed_codes=[200, 403, 404, 409])

Functions to update:

  • set_server_state() - line 1261
  • set_tool_state() - line 1276
  • set_resource_state() - line 1291
  • set_prompt_state() - line 1306
  • set_gateway_state() - line 1321
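
One way to keep the five call sites consistent is to share a single tuple of allowed codes. The sketch below is standalone and only illustrates the idea: `STATE_CHANGE_ALLOWED_CODES` and `check_status` are hypothetical names that do not exist in locustfile.py, and the real `_validate_json_response` may behave differently.

```python
# Hypothetical shared constant for all state change tasks.
# 409 is included because concurrent toggles of the same entity are
# expected to conflict under load (optimistic locking on the server).
STATE_CHANGE_ALLOWED_CODES = (200, 403, 404, 409)


def check_status(status_code, allowed_codes=STATE_CHANGE_ALLOWED_CODES):
    """Return None if the status is acceptable, else an error message."""
    if status_code in allowed_codes:
        return None
    return f"Expected {list(allowed_codes)}, got {status_code}"


# With the old allow-list, a 409 is reported as a failure.
old_result = check_status(409, allowed_codes=(200, 403, 404))
# With 409 added, the same response passes.
new_result = check_status(409)
# Unexpected codes such as 500 are still reported.
server_error = check_status(500)

print(old_result)
print(new_result)
print(server_error)
```

Whether a shared constant or five literal lists is preferable is a style choice; the acceptance criteria below only require that all five functions accept 409.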

📋 Acceptance Criteria

  • Add 409 to allowed_codes for all 5 state change functions
  • Update comments to explain why 409 is acceptable (concurrent state changes)
  • Load test with 4000 users shows no false failures for state endpoints
  • Code passes make verify

📓 Additional Context

Load test metrics showing the issue:

Total Requests: 825,874
Total Failures: 1,000 (0.121%)
  - 409 Conflict on state changes: ~116 (11.6% of failures)

The 409 responses account for only about 0.014% of total requests. This is expected, healthy behavior under concurrent load, not a system failure.
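
The percentages above can be checked with a few lines of arithmetic. The inputs are the figures reported in this issue; the 116 count is approximate, so the derived values are too. The adjusted failure rate is an illustration of what the report would show if the 409 responses were no longer counted, not a measured result.

```python
total_requests = 825_874
total_failures = 1_000
conflict_failures = 116  # approximate count of 409s on state change endpoints

# Overall failure rate as reported by Locust.
failure_rate_pct = 100 * total_failures / total_requests

# Share of reported failures that are 409 conflicts.
conflict_share_pct = 100 * conflict_failures / total_failures

# 409 conflicts as a fraction of all requests.
conflict_rate_pct = 100 * conflict_failures / total_requests

# Failure rate if the 409s were treated as acceptable responses.
adjusted_failures = total_failures - conflict_failures
adjusted_rate_pct = 100 * adjusted_failures / total_requests

print(f"failure rate:        {failure_rate_pct:.3f}%")
print(f"409 share of fails:  {conflict_share_pct:.1f}%")
print(f"409 rate:            {conflict_rate_pct:.3f}%")
print(f"adjusted fail rate:  {adjusted_rate_pct:.3f}%")
```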


🧠 Environment Info

Key                 Value
──────────────────────────────────────────────────────────
Gateway version     main branch
Python version      3.12
Load test tool      Locust
Concurrent users    4000
Platform            Docker Compose (3 gateway replicas)

📎 Related

  • File: tests/loadtest/locustfile.py
  • Functions: set_server_state, set_tool_state, set_resource_state, set_prompt_state, set_gateway_state

Labels

  • COULD (P3: Nice-to-have features with minimal impact if left out; included if time permits)
  • testing (Testing: unit, e2e, manual, automated, etc.)
