Fix Orphaned Gateway Registration Bug along with Isolating Structured log Session and Handling Flush Errors#3001
Conversation
6a41c5e to
a2e8a6e
Compare
a2e8a6e to
9aa2761
Compare
…gger session - Add db.rollback() to exception handlers in gateway_service.py to prevent orphaned gateway records when registration fails after flush - Isolate structured logger DB persistence using fresh_db_session() to avoid accidentally committing caller's dirty session state - Propagate flush exceptions in before_commit_handler instead of swallowing them so commit failures trigger proper rollback - Remove db parameter from structured_logger.log() signature since the logger now manages its own independent session - Update tests to verify new session isolation and exception propagation Closes #2987 Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com> Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
9aa2761 to
5c48bb6
Compare
crivetimihai
left a comment
There was a problem hiding this comment.
Review: Approved
The core fix is correct and well-reasoned. The root cause chain — missing db.rollback() → structured logger committing caller's dirty session → before_commit_handler swallowing flush errors — is accurately identified and all three links are properly addressed.
Changes made during rebase
Squashed the 4 original commits into a single clean commit preserving original author, and applied the following fixes:
1. Completed db=db cleanup (12 remaining call sites)
The original PR removed db=db from structured_logger.log() callers in several files but missed 12 call sites across 5 files. Since log() now accepts **kwargs, leftover db=db would leak a SQLAlchemy Session object into the log entry dict via entry.update(kwargs). Removed from:
mcpgateway/admin.py— 3 calls (plugin marketplace error handlers)mcpgateway/services/gateway_service.py— 4 calls (get_gateway,delete_gatewaysuccess + exception handlers)mcpgateway/services/prompt_service.py— 2 calls (bulk registration,get_prompt)mcpgateway/services/resource_service.py— 2 calls (bulk registration,get_resource)mcpgateway/services/tool_service.py— 1 call (get_tool)
Verified zero remaining via AST scan.
2. Fixed stale rollback test + added missing rollback assertions
test_register_gateway_exception_rollback: Updated stale comment ("method doesn't actually call rollback") and changedassert_called()→assert_called_once()for strictness.- Added
test_db.rollback = Mock()setup andtest_db.rollback.assert_called_once()assertions totest_register_gateway_value_error,test_register_gateway_runtime_error, andtest_register_gateway_integrity_error— these exercise the core bug fix paths.
3. Removed vacuous test assertions
Two tests (test_logging_service_comprehensive.py, test_structured_logger.py) created caller_db = MagicMock() but never passed it to _persist_to_database(), making caller_db.add.assert_not_called() always-true. Replaced with mock_fresh.assert_called_once() to validate fresh session usage.
What was verified
- All 482+ affected tests pass (gateway_service, structured_logger, logging_service_comprehensive, test_db)
- AST scan confirms zero
structured_loggercalls withdb=dbremain fresh_db_session()correctly auto-commits viaget_db()context manager — no explicitcommit()neededbefore_commit_handlerexception propagation is safe —get_db()has proper nested try/except/finally with rollback + invalidate fallback
…gger session (#3001) - Add db.rollback() to exception handlers in gateway_service.py to prevent orphaned gateway records when registration fails after flush - Isolate structured logger DB persistence using fresh_db_session() to avoid accidentally committing caller's dirty session state - Propagate flush exceptions in before_commit_handler instead of swallowing them so commit failures trigger proper rollback - Remove db parameter from structured_logger.log() signature since the logger now manages its own independent session - Update tests to verify new session isolation and exception propagation Closes #2987 Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com> Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> Signed-off-by: Vishu Bhatnagar <vishu.bhatnagar@ibm.com>
…gger session (#3001) - Add db.rollback() to exception handlers in gateway_service.py to prevent orphaned gateway records when registration fails after flush - Isolate structured logger DB persistence using fresh_db_session() to avoid accidentally committing caller's dirty session state - Propagate flush exceptions in before_commit_handler instead of swallowing them so commit failures trigger proper rollback - Remove db parameter from structured_logger.log() signature since the logger now manages its own independent session - Update tests to verify new session isolation and exception propagation Closes #2987 Signed-off-by: Keval Mahajan <mahajankeval23@gmail.com> Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
🐛 Bug-fix PR
Closes #2987
📌 Summary
Fixes a bug where gateway registration failures (e.g., tool schema validation errors) leave orphaned gateway records in the database, blocking subsequent registration attempts with the same name.
The root cause is missing db.rollback() calls in exception handlers, along with two other contributing issues:
Key changes:
🔁 Reproduction Steps
🐞 Root Cause
Three issues combine to cause the bug:
Primary cause: Missing
db.rollback()in exception handlers:In
register_gateway(), afterdb.add(db_gateway)anddb.flush()succeed (persisting the gateway row), tool validation fails duringdb.flush()of tool records. The except* handlers catch the error but never calldb.rollback(), leaving the gateway object in the session's dirty state.Contributing factor 1:
before_commit_handlerswallows exceptions:db.py's
before_commit_handlerwrapssession.flush()in a bare except Exception: pass, silently discarding any flush failure during commit. This means even whendb.commit()is eventually called, the failed flush doesn't cause the commit to fail.Contributing factor 2: Structured logger commits caller's session:
structured_logger._persist_to_database()was called with the caller's db session and calleddb.add(log_entry)+db.commit()on it. This inadvertently committed the orphaned gateway record along with the log entry.The chain:
db.flush()fails on tools -> exception handler skips rollback ->structured_logger.log(db=db)callsdb.commit()on same session -> before_commit_handler swallows re-flush failure -> gateway record persisted without tools -> name permanently reserved.💡 Fix Description
Added
db.rollback()to all exception handlers (gateway_service.py)Added explicit
db.rollback()calls before any logging or re-raising in all exception handlers across 4 methods:update_gateway- 3 exception handlers (name conflict, not found, integrity error)set_gateway_state- 1 exception handler (permission error)fetch_tools_after_oauth- 2 exception handlers (connection error, general exception)db.rollback()is always safe to call - it's a no-op on a clean session with no pending changes.Isolated structured logger persistence (structured_logger.py)
Changed
_persist_to_database()to usefresh_db_session()instead of the caller's session. This ensures:fresh_db_sessionhandles commit/rollback/close)Propagate flush exceptions in before_commit_handler (db.py)
Removed the bare except Exception: pass from before_commit_handler. The handler now simply calls
session.flush()without catching exceptions, so flush failures properly cause the commit to fail, triggering rollback inget_db().Updated tests
test_structured_logger.py- Updated 7 tests to mock fresh_db_session instead of mock_db; replaced 2 tests with fresh-session-aware equivalentstest_logging_service_comprehensive.py- Updated 2 tests for fresh_db_session patterntest_db.py- Updated before_commit_handler test to verify exception propagation instead of swallowing🧪 Verification
make lintmake testmake coverage📐 MCP Compliance (if relevant)
✅ Checklist
make black isort pre-commit)