[ENHANCEMENT][OBSERVABILITY]: Preserve timeout error message through ExceptionGroup unwrapping in tool invocations #2782

@crivetimihai

Description

Context

Reported in #2781 — when a tool invocation times out via StreamableHTTP (or SSE) transport, the error response returned to the client contains an empty message:

Tool invocation failed:

The descriptive timeout message ("Tool invocation timed out after 60s") is generated but lost during exception propagation.

Root Cause

The timeout handling in tool_service.py correctly raises ToolTimeoutError("Tool invocation timed out after {effective_timeout}s") at line 3364. However, this raise occurs inside nested async with context managers (streamablehttp_client, ClientSession). During __aexit__ cleanup, the MCP SDK's internal TaskGroup may raise additional exceptions from cancelled tasks, causing Python 3.11+ to wrap everything in a BaseExceptionGroup.

In the outer invoke_tool handler:

  1. except ToolTimeoutError (line 3577) does not match a BaseExceptionGroup containing a ToolTimeoutError
  2. Falls through to except BaseException (line 3586)
  3. Root cause extraction (lines 3589-3592) unwraps the group but finds the original bare TimeoutError (no message) rather than the descriptive ToolTimeoutError
  4. str(TimeoutError()) returns "" → empty error message at line 3614

Proposed Enhancement

Improve the except BaseException handler to detect timeout root causes and provide a descriptive fallback message:

except BaseException as e:
    # Walk down nested exception groups to the first leaf exception.
    # The loop is a no-op when e is not a group, so no outer isinstance check is needed.
    root_cause = e
    while isinstance(root_cause, BaseExceptionGroup) and root_cause.exceptions:
        root_cause = root_cause.exceptions[0]
    error_message = str(root_cause)
    # Preserve timeout context when the message is lost during ExceptionGroup wrapping
    if not error_message and isinstance(root_cause, (TimeoutError, asyncio.TimeoutError)):
        error_message = f"Tool invocation timed out after {effective_timeout}s"

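Additionally, consider using except* (the Python 3.11+ syntax for matching exceptions inside an ExceptionGroup) to catch ToolTimeoutError even when it is wrapped in an ExceptionGroup. Note that a single try statement cannot mix except and except* clauses, so the handler would need to be restructured to adopt it.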

Affected Code

  • mcpgateway/services/tool_service.py — lines 3586-3614 (invoke_tool outer exception handler)
  • Same pattern exists in the SSE transport path (lines ~3176-3217) and REST path (lines ~3900-3939)

Impact

Low severity: timeout enforcement works correctly; only the error message returned to clients is empty. Without checking server logs, users cannot easily tell that a timeout occurred or what the timeout value was.

Metadata

Assignees

No one assigned

    Labels

      • COULD (P3: Nice-to-have features with minimal impact if left out; included if time permits)
      • enhancement (New feature or request)
      • observability (Observability, logging, monitoring)
      • python (Python / backend development (FastAPI))
