Make proxy response timeout configurable#4063
Conversation
3a51b39 to
5e34bdd
Compare
5e34bdd to
0d267f7
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4063 +/- ##
==========================================
- Coverage 68.65% 68.62% -0.03%
==========================================
Files 454 458 +4
Lines 46051 46243 +192
==========================================
+ Hits 31616 31735 +119
- Misses 11994 12064 +70
- Partials 2441 2444 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0d267f7 to
6071d98
Compare
6071d98 to
04f948a
Compare
aponcedeleonch
left a comment
There was a problem hiding this comment.
Looks good, thanks for the contribution!
Suggestion: Per-MCP-Server Request Timeout (follow-up)
The global env var solves the immediate problem well. As a natural next step, it would be valuable to support a per-server --proxy-request-timeout flag on thv run, since different MCP servers can have very different response characteristics. The env var would remain as a global fallback.
This can be done as a separate follow-up — just leaving breadcrumbs on how it fits into the existing architecture.
Precedence
--proxy-request-timeout flag > TOOLHIVE_PROXY_REQUEST_TIMEOUT env var > 60s default
Implementation sketch
The pattern already exists for other per-server settings (--trust-proxy-headers, --endpoint-prefix). It touches ~7 files with small additions each:
pkg/runner/config.go — add field to RunConfig:
ProxyRequestTimeout time.Duration `json:"proxy_request_timeout,omitempty" yaml:"proxy_request_timeout,omitempty"`pkg/runner/config_builder.go — add builder option:
func WithProxyRequestTimeout(d time.Duration) RunConfigBuilderOption {
return func(c *RunConfig) { c.ProxyRequestTimeout = d }
}cmd/thv/app/run_flags.go — add CLI flag:
cmd.Flags().DurationVar(&f.ProxyRequestTimeout, "proxy-request-timeout", 0,
"Request timeout for the streamable HTTP proxy used by stdio servers (e.g. 2m, 120s). 0 uses global default.")pkg/transport/proxy/streamable/streamable_proxy.go — accept per-server value, fall back to env then default:
func resolveRequestTimeout(perServer time.Duration) time.Duration {
if perServer > 0 {
return perServer
}
if v := os.Getenv(proxyRequestTimeoutEnv); v != "" {
if d, err := time.ParseDuration(v); err == nil && d > 0 {
return d
}
}
return defaultRequestTimeout
}The remaining files (types/transport.go, runner.go, http.go) just propagate the value through the existing config plumbing.
🤖 Generated with Claude Code
04f948a to
0d28989
Compare
|
I rebased to check if the errors in the e2e tests were fixed by some recent merged PRs on |
0d28989 to
084cb2c
Compare
|
@reyortiz3 @aponcedeleonch Sorry for the approval reset — I had to push a fix to a test. The only change since your reviews:
-func TestNewHTTPProxyDefaultTimeout(t *testing.T) {
- t.Parallel()
+// Not parallel because t.Setenv is needed to clear any ambient
+// TOOLHIVE_PROXY_REQUEST_TIMEOUT from the test runner's environment.
+func TestNewHTTPProxyDefaultTimeout(t *testing.T) { //nolint:paralleltest
+ t.Setenv(proxyRequestTimeoutEnv, "")
proxy := NewHTTPProxy("localhost", 0, nil)
assert.Equal(t, defaultRequestTimeout, proxy.requestTimeout)
}No production code changes — just the one test. CI is running now. Would appreciate a re-approval when you get a chance. |
The streamable HTTP proxy used by stdio transport had a hardcoded 30-second timeout. Any MCP tool call exceeding 30 seconds returned a timeout error, even though the backend was still processing. Bump the default to 60s (matching the MCP SDK convention) and add TOOLHIVE_PROXY_REQUEST_TIMEOUT environment variable for users who need a different value. This follows the existing TOOLHIVE_REMOTE_HEALTHCHECKS pattern for env var overrides. Fixes stacklok#4061 Signed-off-by: Greg Katz <gkatz@indeed.com>
084cb2c to
99bdfbf
Compare
Signed-off-by: Greg Katz <gkatz@indeed.com>
Summary
TOOLHIVE_PROXY_RESPONSE_TIMEOUTenvironment variable for users who need a different value. Follows the existingTOOLHIVE_REMOTE_HEALTHCHECKSpattern for env var overrides.Fixes #4061
Type of change
Test plan
task test)task lint-fix)Built a custom stdio MCP server with a 35-second delay on tool calls.
Verified the fix in both directions:
TOOLHIVE_PROXY_RESPONSE_TIMEOUT=30s+ 35s toolTOOLHIVE_PROXY_RESPONSE_TIMEOUT=1us+ any toolChanges
pkg/transport/proxy/streamable/streamable_proxy.goresponseTimeoutfield, addresolveResponseTimeout(), update 3 usage sitespkg/transport/proxy/streamable/streamable_proxy_test.godocs/arch/03-transport-architecture.mdTOOLHIVE_PROXY_RESPONSE_TIMEOUTenv varDoes this introduce a user-facing change?
The default proxy response timeout increases from 30 seconds to 5 minutes. Users who were hitting 504 timeouts on slow MCP tools will no longer see them without any configuration change. Users who need a different timeout can set
TOOLHIVE_PROXY_RESPONSE_TIMEOUT(e.g.,10m,120s).Special notes for reviewers
pkg/transport/proxy/streamable/). The transparent proxy used by--transport sseand--transport streamable-httphas no such timeout.NewHTTPProxy) and stored as a struct field, avoiding repeatedos.Getenvcalls on every request.