Skip to content

fix(mcp): validate remote URLs up-front with a clear error#18132

Closed
teknium1 wants to merge 1 commit into
mainfrom
opencode-port/invalid-mcp-url
Closed

fix(mcp): validate remote URLs up-front with a clear error#18132
teknium1 wants to merge 1 commit into
mainfrom
opencode-port/invalid-mcp-url

Conversation

@teknium1

@teknium1 teknium1 commented May 1, 2026

Copy link
Copy Markdown
Contributor

Summary

A typo in config.yaml (missing scheme, wrong scheme, empty string) now fails fast with a clear error naming the offending MCP server, instead of burning _MAX_INITIAL_CONNECT_RETRIES × exponential backoff inside the transport layer with an opaque exception.

Ported from anomalyco/opencode#25019 ("fix: handle invalid mcp urls").

Root cause

tools/mcp_tool.py _run_http() passes config["url"] verbatim to httpx.URL(url) and the streamable-http client. A malformed URL raises a generic exception deep in the transport layer which goes through the reconnect-backoff loop (while True: try: ... except Exception:) — so a bad URL costs about a minute of pointless retries with a confusing error before the server is finally marked failed.

Changes

File What
tools/mcp_tool.py New InvalidMcpUrlError(ValueError) exception. New _validate_remote_mcp_url(server_name, url) helper (scheme must be http/https, host must be non-empty, must be a string). Wired into MCPServerTask.run() before the retry loop — malformed URLs short-circuit immediately with _error set and _ready fired so start() re-raises the clean error.
tests/tools/test_mcp_invalid_url.py 21 new tests: valid URL acceptance (incl. IPv6, ports, query), every rejection path, InvalidMcpUrlError is a ValueError, error message names the server.

Validation

Before After
Bad URL (not-a-valid-url) ~60s of silent retries, opaque error 0.001s, Invalid MCP URL for 'broken-example': scheme must be http or https…
Valid URL Works Works (unchanged)
stdio server (command, no url) Skipped Still skipped (_is_http() is False)
  • scripts/run_tests.sh tests/tools/test_mcp_invalid_url.py → 21/21 passed.
  • scripts/run_tests.sh tests/tools/test_mcp_stability.py tests/tools/test_mcp_tool.py → 198/198 passed (no regression in existing MCP tests).
  • Live E2E (subprocess import with fresh HERMES_HOME): bad URL → InvalidMcpUrlError in <1ms.

Architectural note

OpenCode's fix returned { client: undefined, status: { status: "failed", error: "Invalid MCP URL…" } }. Hermes uses the _error / _ready pattern for the same outcome — the caller start() awaits _ready and re-raises _error, so hermes mcp list shows the server as failed with the clear message. No new public API surface.

Source

anomalyco/opencode#25019

Port from anomalyco/opencode#25019 ("fix: handle invalid mcp urls").

Previously: a typo in `config.yaml` (missing scheme, wrong scheme,
empty string, non-string value) slipped past `_is_http()` and hit
`httpx.URL(url)` or `streamablehttp_client(url, ...)` deep in the
transport layer. That raised a generic exception which went through
the reconnect-backoff loop, so a bad URL caused _MAX_INITIAL_CONNECT_RETRIES
attempts with doubling backoff — about a minute of pointless retries
plus an opaque error — before the server was marked failed.

Now: we validate the URL once, at the top of `run()`, before
entering the retry loop. A malformed URL raises `InvalidMcpUrlError`
(a `ValueError` subclass) with a message that names the offending
server and explains exactly what was wrong. `_ready` is set and
`_error` is populated, so `start()` re-raises and the server shows
up as failed in `hermes mcp list` without any backoff burn.

Validation rules:
- Must be a string (rejects None, dict, int)
- Must be non-empty (rejects '' and whitespace-only)
- Scheme must be http or https (rejects file://, ws://, stdio://)
- Must have a non-empty host (rejects http:///, http://:8080)

Tests (21 new cases in tests/tools/test_mcp_invalid_url.py):
- TestValidUrlsAccepted: http, https, IPv6, ports, paths, query strings
- TestInvalidUrlsRejected: every rejection path above + clear error text
- TestErrorIsValueError: downstream code catching ValueError still works

E2E verified: a misconfigured server with `url: not-a-valid-url`
now fails in <0.001s with the clear error, instead of minutes of retries.

Doesn't touch stdio servers (they use `command`, not `url`) — the
validator only fires when `_is_http()` returns True.
@alt-glitch alt-glitch added type/bug Something isn't working tool/mcp MCP client and OAuth P3 Low — cosmetic, nice to have labels May 1, 2026
@teknium1

Copy link
Copy Markdown
Contributor Author

Closing in favor of the merged salvage PR. Conflict resolution: main added MCP image-block helpers between this PR's branch and now, at the same insertion point; both blocks kept since they're independent concerns. 21 new tests pass, 193 regression tests pass, E2E confirms <0.01ms rejection vs the ~60s of pointless retries the old path caused.

@teknium1 teknium1 closed this May 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have tool/mcp MCP client and OAuth type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants