fix(agent): honor HERMES_TLS_MAX_VERSION to cap provider TLS handshakes #44392
AIalliAI wants to merge 4 commits into
Some CDN edges and middleboxes accept TLS 1.2 handshakes but kill TLS 1.3 ClientHellos, surfacing as [SSL: UNEXPECTED_EOF_WHILE_READING] ~15s into every request while curl (OS TLS stack) works fine. NousResearch#44365 hit this on Windows desktop against api.deepseek.com: a TLS 1.2-only handshake connects in 0.33s, TLS 1.3 dies after 15s.

Setting HERMES_TLS_MAX_VERSION=1.2 now caps the handshake on the primary chat client. The ssl context is applied to the keepalive HTTPTransport directly (httpx ignores client-level verify when an explicit transport is passed) and to the Client so internally-built proxy mounts inherit the same cap. The context honors the existing CA-bundle overrides (HERMES_CA_BUNDLE > REQUESTS_CA_BUNDLE > SSL_CERT_FILE).

1.0/1.1 are deliberately rejected; invalid values log a warning and fall back to defaults.

Fixes NousResearch#44365

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
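A minimal sketch of the capped context described in this commit message, using only the standard library. The helper names `resolve_ca_bundle` and `build_capped_context` are hypothetical and are not taken from the PR; only the three environment variable names and their precedence come from the text above.

```python
import os
import ssl

# Precedence as stated in the commit message: the first non-empty variable wins.
CA_BUNDLE_ENV_VARS = ("HERMES_CA_BUNDLE", "REQUESTS_CA_BUNDLE", "SSL_CERT_FILE")


def resolve_ca_bundle(environ=os.environ):
    """Return the first CA-bundle path set in the environment, or None.

    Hypothetical helper: the PR's own resolution code is not shown here.
    """
    for name in CA_BUNDLE_ENV_VARS:
        path = environ.get(name, "").strip()
        if path:
            return path
    return None


def build_capped_context(max_version, cafile=None):
    """Build a verifying client context that never negotiates above max_version."""
    # With cafile=None the system trust store is loaded instead.
    context = ssl.create_default_context(cafile=cafile)
    context.maximum_version = max_version
    return context
```

Since httpx accepts an `ssl.SSLContext` as its `verify` argument, a context built this way could be handed to both the transport and the client, which is the arrangement the commit message describes.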
✅ Code Review: Clean
Reviewed the full diff (3 files: What I verified:
No issues found. LGTM.
…onfig.yaml

Per the AGENTS.md contribution rubric, behavioral settings belong in config.yaml, not new HERMES_* env vars. The user-facing knob is now network.tls_max_version (sibling of network.force_ipv4, in the same "connectivity workarounds" section), bridged onto the internal HERMES_TLS_MAX_VERSION env var at process startup by hermes_constants.apply_tls_max_version(), following the established gateway.strict -> HERMES_MEDIA_DELIVERY_STRICT bridge pattern.

The env var remains the mechanism (agent/process_bootstrap has no config access at client-build time, and spawned agent subprocesses must inherit the cap), and an explicitly exported value still wins over config.yaml for one-off shell overrides.

Bridged in both entrypoints that already apply network.force_ipv4: the hermes_cli/main.py early raw-yaml block (covers CLI, desktop dashboard spawns, TUI gateway) and the gateway/run.py bootstrap.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
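The config-to-environment bridge described above might look roughly like the following. This is a sketch under stated assumptions, not the PR's code: `bridge_tls_max_version` is a hypothetical stand-in for `hermes_constants.apply_tls_max_version()`, whose real signature is not shown in this thread, and treating a blank exported value as "unset" is a guess.

```python
import os

ENV_VAR = "HERMES_TLS_MAX_VERSION"


def bridge_tls_max_version(config_value, environ=os.environ):
    """Copy network.tls_max_version onto the env var unless one is exported.

    Returns True when the environment was modified. Hypothetical helper.
    """
    # An explicitly exported value wins over config.yaml.
    # Assumption: a blank exported value counts as "not exported".
    if environ.get(ENV_VAR, "").strip():
        return False
    if config_value is None:
        return False
    # YAML parses an unquoted 1.2 as a float, so normalise to a string first.
    value = str(config_value).strip()
    if not value:
        return False  # empty string means "leave the OpenSSL default alone"
    environ[ENV_VAR] = value
    return True
```

Writing into the process environment rather than passing the value around is what lets spawned subprocesses inherit the cap without reading config.yaml themselves.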
Reworked in 1cb8df1 per the AGENTS.md contribution rubric ("behavioral settings go in config.yaml, not new
check-attribution flagged the email introduced by this branch's 2026-06-12 follow-up commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
# Conflicts: # scripts/release.py
Requesting maintainer review — this is ready to land from my side. Just merge-synced with current main (the only conflict was a trivial AUTHOR_MAP keep-both in scripts/release.py); the PR's touched test files pass locally on the merged head. Standalone fork CI is pending first-run approval here; the rollup branch in #44061 carrying this session's batch is fully green on upstream CI.
Summary
#44365 reports intermittent DeepSeek connection failures on Windows desktop: the bundled Python/OpenSSL dies with [SSL: UNEXPECTED_EOF_WHILE_READING] ~15s into every TLS 1.3 handshake against api.deepseek.com, while a TLS 1.2-only handshake succeeds in 0.33s (and curl/PowerShell work because they use the OS TLS stack, not OpenSSL). This is a known class of CDN-edge/middlebox behavior: the server (or something in the path) accepts TLS 1.2 ClientHellos but kills TLS 1.3 ones.

This adds the escape hatch the issue asks for, as a config.yaml setting: network.tls_max_version.
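To check whether a particular network path shows the symptom reported in the issue, one could time a handshake with and without a TLS 1.2 cap. The probe below is an illustrative sketch using only the standard library; `make_context` and `time_handshake` are hypothetical names, and the measured time includes the TCP connect as well as the TLS handshake.

```python
import socket
import ssl
import time


def make_context(max_version=None):
    """Default verifying context, optionally capped at max_version."""
    context = ssl.create_default_context()
    if max_version is not None:
        context.maximum_version = max_version
    return context


def time_handshake(host, max_version=None, port=443, timeout=30.0):
    """Return (negotiated_protocol, seconds) for one connection to host.

    Raises ssl.SSLError or OSError if the connection or handshake fails.
    """
    context = make_context(max_version)
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake before returning.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), time.monotonic() - start
```

On an affected machine, `time_handshake("api.deepseek.com")` would be expected to fail after a long wait, while `time_handshake("api.deepseek.com", ssl.TLSVersion.TLSv1_2)` would return quickly; on an unaffected path both calls should succeed.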
Implementation
- hermes_cli/config.py: new network.tls_max_version key (default "" = OpenSSL default), sibling of network.force_ipv4 in the existing "connectivity workarounds" section. Per the AGENTS.md contribution rubric, the user-facing surface is config.yaml; the env var below is an internal bridge.
- hermes_constants.py: apply_tls_max_version() bridges the config value onto the internal HERMES_TLS_MAX_VERSION env var (same pattern as gateway.strict → HERMES_MEDIA_DELIVERY_STRICT). The env-var hop is needed because agent/process_bootstrap.py has no config access at HTTP-client build time, and spawned agent subprocesses must inherit the cap. An explicitly exported env var wins over config.yaml, so one-off shell overrides keep working.
- Bridged in both entrypoints that already apply network.force_ipv4: the hermes_cli/main.py early raw-yaml block (covers CLI, desktop dashboard spawns, TUI gateway; no extra config.yaml read) and the gateway/run.py bootstrap.
- agent/process_bootstrap.py: _get_tls_ssl_context() parses the value (accepts 1.2/1.3, optional tls/tlsv prefix, case-insensitive; yaml floats are fine) and builds an ssl.SSLContext with maximum_version capped. It honors the CLI's existing CA-bundle override convention (HERMES_CA_BUNDLE > REQUESTS_CA_BUNDLE > SSL_CERT_FILE, same precedence as _resolve_requests_verify in agent/model_metadata.py). Unset/invalid values fall back to httpx defaults with a logged warning; 1.0/1.1 are deliberately rejected, because the knob exists to dodge broken TLS 1.3 paths, not to enable deprecated protocols.
- run_agent.py _build_keepalive_http_client: passes the context to both the keepalive HTTPTransport (httpx ignores client-level verify when an explicit transport is passed) and the httpx.Client (so the proxy mount built internally from proxy= inherits the same cap; otherwise proxied users would silently keep the broken default).

Tests
tests/run_agent/test_keepalive_tls_max_version.py (12 tests): parsing (unset/blank, prefixes, invalid + 1.0/1.1 rejection), integration pins that the capped context lands on the transport pool and on the HTTPProxy mount when HTTPS_PROXY is set, that the default path keeps MAXIMUM_SUPPORTED, and the config bridge (sets the env var, never overrides an explicitly exported one, no-op on empty, handles yaml-float 1.2 and whitespace). Adjacent suites (test_create_openai_client_proxy_env, test_openai_client_lifecycle, test_ipv4_preference, hermes_cli/test_config.py) all pass.

Fixes #44365
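The parsing rules listed under Implementation (1.2/1.3 only, optional tls/tlsv prefix, case-insensitive, yaml floats accepted, 1.0/1.1 rejected) can be sketched as follows. `parse_tls_max_version` is a hypothetical name and this is not the body of `_get_tls_ssl_context()`; the exact warning text is likewise an assumption.

```python
import logging
import re
import ssl

logger = logging.getLogger(__name__)

# Only these two values are accepted; 1.0 and 1.1 are rejected on purpose.
_ALLOWED = {"1.2": ssl.TLSVersion.TLSv1_2, "1.3": ssl.TLSVersion.TLSv1_3}


def parse_tls_max_version(raw):
    """Map a user-supplied value to an ssl.TLSVersion, or None for "no cap"."""
    if raw is None:
        return None
    # str() also covers a YAML float such as 1.2.
    text = str(raw).strip().lower()
    if not text:
        return None
    # Drop an optional "tls" or "tlsv" prefix, e.g. "TLSv1.2" -> "1.2".
    text = re.sub(r"^tlsv?", "", text)
    version = _ALLOWED.get(text)
    if version is None:
        logger.warning("Ignoring unsupported TLS max version %r; using defaults", raw)
    return version
```

Returning `None` for both unset and invalid input keeps the caller simple: it only builds a capped context when a version comes back, and otherwise leaves the HTTP client on its defaults.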
🤖 Generated with Claude Code