fix(agent): honor configured proxy in OpenAI client path (#11609) #11733

Open
fangbinwei wants to merge 1 commit into NousResearch:main from fangbinwei:fix/11609-keepalive-proxy-mounts

@fangbinwei

What does this PR do?

Fixes #11609, where HTTPS_PROXY / HTTP_PROXY, the macOS SystemConfiguration proxy, and the Windows registry proxy were silently ignored after #11277 landed. The custom httpx.HTTPTransport injected for TCP keepalives (#10324) made httpx skip env-proxy mount construction, so requests went directly to the destination host regardless of proxy configuration. That broke WSL users, users behind corporate proxies, and users routing outbound traffic through local proxies.

This PR detects any configured proxy via urllib.request.getproxies() — the same function httpx's trust_env=True path uses internally — and skips the keepalive injection in that case, letting the SDK build a default httpx.Client with full trust_env proxy support, including system-level proxy configuration on macOS and Windows.
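The detection and gating described above could look roughly like the sketch below. It is not the PR's actual code: `has_proxy_configured` and `client_kwargs` are hypothetical module-level stand-ins for `AIAgent._has_proxy_configured()` and the client construction in run_agent.py, `make_keepalive_client` is a placeholder factory, and skipping the `no` key (which `getproxies()` derives from `NO_PROXY`) is an assumption of this sketch.

```python
import urllib.request


def has_proxy_configured() -> bool:
    """Return True if any proxy URL is configured for this process.

    Hypothetical stand-in for AIAgent._has_proxy_configured().
    """
    # getproxies() reads *_proxy environment variables and, where available,
    # macOS SystemConfiguration or Windows registry proxy settings.
    proxies = urllib.request.getproxies()
    # Assumption: the "no" entry comes from NO_PROXY and is not itself a proxy.
    return any(url for scheme, url in proxies.items() if scheme != "no")


def client_kwargs(make_keepalive_client):
    """Build extra keyword arguments for the SDK client (illustrative only)."""
    if has_proxy_configured():
        # Proxy path: inject nothing, so the SDK builds its default
        # httpx.Client and httpx resolves the proxy itself.
        return {}
    # Direct path: keep injecting the keepalive-enabled client.
    return {"http_client": make_keepalive_client()}
```

The point of the gate is that the proxy path passes no custom client at all, which leaves proxy resolution entirely to httpx.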

The proxy path intentionally drops the keepalive injection. Re-implementing httpx's proxy resolution by hand (env vars, NO_PROXY with IPv6 / CIDR / scheme-qualified entries, macOS SystemConfiguration, Windows registry) proved too fragile. In proxy mode the httpx socket is only a loopback/LAN hop to the local proxy process, and httpx's read timeout still fires if that side stalls. Keepalive is still applied on the direct-connect path, so #10324 stays addressed there.
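For context, the keepalive that stays on the direct-connect path amounts to a handful of socket options chosen per platform. The sketch below is an illustration, not the code in run_agent.py: the helper name `keepalive_socket_options` and the default timings are assumptions.

```python
import socket


def keepalive_socket_options(idle=30, interval=10, count=3):
    """Return (level, option, value) tuples that enable TCP keepalive.

    Hypothetical helper; the timing defaults are illustrative only.
    """
    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    # Idle time before the first probe: Linux exposes TCP_KEEPIDLE,
    # macOS exposes the same setting as TCP_KEEPALIVE.
    if hasattr(socket, "TCP_KEEPIDLE"):
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle))
    elif hasattr(socket, "TCP_KEEPALIVE"):
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, idle))
    # Probe interval and probe count are only set where the platform has them.
    if hasattr(socket, "TCP_KEEPINTVL"):
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval))
    if hasattr(socket, "TCP_KEEPCNT"):
        options.append((socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count))
    return options
```

In httpx versions that accept it, a list like this can be handed to `httpx.HTTPTransport(socket_options=...)`. Passing a transport that way is the kind of injection that stops httpx from building its own proxy mounts.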

This PR also widens _force_close_tcp_sockets to walk both _transport and _mounts. Once the proxied client path is active, live sockets sit under the mount pools rather than the default transport, so the existing CLOSE-WAIT cleanup would otherwise silently miss them on every client rebuild.
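A minimal sketch of the "walk both _transport and _mounts" idea follows. It stops at the transport level and does not reach into connection pools, since those are private httpx/httpcore internals that can change between versions. `iter_client_transports` is a hypothetical name, not the PR's `_iter_client_sockets`, and the example uses plain stand-in objects instead of a real httpx.Client.

```python
from types import SimpleNamespace


def iter_client_transports(http_client):
    """Yield each distinct transport on a client: default first, then mounts.

    Hypothetical helper. _transport and _mounts are private attributes, so
    both are read defensively with getattr.
    """
    seen = set()
    default = getattr(http_client, "_transport", None)
    mounts = getattr(http_client, "_mounts", None) or {}
    for transport in [default, *mounts.values()]:
        # A mount value can be None (no transport for that pattern), and one
        # transport object may be mounted under several URL patterns.
        if transport is None or id(transport) in seen:
            continue
        seen.add(id(transport))
        yield transport


# Stand-ins for a proxied client: one default transport, one proxy transport
# mounted for two schemes, and one pattern with no transport.
direct, proxy = object(), object()
client = SimpleNamespace(
    _transport=direct,
    _mounts={"http://": proxy, "https://": proxy, "all://localhost": None},
)
transports = list(iter_client_transports(client))
```

A cleanup routine built on such an iterator visits the proxy transport as well as the default one, which is what the widened _force_close_tcp_sockets needs.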

Related Issue

Fixes #11609

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✅ Tests (adding or improving test coverage)

Changes Made

  • run_agent.py:
    • add AIAgent._has_proxy_configured() — detects configured proxy via urllib.request.getproxies(), matching httpx trust_env semantics
    • gate the keepalive httpx.Client injection on not _has_proxy_configured() so proxied setups delegate to httpx trust_env
    • add AIAgent._iter_client_sockets() helper walking both _transport and _mounts
    • refactor _force_close_tcp_sockets and _cleanup_dead_connections to use the new iterator
  • tests/run_agent/test_create_openai_client_reuse.py:
    • 7 new regression tests, alongside the existing client-reuse guards, covering proxy-env / each-proxy-key / direct-path-keepalive / _has_proxy_configured / system-proxy / mount iteration / mount force-close
    • hermetic proxy stubbing via urllib.request.getproxies to insulate tests from the developer's machine config
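The hermetic stubbing mentioned in the last bullet can be sketched as a small context manager. This is an assumption about the shape of such a helper, not the test file's actual code, and `stub_proxies` is a hypothetical name.

```python
import contextlib
import urllib.request
from unittest import mock


@contextlib.contextmanager
def stub_proxies(proxies):
    """Make urllib.request.getproxies() return `proxies` inside the block.

    Patching the function itself, rather than setting env vars, also hides
    any system-level proxy configured on the developer's machine.
    """
    with mock.patch.object(urllib.request, "getproxies", return_value=dict(proxies)):
        yield


# Inside the block, code that calls urllib.request.getproxies() sees only
# the stubbed mapping.
with stub_proxies({"https": "http://127.0.0.1:8080"}):
    seen_inside = urllib.request.getproxies()
```

Note that this only affects callers that look up `urllib.request.getproxies` at call time; a module that did `from urllib.request import getproxies` at import would need to be patched at its own name.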

How to Test

Scoped tests that exercise the change:

source venv/bin/activate
python -m pytest tests/run_agent/test_create_openai_client_reuse.py -q

Expected: 9 passed (no failures, no skips).

End-to-end reproduction:

  1. Run hermes-agent from a shell with HTTPS_PROXY=http://127.0.0.1:<port> pointing at a local proxy, or with a macOS System Settings → Network proxy configured.
  2. Before this PR: requests may bypass the proxy and direct-connect to the destination host, which can surface as connection failures, timeouts, or unexpected routing behavior depending on the network.
  3. After this PR: lsof -i -nP | grep python shows outbound sockets terminating at the proxy address, not the destination host.

Note: pytest tests/ -q currently reports pre-existing platform/environment failures on upstream main (Discord / Matrix / WSL / file-staleness suites); the scoped regression tests for this change pass cleanly.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: macOS 26.4

Documentation & Housekeeping

  • I've updated relevant documentation — N/A (no user-facing config change)
  • I've updated cli-config.yaml.example — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md — N/A
  • I've considered cross-platform impact — getproxies() transparently reads env vars on all platforms, macOS SystemConfiguration, and Windows registry; TCP_KEEPIDLE / TCP_KEEPALIVE platform branch already in place
  • I've updated tool descriptions/schemas — N/A

@trevorgordon981

LGTM. Thorough fix for #11609. The key design decision is sound: when a proxy is configured, let httpx own the entire proxy resolution (env vars, NO_PROXY, macOS SystemConfiguration, Windows registry) via trust_env=True rather than trying to replicate that logic around an injected transport.

Using urllib.request.getproxies() for detection aligns with httpx's internal resolution path, which avoids the "we detect from env but httpx reads system config too" mismatch.

The _iter_client_sockets refactor to cover proxy mounts is a necessary follow-on: without it, _force_close_tcp_sockets and _cleanup_dead_connections would miss sockets on the httpx-to-proxy leg, leaking CLOSE-WAIT on proxied setups.

Tests are extensive and pin the tradeoff explicitly (keepalive on direct path, delegation on proxy path). The _stub_proxies helper that patches getproxies() directly rather than env vars is the right approach for hermetic tests.

@fangbinwei fangbinwei force-pushed the fix/11609-keepalive-proxy-mounts branch from a0af18a to 5561df4 Compare April 17, 2026 20:36
…h#11609)

Injecting a custom httpx.HTTPTransport for TCP keepalives (NousResearch#10324)
made httpx skip env-proxy mount construction, so HTTPS_PROXY / system
proxy settings were silently ignored — requests went direct to the
destination host instead of the user's proxy.

Detect any configured proxy via urllib.request.getproxies() (the same
function httpx's trust_env path uses internally) and skip the
keepalive injection in that case, letting the SDK build a default
httpx.Client with full trust_env proxy support. The proxy path
intentionally drops the keepalive injection; that socket is a
loopback/LAN hop to the local proxy process, and httpx's read timeout
remains as a dead-peer backstop. Keepalive is still applied on the
direct-connect path, so NousResearch#10324 remains addressed there.

Also widen _force_close_tcp_sockets to walk both _transport and
_mounts. When the proxied client path activates, live sockets live
under mount pools; without this, CLOSE-WAIT cleanup silently misses
them.
@fangbinwei fangbinwei force-pushed the fix/11609-keepalive-proxy-mounts branch from 5561df4 to 6f2ebbb Compare April 17, 2026 20:40
Zijie-Tian added a commit to Zijie-Tian/hermes-agent that referenced this pull request Apr 18, 2026
The main OpenAI client path was injecting a custom httpx transport that
bypassed proxy-aware defaults, while the raw Codex path ignored
runtime-resolved base URLs and openai-codex callers could still pin
chat_completions. This packages the minimal local fixes and regression
tests needed to make Hermes reliably use Codex gpt-5.4 behind the
user's configured proxy.

Constraint: Must preserve unrelated local edits in the repo
Constraint: Must keep Hermes on the Codex Responses path for openai-codex
Rejected: Commit workspace-level ~/.hermes/.env changes | outside repo scope
Rejected: Wait for upstream PR merges | user needs a working fork branch now
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: If upstream merges NousResearch#11733/NousResearch#10044/NousResearch#5988, rebase this branch against their final versions before reusing it
Tested: pytest tests/run_agent/test_create_openai_client_reuse.py -q
Tested: pytest tests/run_agent/test_run_agent_codex_responses.py -q
Tested: pytest tests/agent/test_auxiliary_client.py -q
Tested: hermes chat --provider openai-codex -m gpt-5.4 -Q --max-turns 1 -q "Reply with exactly OK."
Tested: hermes chat -Q --max-turns 1 -q "Reply with exactly OK."
Not-tested: gateway / IM platform flows
Related: NousResearch#11609
Related: NousResearch#11733
Related: NousResearch#10044
Related: NousResearch#5988
@alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on Apr 24, 2026
Development

Successfully merging this pull request may close these issues.

[Bug] All API calls timeout on WSL with HTTP proxy — custom httpx transport bypasses proxy env vars in general agent path

3 participants