fix(codex): avoid custom keepalive transport on chatgpt backend by liuhao1024 · Pull Request #12953 · NousResearch/hermes-agent

liuhao1024 · 2026-04-20T09:55:34Z

What does this PR do?

This draft fixes a Codex-specific transport regression in _create_openai_client().

Hermes currently injects a custom httpx.HTTPTransport(socket_options=...) to enable TCP keepalives. That path is desirable for regular OpenAI-compatible endpoints, but it is not compatible with the ChatGPT Codex backend at https://chatgpt.com/backend-api/codex: the TLS handshake gets reset before the request reaches normal API validation.

This PR keeps the existing keepalive behavior for normal OpenAI-compatible endpoints, but skips custom transport injection for the Codex backend.

Closes #12952

Related Issue

N/A

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

See commit messages for detailed changes

How to Test

Run pytest tests/ -q — all tests should pass
Verify the specific scenario described above is resolved

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: macOS 26.4.1

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — or N/A
I've updated cli-config.yaml.example if I added/changed config keys — or N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture and workflows — or N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A

liuhao1024 · 2026-04-20T10:03:22Z

Maintainer note:

This is intentionally not a duplicate of the earlier transport regressions.

fix: always create fresh httpx.Client to prevent use-after-close (#10324 regression) #11033 / Regression: gateway reuses mutated OpenAI http_client kwargs and accumulates stale connections #11070 were about stale or reused injected clients/transports
fix(codex): preserve proxy env when creating OpenAI client #12008 was about proxy env handling when a custom transport is injected
this PR addresses the remaining Codex-specific incompatibility where chatgpt.com/backend-api/codex resets the TLS handshake when the keepalive transport is injected at all

So the fix here is narrower: keep the current keepalive path for normal OpenAI-compatible endpoints, but bypass it for the Codex backend only.

ratacat · 2026-04-25T18:37:45Z

Supportive +1 from a real cron incident on macOS/launchd using openai-codex.

This PR looks like the right shape to me: skip the custom keepalive transport entirely for the ChatGPT Codex backend rather than trying to make the socket-options path work for an endpoint that appears sensitive to it. In our incident, a Codex cron run appeared to stream final output successfully but then the cron runner never returned/saved/released scheduler state. Avoiding the custom transport for Codex is a cleaner mitigation than piling more special-case timeout handling onto that transport path.

Small elegance suggestions, non-blocking:

Consider extracting the Codex-backend test into a tiny helper, e.g. _is_chatgpt_codex_backend(provider, base_url), so the provider/base_url rule is named and reusable if other client construction paths need the same bypass later.
It may be worth adding one assertion that regular OpenAI-compatible endpoints still pass configured timeout/proxy behavior through when the custom client is injected. This PR correctly focuses on Codex, but the broader class of bugs here is “custom http_client subtly changes OpenAI SDK semantics,” so pinning timeout preservation alongside proxy preservation would make the transport boundary more robust.

Overall: I think this is the right fix for #12952 and likely addresses the provider-side cause of the cron wedge we saw.

Address review suggestion to centralize the Codex backend detection logic into _is_chatgpt_codex_backend() for reuse across client construction paths. Improves maintainability and readability. The helper checks both provider ('openai-codex') and base_url (https://chatgpt.com/backend-api/codex) to identify the ChatGPT Codex backend, which is incompatible with custom socket_options in HTTPTransport. Fixes: NousResearch#12952

Add regression test to ensure timeout is forwarded correctly when the custom keepalive http_client is injected for regular OpenAI endpoints. This addresses reviewer feedback that the broader class of bugs is "custom http_client subtly changes OpenAI SDK semantics" and pinning timeout preservation makes the transport boundary more robust. The test verifies that a configured timeout (300.0s) is passed through to the OpenAI client constructor even when http_client is injected, confirming that custom transport injection does not silently drop timeout configuration.

Problem: PR NousResearch#12953 conflicted with upstream main in the OpenAI client keepalive/proxy construction path and its regression tests. Resolution: keep main's base_url-aware NO_PROXY handling while skipping the custom keepalive http_client for the ChatGPT Codex backend; merge the regression coverage for regular OpenAI, Codex, NO_PROXY, and timeout behavior.

liuhao1024 · 2026-04-28T03:24:40Z

@ratacat thanks for the suggestions. I updated the PR to address both points:

Extracted _is_chatgpt_codex_backend(provider, base_url) and use it to gate the ChatGPT Codex keepalive bypass.
Added regression coverage to ensure regular OpenAI-compatible endpoints still preserve proxy behavior and configured timeout when the custom keepalive client is injected.

I also synced with upstream main and resolved the merge conflicts.

Local targeted verification:
scripts/run_tests.sh tests/run_agent/test_create_openai_client_proxy_env.py tests/run_agent/test_create_openai_client_kwargs_isolation.py tests/run_agent/test_create_openai_client_reuse.py

Result: 15 passed.

fix(codex): skip keepalive transport for chatgpt backend

aa69d0f

liuhao1024 mentioned this pull request Apr 20, 2026

Bug: custom keepalive transport breaks chatgpt codex backend #12952

Open

test(codex): pin keepalive builder bypass for chatgpt backend

4782d0a

liuhao1024 marked this pull request as ready for review April 20, 2026 10:03

liuhao1024 changed the title ~~fix(codex): skip keepalive transport for chatgpt backend~~ fix(codex): avoid custom keepalive transport on chatgpt backend Apr 20, 2026

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/openai OpenAI / Codex Responses API labels Apr 22, 2026

This was referenced Apr 25, 2026

feat(cron): run independent cron jobs in parallel with safe non-overlap #7158

Open

fix: update activity tracker during Codex Responses streaming #8027

Open

liuhao1024 added 2 commits April 26, 2026 06:14

liuhao1024 force-pushed the fix/codex-skip-keepalive-transport branch from 2f3bcae to c7c0cf8 Compare April 28, 2026 03:21

liuhao1024 closed this Apr 29, 2026

liuhao1024 reopened this Apr 29, 2026

liuhao1024 closed this Apr 29, 2026

liuhao1024 reopened this Apr 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(codex): avoid custom keepalive transport on chatgpt backend#12953

fix(codex): avoid custom keepalive transport on chatgpt backend#12953
liuhao1024 wants to merge 5 commits into
NousResearch:mainfrom
liuhao1024:fix/codex-skip-keepalive-transport

liuhao1024 commented Apr 20, 2026 •

edited

Loading

Uh oh!

liuhao1024 commented Apr 20, 2026

Uh oh!

ratacat commented Apr 25, 2026

Uh oh!

liuhao1024 commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

liuhao1024 commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Checklist

Code

Documentation & Housekeeping

Uh oh!

liuhao1024 commented Apr 20, 2026

Uh oh!

ratacat commented Apr 25, 2026

Uh oh!

liuhao1024 commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

liuhao1024 commented Apr 20, 2026 •

edited

Loading