Skip to content

Codex Responses main agent path drops configured timeout #21644

@fuyizheng3120

Description

@fuyizheng3120

Bug Description

The main agent codex_responses path computes a request timeout via AIAgent._resolved_api_call_timeout(), but the timeout is not actually forwarded to the Codex Responses API request.

This is different from the auxiliary Codex adapter, which already preserves the chat-completions-style timeout contract and forwards timeout to responses.stream().

Related: #21444. That issue tracks a broader openai-codex/gpt-5.5 long-session silent hang. This issue is narrower: the main Codex Responses path drops the configured timeout before the request reaches responses.stream().

Affected Code Path

Observed on current main at faa13e49f / v0.13.0:

  1. run_agent.py has an early if self.api_mode == "codex_responses" branch in _build_api_kwargs().
  2. That branch calls ResponsesApiTransport.build_kwargs(...), but does not pass timeout=self._resolved_api_call_timeout().
  3. agent/transports/codex.py::ResponsesApiTransport.build_kwargs() does not accept/copy timeout into the Responses kwargs.
  4. agent/codex_responses_adapter.py::_preflight_codex_api_kwargs() does not allow/preserve timeout either.

By comparison, agent/transports/chat_completions.py explicitly forwards timeout, and agent/auxiliary_client.py has tests covering Codex auxiliary timeout forwarding.

Local Reproduction Evidence

A Feishu self-check request on a patched local Hermes instance after upgrading to v0.13.0 reached the gateway and Feishu successfully, but the model request failed:

API call failed after 3 retries. Request timed out. | provider=openai-codex model=gpt-5.5 msgs=11 tokens=~20,731
Auxiliary title generation failed: Request timed out or interrupted.

Gateway/Feishu were healthy; the failure was in the main model request path.

A no-network probe before the local fix showed the issue directly:

api_mode= codex_responses
provider= openai-codex
timeout= None
has_input= True

Expected Behavior

The main Codex Responses request should preserve the same timeout contract as chat completions:

  • default via HERMES_API_TIMEOUT / 1800s fallback
  • provider/model overrides via providers.<id>.request_timeout_seconds or providers.<id>.models.<model>.timeout_seconds
  • the final preflighted kwargs passed into responses.stream() should include a positive numeric timeout

Actual Behavior

The timeout is dropped in the main codex_responses path. As a result, the request depends on SDK/default transport behavior instead of Hermes' configured timeout policy.

Local Patch That Fixed The Narrow Issue

The local patch made these changes:

  1. run_agent.py: pass timeout=self._resolved_api_call_timeout() in the codex_responses branch.
  2. agent/transports/codex.py: forward params.get("timeout") into kwargs.
  3. agent/codex_responses_adapter.py: allow and preserve positive numeric timeout in _preflight_codex_api_kwargs().
  4. Tests:
    • tests/agent/transports/test_codex_transport.py: timeout survives transport build + preflight.
    • tests/run_agent/test_run_agent.py: actual AIAgent._build_api_kwargs() for openai-codex includes timeout.

After the local patch, the no-network probe reports:

api_mode= codex_responses
provider= openai-codex
timeout= 1800.0
has_input= True

A real short smoke also succeeded:

hermes chat -q "只回复 OK" -Q
# returned OK in 10.4s

Validation

Local targeted tests:

42 passed

Covered:

  • tests/agent/transports/test_codex_transport.py
  • tests/run_agent/test_run_agent.py::TestBuildApiKwargs::test_codex_responses_kwargs_include_timeout
  • tests/hermes_cli/test_timeouts.py
  • tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout

Also passed:

git diff --check
python -m compileall -q run_agent.py agent/transports/codex.py agent/codex_responses_adapter.py

Notes

This fix may not fully resolve all of #21444. If gpt-5.5 on the ChatGPT Codex backend also needs payload sanitization, that should be handled separately. This issue only covers the missing timeout propagation in the main Codex Responses path.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt builderprovider/openaiOpenAI / Codex Responses APIsweeper:implemented-on-mainSweeper: behavior already present on current maintype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions