
fix(plugins/memory/honcho): default Honcho SDK HTTP timeout to 30s #13623

Closed
twozle wants to merge 1 commit into NousResearch:main from twozle:fix/honcho-client-default-timeout


twozle commented Apr 21, 2026

Contributor

Summary

get_honcho_client() currently passes no timeout kwarg to the Honcho SDK when nothing is explicitly configured (neither HonchoClientConfig.timeout, honcho.timeout / requestTimeout in the Hermes config, nor HONCHO_TIMEOUT). The underlying httpx client then has no cap, so a stalled request can hang indefinitely.

This is a silent-failure hazard on the post-response path of run_conversation: memory_manager.sync_all() / queue_prefetch_all() fire after the agent has already generated its final reply, so a stuck Honcho call blocks run_conversation from returning. The gateway never logs "response ready" and never delivers the reply to Telegram / WhatsApp / etc., even though the text is already saved to the session file.

This PR adds a module-level _DEFAULT_HTTP_TIMEOUT = 30.0 and applies it in get_honcho_client() only after all configured sources have been checked, so existing deployments that have tuned the value (including self-hosted instances that need longer) keep their current behavior. Unconfigured installs get a sensible ceiling instead of silent indefinite hangs.
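
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the helper name `resolve_timeout`, its parameters, and the precedence order among the configured sources are assumptions; only `_DEFAULT_HTTP_TIMEOUT`, the 30.0 value, and the names of the configuration knobs come from the description.

```python
import os
from typing import Any, Mapping, Optional

# Name and value taken from the PR description.
_DEFAULT_HTTP_TIMEOUT = 30.0


def resolve_timeout(
    config_timeout: Optional[float],
    hermes_honcho_cfg: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> float:
    """Hypothetical helper: first explicitly configured timeout, else the default.

    The order of the candidates below is an assumption made for illustration.
    """
    candidates = (
        config_timeout,                         # HonchoClientConfig.timeout
        hermes_honcho_cfg.get("timeout"),       # honcho.timeout
        hermes_honcho_cfg.get("requestTimeout"),  # honcho.requestTimeout
        env.get("HONCHO_TIMEOUT"),              # environment variable
    )
    for value in candidates:
        if value is not None and value != "":
            return float(value)
    # Nothing was configured anywhere: apply the ceiling instead of no timeout.
    return _DEFAULT_HTTP_TIMEOUT
```

Under these assumptions, `resolve_timeout(None, {}, {})` falls through to the 30-second default, while any explicitly set value is returned unchanged.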

Repro

Observed on v0.9.0 with the honcho plugin active and a Telegram session.

  1. Run any Telegram-origin agent turn.
  2. After the model has generated its final assistant text, make app.honcho.dev unreachable (network drop, firewall rule, or natural outage) while sync_all is running.
  3. Without this PR: _run_agent never returns, "response ready" is never logged, and the Telegram reply is lost despite being present in the session file. The gateway process stays alive (Telegram long-poll is unaffected) but cannot complete the turn or respond to further messages in that chat until restarted. CLOSE-WAIT sockets to Honcho accumulate.
  4. With this PR: the call aborts after 30s, run_conversation returns, "sync_turn failed: …" is logged as before, and the gateway delivers the already-generated reply.
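
The difference between steps 3 and 4 can be illustrated without Honcho at all. The sketch below uses only the Python standard library as a stand-in (it does not use the Honcho SDK or httpx): a local socket accepts the TCP connection but never sends a response, and the client call returns only because a timeout is set. The function name `time_stalled_request` is hypothetical.

```python
import socket
import time
import urllib.request


def time_stalled_request(timeout_s: float) -> float:
    """Call a local peer that never replies; return how long the call blocked."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)               # connections complete, but nothing is ever sent back
        port = server.getsockname()[1]

        # Empty ProxyHandler so proxy environment variables are ignored.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        start = time.monotonic()
        try:
            opener.open(f"http://127.0.0.1:{port}/", timeout=timeout_s)
        except OSError:
            # Timeouts surface as OSError subclasses (TimeoutError or URLError).
            pass
        return time.monotonic() - start
```

With a finite `timeout_s` the call gives up after roughly that many seconds and the caller can carry on, which is the behavior step 4 describes. Passing `timeout=None` to the same call would block for as long as the peer stays silent.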

Test plan

  • pytest tests/honcho_plugin/ tests/test_honcho_client_config.py — 160 passed locally (Python 3.12, honcho-ai installed)
  • New regression test TestGetHonchoClient::test_defaults_to_30s_when_no_timeout_configured asserts the default is passed when neither HonchoClientConfig.timeout nor a Hermes config override is set
  • Existing test_passes_timeout_from_config, test_hermes_config_timeout_override_used_when_config_timeout_missing, test_hermes_request_timeout_alias_used still pass unchanged — an explicit value continues to win over the default
  • Manually verified on Linux (Debian/Ubuntu) against a live Telegram session
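
For readers without the repository at hand, the shape of such a regression test might look like the sketch below. It is an assumption-laden illustration: `build_client` is a hypothetical stand-in for get_honcho_client(), and the real tests in the PR may mock the SDK differently.

```python
from unittest.mock import MagicMock

_DEFAULT_HTTP_TIMEOUT = 30.0  # value from the PR description


def build_client(sdk_cls, timeout=None):
    """Hypothetical stand-in for get_honcho_client(): always pass a timeout kwarg."""
    effective = _DEFAULT_HTTP_TIMEOUT if timeout is None else float(timeout)
    return sdk_cls(timeout=effective)


def test_defaults_to_30s_when_no_timeout_configured():
    sdk_cls = MagicMock()  # fake SDK class that records constructor kwargs
    build_client(sdk_cls)
    assert sdk_cls.call_args.kwargs["timeout"] == 30.0


def test_explicit_timeout_wins_over_default():
    sdk_cls = MagicMock()
    build_client(sdk_cls, timeout=120)
    assert sdk_cls.call_args.kwargs["timeout"] == 120.0
```

Asserting on the kwargs passed to a mocked SDK class keeps the test offline and fast, since no request is ever sent.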

Platforms tested

Linux only. The change is pure Python dict kwarg construction, no platform-sensitive code paths, so Windows / macOS behavior should be identical.

When no explicit timeout is configured (HonchoClientConfig.timeout,
honcho.timeout / requestTimeout, or HONCHO_TIMEOUT), get_honcho_client
previously constructed the SDK with no timeout kwarg, letting the
underlying httpx client hang indefinitely if the Honcho backend
became unreachable mid-request.

This is a silent-failure hazard on the post-response path of
run_conversation: the memory_manager.sync_all() / queue_prefetch_all()
calls fire after the agent has already generated its final reply, so
a stalled Honcho request blocks run_conversation from returning.
The gateway never logs "response ready" and never delivers the
response to the platform (Telegram, etc.), even though the text is
already saved to the session file.

Repro: unplug the network or block app.honcho.dev mid-turn after
the model has produced its final message. Without this change,
_run_agent never returns. With it, the call aborts after 30s,
run_conversation returns, and the gateway delivers the response
(Honcho sync failure is logged and swallowed as before).

The default applies only when nothing is configured, so any
deployment that has explicitly set timeout / HONCHO_TIMEOUT /
honcho.timeout / honcho.requestTimeout keeps its existing value.
Self-hosted deployments that genuinely need a longer ceiling can
still override via any of those knobs.
alt-glitch added the type/bug (Something isn't working) and comp/plugins (Plugin system and bundled plugins) labels Apr 21, 2026

alt-glitch commented

Collaborator

Related to #8527 and #10372 which also address Honcho timeout issues but appear to still be open/unmerged.

teknium1 commented May 5, 2026

Contributor

Closing as already on main — commit 8220527 "fix(plugins/memory/honcho): default Honcho SDK HTTP timeout to 30s" by @twozle is on current main. Looks like a different PR (yours or a resubmit) already landed. Thanks!

teknium1 closed this May 5, 2026

3 participants