fix(auxiliary): evict cached client on timeout/connection error#23482
Merged
Conversation
A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes #23432
Contributor
🔎 Lint report:
|
This was referenced May 11, 2026
JZKK720
pushed a commit
to JZKK720/hermes-agent
that referenced
this pull request
May 11, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
19 tasks
rmulligan
pushed a commit
to rmulligan/hermes-agent
that referenced
this pull request
May 11, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
teknium1
pushed a commit
that referenced
this pull request
May 11, 2026
…23482) #23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in #23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs #23482, #23432
teknium1
pushed a commit
that referenced
this pull request
May 11, 2026
…23482) #23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in #23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs #23482, #23432
JinyuID
pushed a commit
to JinyuID/hermes-agent
that referenced
this pull request
May 11, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
19 tasks
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…ousResearch#23482) NousResearch#23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in NousResearch#23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs NousResearch#23482, NousResearch#23432
jsboige
pushed a commit
to jsboige/hermes-agent
that referenced
this pull request
May 14, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
jsboige
pushed a commit
to jsboige/hermes-agent
that referenced
this pull request
May 14, 2026
…ousResearch#23482) NousResearch#23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in NousResearch#23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs NousResearch#23482, NousResearch#23432
heybenn42
pushed a commit
to heybenn42/hermes-agent
that referenced
this pull request
May 15, 2026
* upstream/main: (1099 commits) chore: ruff auto-fix C401, C416, C408, PLR1722 (NousResearch#23940) feat(prompt-cache): cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal (NousResearch#23828) chore: ruff auto-fix PLR6201 — tuple → set in membership tests (NousResearch#23937) chore(release): add AUTHOR_MAP entry for wuli666 fix(auxiliary): evict async wrappers on poisoned client (follow-up to NousResearch#23482) fix(cli,tui): align CJK / wide-char markdown tables (NousResearch#23863) chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (NousResearch#23926) fix(/model): surface Nous Portal models from remote catalog manifest (NousResearch#23912) fix(cli): defensive _slash_confirm_state access + AUTHOR_MAP fix: use TUI modal for slash confirmations rebuild model catalog fix(dashboard): validate dist exists when --skip-build is set fix(dashboard): fallback to stale dist, retry build, add --skip-build flag chore: AUTHOR_MAP entry for VinceZcrikl noreply (NousResearch#23647) fix: make web UI build output decoding robust on Windows fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (NousResearch#23602) revert: roll back /goal checklist + /subgoal feature stack (NousResearch#23813) chore: AUTHOR_MAP entries for sudo-hardening salvage contributors fix(approval): catch sudo with stdin/askpass/shell privilege flags fix(terminal): block sudo -S password guessing when SUDO_PASSWORD is not set ...
AlexFoxD
pushed a commit
to AlexFoxD/hermes-agent
that referenced
this pull request
May 21, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
AlexFoxD
pushed a commit
to AlexFoxD/hermes-agent
that referenced
this pull request
May 21, 2026
…ousResearch#23482) NousResearch#23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in NousResearch#23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs NousResearch#23482, NousResearch#23432
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…Research#23482) A Codex auxiliary timeout closes the underlying OpenAI client (so the streaming hang doesn't sit until the user kills the session), but the cached wrapper kept pointing at the now-dead transport. Subsequent auxiliary calls (compression retry, memory flush, background review, title generation routed via provider: main) reused that closed client and failed fast with 'Connection error' until the gateway restarted — even though the main agent route was healthy the whole time. Sync `_get_cached_client` had no liveness check (async did, via loop identity), and the connection-error fallback in `call_llm` only fired on the auto provider path, so an explicit provider — including the common `auxiliary.compression.provider: main` shape — never evicted. Three fixes: * New `_evict_cached_client_instance(target)` helper that drops the cache entry whose stored client is target (or wraps it via `_real_client`, for `CodexAuxiliaryClient`). * `_CodexCompletionsAdapter._close_client_on_timeout` evicts the wrapper after closing the inner OpenAI client. * `call_llm` and `async_call_llm` evict on `_is_connection_error` before re-raising, regardless of whether the provider is auto. Net effect: one timeout costs one summary attempt + the existing 30s compressor cooldown; the next compaction rebuilds the client and works. Non-connection errors (4xx/5xx) do not evict, so cache hits stay stable. Closes NousResearch#23432
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…ousResearch#23482) NousResearch#23482 fixed cache poisoning in the sync path: when a Codex auxiliary timeout closes the underlying OpenAI client, _evict_cached_client_instance walks CodexAuxiliaryClient wrappers via their _real_client attribute and drops the cache entry so the next aux call rebuilds. The cache key includes async_mode (see _client_cache_key), so the sync and async clients for the same provider live in two distinct entries pointing at the same underlying transport. The fix walked the sync wrapper's _real_client correctly but the async wrappers (AsyncCodexAuxiliaryClient, AsyncAnthropicAuxiliaryClient, AsyncGeminiNativeClient) never exposed _real_client at all, so the async entry survived eviction and kept handing out the poisoned client. Effect on async aux callers: one timeout now poisons every subsequent async aux call (compression, vision, session_search, title_generation) with 'Connection error' until gateway restart -- even while the sync route recovered as designed in NousResearch#23482. Mirror the sync wrapper's _real_client onto each async wrapper so the existing eviction helper finds them. Three changes, one per wrapper: - AsyncCodexAuxiliaryClient: self._real_client = sync_wrapper._real_client (the underlying OpenAI client) - AsyncAnthropicAuxiliaryClient: same shape - AsyncGeminiNativeClient: self._real_client = sync_client (Gemini's native facade is itself the leaf; no OpenAI client beneath it) Update _evict_cached_client_instance docstring to reflect that it now covers both sync and async wrappers via the same attribute walk. Test: TestAuxiliaryClientPoisonedCacheEviction.test_evict_cached_client_instance_walks_async_wrapper seeds both sync and async cache entries pointing at the same leaf and asserts both are dropped on a single eviction call. Verified the test fails without the wrapper changes ("async cache entry survived eviction -- wrapper is missing _real_client") and passes with them. Refs NousResearch#23482, NousResearch#23432
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #23432.
Summary
After an auxiliary Codex timeout, later
provider: mainaux calls (memory flush, background review, next compaction) stop failing with staleConnection error. Compression no longer drops the user into the static fallback marker after a single timeout in a long-running gateway session.Root cause
_CodexCompletionsAdapter._close_client_on_timeoutcloses the inner OpenAI client to unstick a hung Responses stream. The cached wrapper still pointed at that now-dead transport. Sync_get_cached_clienthas no liveness check (async does, via loop identity), andcall_llm's connection-error fallback only fires whenis_auto— so an explicit provider (includingauxiliary.compression.provider: main→openai-codex) never evicts.Changes
agent/auxiliary_client.py: new_evict_cached_client_instance(target)walks_client_cacheand drops entries whose stored client istarget(or wraps it via_real_clientforCodexAuxiliaryClient).agent/auxiliary_client.py:_close_client_on_timeoutevicts the wrapper after closing the inner client.agent/auxiliary_client.py:call_llm+async_call_llmevict on_is_connection_error(first_err)before re-raising, independent ofis_auto.Validation
call_llm(explicit provider)Connection errorresolve_provider_client→ succeedsTestAuxiliaryClientPoisonedCacheEvictioncovering helper semantics, the timeout closer eviction, and the explicit-providercall_llm/async_call_llmpaths. All pass.tests/agent/test_auxiliary_client.py(147) green.close()called once → cache entry gone → nextcall_llmresolves a fresh client and returns the response. Non-connection error path verified to not evict.