fix(run-agent): rebuild anthropic client on stream retry #23678

Open
LeonSGP43 wants to merge 1 commit into NousResearch:main from LeonSGP43:codex/23286-anthropic-stream-retry

What does this PR do?

Fixes the credential-loss half of #23286 for Anthropic-wire providers like DeepSeek's /anthropic endpoint.

When a streaming retry is triggered after a transient stream drop, the retry cleanup path currently rebuilds the shared OpenAI client unconditionally. In api_mode == "anthropic_messages", that is the wrong transport: these providers stream through self._anthropic_client, and third-party Anthropic endpoints often leave self._client_kwargs empty. The OpenAI rebuild therefore fails during retry cleanup with a misleading "Missing credentials" error, even though the credentials of the client actually in use are intact.

This change routes stream-retry cleanup through the Anthropic client when the active transport is anthropic_messages, so retries reset the correct client instead of trying to rebuild an unused OpenAI one.

It does not claim to solve DeepSeek's upstream ~600s server-side stream limit; it only fixes the retry-side credential loss / wrong-client rebuild.
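
The routing described above can be sketched roughly as follows. This is a toy stand-in, not the code in run_agent.py: SketchAgent, the `reason` argument, the `rebuilt` list, and the method name `_rebuild_anthropic_client` are assumptions made for illustration. Only `_refresh_primary_stream_retry_client`, `_replace_primary_openai_client`, `_client_kwargs`, and the `anthropic_messages` mode come from this PR.

```python
class SketchAgent:
    """Toy stand-in for AIAgent; only the retry-cleanup routing is modeled."""

    def __init__(self, api_mode, client_kwargs=None):
        self.api_mode = api_mode
        self._client_kwargs = client_kwargs or {}
        self.rebuilt = []  # illustrative: records which client each cleanup rebuilt

    def _replace_primary_openai_client(self, reason):
        # Mimics the reported failure mode: empty kwargs means no credentials.
        if not self._client_kwargs.get("api_key"):
            raise RuntimeError("Missing credentials")
        self.rebuilt.append(("openai", reason))

    def _rebuild_anthropic_client(self, reason):
        # Hypothetical name for the existing Anthropic rebuild path.
        self.rebuilt.append(("anthropic", reason))

    def _refresh_primary_stream_retry_client(self, reason):
        # Branch on the active transport instead of always rebuilding OpenAI.
        if self.api_mode == "anthropic_messages":
            self._rebuild_anthropic_client(reason)
        else:
            self._replace_primary_openai_client(reason)
```

With this shape, an agent in anthropic_messages mode with empty `_client_kwargs` never reaches the OpenAI rebuild, so the "Missing credentials" error cannot be raised from retry cleanup.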

Related Issue

Related to #23286

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Added AIAgent._refresh_primary_stream_retry_client() in run_agent.py to branch retry cleanup by active transport.
  • Reused the existing Anthropic client rebuild path for api_mode == "anthropic_messages" instead of calling _replace_primary_openai_client().
  • Updated both transient stream retry cleanup sites (stream_retry_pool_cleanup and stream_mid_tool_retry_pool_cleanup) to use the new helper.
  • Added a regression test in tests/run_agent/test_streaming.py that proves Anthropic stream retries rebuild the Anthropic client and do not try to rebuild the OpenAI client.
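
For readers who want the shape of the regression check without opening the test file, here is a minimal sketch using unittest.mock. It is not the test added in this PR: the free function `refresh_stream_retry_client`, the `reason` keyword, and the mocked method name `_rebuild_anthropic_client` are hypothetical stand-ins, and the two reason strings are simply the cleanup site names listed above.

```python
from unittest.mock import MagicMock


def refresh_stream_retry_client(agent, reason):
    """Hypothetical free-function form of the helper under test."""
    if agent.api_mode == "anthropic_messages":
        agent._rebuild_anthropic_client(reason=reason)  # hypothetical name
    else:
        agent._replace_primary_openai_client(reason=reason)


def check_anthropic_retry_cleanup():
    agent = MagicMock()
    agent.api_mode = "anthropic_messages"
    agent._client_kwargs = {}  # third-party Anthropic endpoints often leave this empty

    # Exercise both cleanup sites named in the change list.
    for reason in ("stream_retry_pool_cleanup", "stream_mid_tool_retry_pool_cleanup"):
        refresh_stream_retry_client(agent, reason)

    # The Anthropic client is rebuilt once per site; the OpenAI path is never touched.
    assert agent._rebuild_anthropic_client.call_count == 2
    agent._replace_primary_openai_client.assert_not_called()
    return True
```

The key assertion is the negative one: the OpenAI rebuild must not be called at all in anthropic_messages mode.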

How to Test

  1. Run:
    uv run --frozen pytest -q -o addopts='' tests/run_agent/test_streaming.py -k 'AnthropicStreamRetryCleanup or test_anthropic_stream_refreshes_activity_on_every_event'
  2. Run:
    uv run --frozen ruff check run_agent.py tests/run_agent/test_streaming.py
  3. Optionally reproduce the original issue on a DeepSeek Anthropic endpoint and confirm stream retry cleanup no longer logs Failed to rebuild shared OpenAI client ... Missing credentials.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

Targeted regression test:

  • 2 passed, 33 deselected

Repo smoke:

  • uv run --frozen pytest -q -o addopts='' --collect-only tests/run_agent/test_streaming.py
  • 35 tests collected

alt-glitch added the type/bug, comp/agent, provider/anthropic, and P2 labels on May 11, 2026