Add retry support for OpenAI-compatible LLM providers #37891
bennetbo merged 4 commits into zed-industries:main
Automatically retry the agent's LLM completion requests when the provider returns 429 Too Many Requests, using the Retry-After header (if available) to determine the retry delay. Many providers are frequently overloaded or have low rate limits, which makes them essentially unusable without automatic retries. Tested with Cerebras configured via openai_compatible.
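For readers unfamiliar with the pattern, here is a minimal sketch of the delay decision described above. It is not the code from this PR: the function names, the constants, and the exponential-backoff fallback are all assumptions made for illustration, and only the delay-seconds form of Retry-After is parsed.

```rust
use std::time::Duration;

// Hypothetical policy values; these are assumptions, not taken from the PR.
const BASE_DELAY: Duration = Duration::from_secs(1);
const MAX_DELAY: Duration = Duration::from_secs(60);
const MAX_ATTEMPTS: u32 = 4;

/// Parse a `Retry-After` value given as a whole number of seconds.
/// The HTTP-date form is not handled in this sketch and yields `None`.
fn parse_retry_after(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Decide how long to wait before retry number `attempt` (0-based).
/// Returns `None` when the request should not be retried.
fn retry_delay(status: u16, retry_after: Option<&str>, attempt: u32) -> Option<Duration> {
    // Only 429 Too Many Requests is retried, and only a bounded number of times.
    if status != 429 || attempt >= MAX_ATTEMPTS {
        return None;
    }
    // Fallback when the header is missing or unparseable: 1s, 2s, 4s, ...
    let fallback = BASE_DELAY.saturating_mul(2u32.saturating_pow(attempt));
    let delay = retry_after.and_then(parse_retry_after).unwrap_or(fallback);
    // Cap the wait so a very large header value cannot stall the agent.
    Some(delay.min(MAX_DELAY))
}
```

A caller would sleep for the returned duration and resend the request, stopping when `retry_delay` returns `None`. Capping the server-provided delay is a design choice of this sketch; a client could instead surface an error when the requested wait is too long.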

We require contributors to sign our Contributor License Agreement, and we don't have @timmclean on file. You can sign our CLA at https://zed.dev/cla. Once you've signed, post a comment here that says '@cla-bot check'.

@cla-bot check

The cla-bot has been summoned, and re-checked this pull request!

Hey @timmclean, sorry for taking such a long time to review this. I just solved the merge conflicts. Can you confirm that this is still working as expected on your end?

Oh great, I will test this in a bit! If I remember correctly, I didn't implement the retries for Anthropic and some other providers. Would you like me to do that? I'm happy to do it to fix #31531, but I can't promise I'll test with every provider since I don't have accounts everywhere.

I'd like to get this in without making more changes first. However, I would be more than happy to review follow-up PRs that implement this for other providers.

…#37891)

Automatically retry the agent's LLM completion requests when the provider returns 429 Too Many Requests. Uses the Retry-After header to determine the retry delay if it is available.

Many providers are frequently overloaded or have low rate limits. These providers are essentially unusable without automatic retries.

Tested with Cerebras configured via openai_compatible.

Related: zed-industries#31531

Release Notes:

- Added automatic retries for OpenAI-compatible LLM providers

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>