fix(feishu): propagate rate-limit errors from typing indicator to circuit breaker#28157
fix(feishu): propagate rate-limit errors from typing indicator to circuit breaker#28157guoqunabc wants to merge 7 commits intoopenclaw:mainfrom
Conversation
…cuit breaker The typing indicator's addTypingIndicator/removeTypingIndicator silently caught all errors, preventing the circuit breaker in createTypingCallbacks (typing-start-guard) from tripping on rate-limit failures. This caused the keepalive loop to retry every 3s indefinitely when the Feishu API quota was exceeded, generating 76,000+ failed calls in ~8 hours. Changes: - Extract isFeishuBackoffError() helper that detects HTTP 429, Feishu error codes 99991403 (quota exceeded) and 99991400 (rate limit), in both AxiosError and SDK error shapes - Re-throw backoff errors from both add and remove functions so the circuit breaker can count consecutive failures and trip - Add comprehensive unit tests for the detection helper - Other non-critical errors (message deleted, permissions) remain silently handled Fixes #28062
Greptile SummaryFixed critical rate-limit handling bug in Feishu typing indicator that caused infinite API retries when quota was exceeded (reported as 76,000+ calls in 8 hours). Key changes:
The implementation is clean, well-structured, and follows repository guidelines. No issues found. Confidence Score: 5/5
Last reviewed commit: 559526d |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87dd4c6cd2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Address Codex review feedback on #28157: the Feishu SDK may return a normal response object with an API-level error code (e.g. 99991403) instead of throwing. Previously this path treated the call as success with reactionId: null, so the circuit breaker never counted a failure. Changes: - Add hasBackoffCodeInResponse() helper to detect backoff codes in normal SDK response objects - Check response codes after both messageReaction.create and .delete before treating the result as success - Throw to trip the circuit breaker when backoff codes are detected - Add 7 new unit tests for hasBackoffCodeInResponse() (total: 18)
|
Thanks for the excellent catch @chatgpt-codex-connector! 🎯 You're right — the Feishu SDK can return a normal response with an API-level error code instead of throwing. This would bypass the Addressed in
|
|
To use Codex here, create a Codex account and connect to github. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f201ae6fe
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Address Codex P2 review: when hasBackoffCodeInResponse() detects a backoff code and throws, the error is caught by the surrounding try/catch. A plain Error has no .code property, so isFeishuBackoffError() cannot match it and the error is silently swallowed. This affects BOTH addTypingIndicator and removeTypingIndicator (Codex only flagged removeTypingIndicator). Fix: introduce FeishuBackoffError class that extends Error with a numeric .code property. isFeishuBackoffError() already checks err.code, so these errors are now correctly detected and re-thrown to trip the circuit breaker. Added 5 new tests for FeishuBackoffError including a catch-rethrow simulation that proves the fix works (total: 23 tests).
|
Addressed Codex P2 review in Root cause: Fix: Introduced Added 5 new tests (total: 23), including a catch-rethrow simulation that proves the fix works end-to-end. 🎯 |
…h getBackoffCodeFromResponse The boolean helper left `code` typed as `number | undefined`, causing TS2345 when passed to FeishuBackoffError(number). Instead of using a non-null assertion, refactored to getBackoffCodeFromResponse() which returns the numeric code directly (or undefined), eliminating the type mismatch cleanly.
Address Codex review feedback on openclaw#28157: the Feishu SDK may return a normal response object with an API-level error code (e.g. 99991403) instead of throwing. Previously this path treated the call as success with reactionId: null, so the circuit breaker never counted a failure. Changes: - Add hasBackoffCodeInResponse() helper to detect backoff codes in normal SDK response objects - Check response codes after both messageReaction.create and .delete before treating the result as success - Throw to trip the circuit breaker when backoff codes are detected - Add 7 new unit tests for hasBackoffCodeInResponse() (total: 18)
Summary / 概述
Fixes #28062
The typing indicator silently swallowed all errors, preventing the circuit breaker (
typing-start-guard,maxConsecutiveFailures: 2) from tripping on rate-limit failures. This caused the keepalive loop to retry every 3 seconds indefinitely when the Feishu API quota was exceeded.飞书 typing indicator 静默吞掉了所有异常,导致断路器无法在 429/配额超限时触发熔断。keepalive 循环每3秒无限重试,实测8小时内产生 76,000+ 次无效 API 调用。
Changes / 改动
extensions/feishu/src/typing.tsisFeishuBackoffError()helper — detects rate-limit conditions in both AxiosError and Feishu SDK error shapes:99991403(monthly quota exceeded)99991400(per-second rate limit)Setfor error codes, making it easy to add future codesaddTypingIndicatorandremoveTypingIndicatorso the circuit breaker can count failures and trip"hit rate-limit/quota, stopping keepalive") to differentiate from non-critical failures in debug logs提取
isFeishuBackoffError()辅助函数,支持 AxiosError 和飞书 SDK 两种错误形态。429/配额错误向上抛出,让断路器正常熔断。使用Set存储错误码,方便未来扩展。extensions/feishu/src/typing.test.ts(new)新增 11 个单元测试,覆盖各种错误形态和边界条件。
Comparison with #28086 / 与 #28086 的区别
Set-based, easy to add codesisFeishuBackoffError)Testing / 测试