fix(dingtalk): adapter reliability — websockets proxy, card QPS throttle, inbound queue#14333
meng93 wants to merge 1 commit into
…tle, inbound queue serialization
- Capture websockets proxy env vars before dingtalk-stream SDK can clobber them; restore after import (_install_dingtalk_websockets_proxy).
- Add _CardTokenBucket (token-bucket, 20 QPS) and per-message 800 ms edit_message() throttle to stay under DingTalk AI-Card rate limits. 403 responses trigger an automatic 2 s exponential backoff.
- Inbound message queue (_enqueue_inbound / _sweep_session_queues) serialises same-chat messages so long-running agent turns are not duplicated; a random 'busy' acknowledgement is sent for queued msgs.
- Hydrate *_HOME_CHANNEL yaml keys into os.environ on gateway boot so /sethome survives process restarts (gateway/config.py).
- Fix REGISTRATION_SOURCE default (openClaw → DING_DWS_CLAW) in hermes_cli/dingtalk_auth.py.
- Platform display-name overrides for /sethome prompt (DingTalk, WeCom, etc.) in gateway/run.py.
- Add scripts/gateway_guard.sh, an auto-restart supervisor for the gateway process with caffeinate support on macOS.
- Add .claude to .gitignore.
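The rate limiting described in the commit message can be illustrated with a minimal Python sketch, assuming a refill rate of 20 tokens per second and an 800 ms minimum gap between edits of the same message. The names CardTokenBucket, EditThrottle, try_acquire, allow, and the injectable clock are hypothetical and are not taken from the PR's _CardTokenBucket implementation; the 403 backoff is not shown.

```python
import time


class CardTokenBucket:
    """Hypothetical token bucket: at most `capacity` calls in a burst,
    refilled at `rate` tokens per second."""

    def __init__(self, rate=20.0, capacity=20.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so the sketch can be tested
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self):
        """Take one token if available; return False instead of blocking."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill proportionally to elapsed time, capped at the bucket size.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class EditThrottle:
    """Hypothetical per-message throttle: one edit per `min_interval` seconds."""

    def __init__(self, min_interval=0.8, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last_edit = {}         # message id -> time of last allowed edit

    def allow(self, message_id):
        now = self._clock()
        last = self._last_edit.get(message_id)
        if last is not None and now - last < self.min_interval:
            return False             # too soon; the caller skips or defers the edit
        self._last_edit[message_id] = now
        return True
```

In this sketch a caller would check EditThrottle.allow(message_id) first and CardTokenBucket.try_acquire() second, sending the card update only when both return True. Whether the PR drops, defers, or coalesces a throttled edit is not stated in the source.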
975a91a to d9811d8
The CI failures on this stack all look like main-breakage at submit time, not anything in the DingTalk changes:
None of these touch DingTalk code. A rebase onto current main should turn the test job green, the same situation I hit on #17141 back in April. That might also help move the review queue @alt-glitch flagged.
Followup: I rebased your single commit onto current main and pushed it to my fork so you can pick it up without redoing the conflict resolution yourself.
Branch: https://github.com/jackjin1997/hermes-agent/tree/rebase/meng93-14333-onto-main (commit …)
Two conflicts I resolved (no behavior change):
If you want it:
    git fetch https://github.com/jackjin1997/hermes-agent.git rebase/meng93-14333-onto-main
    git reset --hard FETCH_HEAD
    git push origin fix/dingtalk-adapter-reliability --force-with-lease
That should turn the CI job green (the test failures were all main-breakage from 4-23, already fixed on main as noted above). Happy to do the same for #14334-#14336 if this one lands cleanly.
Superseded by #40929 (rebased onto current main, CI failures resolved).
Summary
Stabilise the DingTalk Stream-Mode adapter with three reliability improvements that prevent message loss and API-side rate-limit errors during sustained usage.
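One of these improvements, serialising inbound messages per chat, can be sketched in Python with asyncio as follows. This is an illustration under assumptions, not the PR's _enqueue_inbound code: the ChatSerializer class, its is_busy and run methods, and the use of one asyncio.Lock per chat are hypothetical, and the sketch never removes idle locks (the role that _sweep_session_queues appears to play in the PR).

```python
import asyncio


class ChatSerializer:
    """Hypothetical helper: runs handlers for the same chat one at a time,
    while different chats proceed concurrently."""

    def __init__(self):
        self._locks = {}  # chat id -> asyncio.Lock

    def is_busy(self, chat_id):
        """True while an earlier message for this chat is still being handled.
        A caller could use this to send a 'busy' acknowledgement."""
        lock = self._locks.get(chat_id)
        return lock is not None and lock.locked()

    async def run(self, chat_id, handler, message):
        # Create the lock lazily, inside the running event loop.
        lock = self._locks.setdefault(chat_id, asyncio.Lock())
        # asyncio.Lock wakes waiters in FIFO order, so queued messages
        # are handled in the order they started waiting.
        async with lock:
            return await handler(message)
```

A usage example under the same assumptions: calling `await serializer.run(chat_id, handle_turn, msg)` from each inbound callback, where handle_turn is a hypothetical coroutine that runs the agent turn, keeps a second message for the same chat from starting until the first turn has finished.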
Motivation
- The dingtalk-stream SDK clobbers HTTPS_PROXY / HTTP_PROXY at import time, breaking corporate proxy setups.
- The AI-Card PUT API enforces a ~20 QPS limit; exceeding it returns 403 and drops the card update.
Changes
- gateway/platforms/dingtalk.py: _install_dingtalk_websockets_proxy() snapshots and restores proxy env vars around the SDK import
- gateway/platforms/dingtalk.py: _CardTokenBucket, a token bucket (20 QPS), plus a per-message 800 ms throttle on edit_message()
- gateway/platforms/dingtalk.py: _enqueue_inbound / _sweep_session_queues, a promise-chain queue per chat; busy-ack with a random phrase
- gateway/config.py: hydrate *_HOME_CHANNEL yaml keys → os.environ on gateway boot (survives restart)
- hermes_cli/dingtalk_auth.py: fix REGISTRATION_SOURCE default (openClaw → DING_DWS_CLAW)
- gateway/run.py: platform display-name overrides for the /sethome prompt (DingTalk, WeCom, etc.)
- scripts/gateway_guard.sh: auto-restart supervisor for the gateway process, with caffeinate support on macOS
- .gitignore: add .claude
Test Plan
- tests/gateway/test_dingtalk.py suite passes.
Risk Assessment
Low. Changes are additive (new classes / functions) and the queue serialisation is opt-in per chat. The proxy capture only activates when env vars are pre-set.
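The proxy capture can be pictured with a minimal Python sketch of the snapshot-and-restore idea, assuming the variables of interest are the upper- and lower-case HTTP(S) proxy names. The preserve_proxy_env context manager and the PROXY_VARS tuple are hypothetical; the PR's _install_dingtalk_websockets_proxy() may be structured differently.

```python
import contextlib
import os

# Assumed set of variables to protect; the PR may track a different list.
PROXY_VARS = ("HTTPS_PROXY", "HTTP_PROXY", "https_proxy", "http_proxy")


@contextlib.contextmanager
def preserve_proxy_env(environ=os.environ):
    """Snapshot proxy variables that are set on entry and write them back on exit.

    If none are set, the snapshot is empty and the restore does nothing,
    matching the "only activates when env vars are pre-set" behaviour.
    """
    snapshot = {name: environ[name] for name in PROXY_VARS if name in environ}
    try:
        yield snapshot
    finally:
        # Re-apply the saved values even if the wrapped import raised.
        environ.update(snapshot)
```

Under these assumptions the SDK import would be wrapped as `with preserve_proxy_env(): import dingtalk_stream`, so any proxy variable the import removes or overwrites is put back afterwards. Variables that were unset before the import are left as the import leaves them.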