fix(dingtalk): adapter reliability — websockets proxy, card QPS throttle, inbound queue (rebased)#40929

Open
meng93 wants to merge 1 commit into NousResearch:main from meng93:fix/dingtalk-adapter-reliability-v2

@meng93 meng93 commented Jun 7, 2026

Contributor

Summary

Stabilise the DingTalk Stream-Mode adapter with three reliability improvements that prevent message loss and API-side rate-limit errors during sustained usage.

Rebased onto current main — resolves all CI failures from #14333 (which were main-breakage, not DingTalk-related) and the two merge conflicts flagged in review.

Motivation

  1. Websockets proxy capture – the dingtalk-stream SDK clobbers HTTPS_PROXY / HTTP_PROXY at import time, breaking corporate proxy setups.
  2. AI-Card 403 storms – DingTalk's interactive-card PUT API enforces a ~20 QPS limit; exceeding it returns 403 and drops the card update.
  3. Duplicate agent turns – when the agent is busy, a second inbound message for the same chat spawns a parallel turn, producing duplicate (or interleaved) replies.

Changes

| File | What |
| --- | --- |
| gateway/platforms/dingtalk.py | _install_dingtalk_websockets_proxy(): snapshot + restore proxy env vars around SDK import |
| gateway/platforms/dingtalk.py | _CardTokenBucket: token-bucket (20 QPS) + per-message 800 ms throttle on edit_message() |
| gateway/platforms/dingtalk.py | _enqueue_inbound / _sweep_session_queues: promise-chain queue per chat; busy-ack with random phrase |
| gateway/config.py | Hydrate *_HOME_CHANNEL yaml keys → os.environ on gateway boot (survives restart) |
| hermes_cli/dingtalk_auth.py | Fix REGISTRATION_SOURCE default (openClaw → DING_DWS_CLAW) |
| gateway/run.py | Platform display-name overrides for /sethome prompt (DingTalk, WeCom, etc.) |
| scripts/gateway_guard.sh | Auto-restart supervisor with caffeinate support on macOS |
| .gitignore | Add .claude/ |
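
For readers unfamiliar with the rate-limiting technique named in the table, here is a rough sketch of a token bucket combined with a per-message minimum interval. It is not the PR's `_CardTokenBucket`; the class names `TokenBucket` and `EditThrottle` and the injectable `clock` parameter are assumptions. Only the 20 QPS and 800 ms figures come from the description.

```python
import time


class TokenBucket:
    """Allow `rate` operations per second on average, bursting up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self):
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class EditThrottle:
    """Gate edits by a shared bucket plus a minimum gap per message."""

    def __init__(self, bucket, min_interval=0.8, clock=time.monotonic):
        self._bucket = bucket
        self._min_interval = min_interval
        self._clock = clock
        self._last_edit = {}  # message id -> time of last permitted edit

    def allow(self, message_id):
        now = self._clock()
        last = self._last_edit.get(message_id)
        if last is not None and now - last < self._min_interval:
            return False  # this message was edited too recently
        if not self._bucket.try_acquire():
            return False  # shared budget exhausted
        self._last_edit[message_id] = now
        return True


if __name__ == "__main__":
    throttle = EditThrottle(TokenBucket(rate=20, capacity=20))
    print(throttle.allow("card-1"))  # first edit is permitted
    print(throttle.allow("card-1"))  # immediate repeat is throttled
```

The per-message check runs first so that a rejected edit does not consume a token from the shared budget. A long-running process would also need to evict old entries from `_last_edit`, which this sketch omits.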

Conflicts resolved (vs original #14333)

| File | Resolution |
| --- | --- |
| .gitignore | Folded .claude/ into the existing editor-tooling block (.codex/, .cursor/, .gemini/, .zed/) added on main |
| gateway/run.py | Kept _display_overrides dict but routed through main's newer _home_target_env_var(platform_name) helper and _deliver_platform_notice(source, notice) wrapper |

Test Plan

  • Cherry-pick applies cleanly onto current main (no conflicts)
  • Existing tests/gateway/test_dingtalk.py suite passes
  • CI green (previous failures were all from tests already fixed on main)
  • Manual testing with DingTalk Stream-Mode behind corporate proxy
  • Sustained card-edit stress test confirms throttle holds at ~18 QPS

Risk Assessment

Low. Changes are additive (new classes / functions) and the queue serialisation is opt-in per chat. The proxy capture only activates when env vars are pre-set.

Supersedes #14333. Credit to @jackjin1997 for the rebase conflict resolution guidance.

🤖 Generated with Claude Code

…tle, inbound queue serialization

- Capture websockets proxy env vars before dingtalk-stream SDK can
  clobber them; restore after import (_install_dingtalk_websockets_proxy).
- Add _CardTokenBucket (token-bucket, 20 QPS) and per-message 800 ms
  edit_message() throttle to stay under DingTalk AI-Card rate limits.
  403 responses trigger an automatic 2 s exponential backoff.
- Inbound message queue (_enqueue_inbound / _sweep_session_queues)
  serialises same-chat messages so long-running agent turns are not
  duplicated; a random 'busy' acknowledgement is sent for queued msgs.
- Hydrate *_HOME_CHANNEL yaml keys into os.environ on gateway boot so
  /sethome survives process restarts (gateway/config.py).
- Fix REGISTRATION_SOURCE default (openClaw → DING_DWS_CLAW) in
  hermes_cli/dingtalk_auth.py.
- Platform display-name overrides for /sethome prompt (DingTalk, WeCom,
  etc.) in gateway/run.py.
- Add scripts/gateway_guard.sh — auto-restart supervisor for the
  gateway process with caffeinate support on macOS.
- Add .claude to .gitignore.
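
As an illustration of the `*_HOME_CHANNEL` hydration item above, the following sketch copies matching keys from an already-parsed config mapping into the environment. It is not the code in gateway/config.py: the function name `hydrate_home_channels`, the flat top-level key layout, the example key names, and the rule that existing environment values win are all assumptions.

```python
import os


def hydrate_home_channels(config, environ=os.environ):
    """Copy *_HOME_CHANNEL entries from a parsed config into the environment.

    Assumption: a value already present in the environment is left alone, so
    an explicit export is never overridden by the file.
    Returns the entries that were actually written.
    """
    applied = {}
    for key, value in config.items():
        if not isinstance(key, str) or not key.endswith("_HOME_CHANNEL"):
            continue  # unrelated setting
        if value is None or value == "":
            continue  # nothing useful to export
        if key in environ:
            continue  # keep the explicit environment value
        environ[key] = str(value)
        applied[key] = str(value)
    return applied


if __name__ == "__main__":
    # Hypothetical keys; only the *_HOME_CHANNEL suffix comes from the PR.
    parsed = {"DINGTALK_HOME_CHANNEL": "cid-123", "LOG_LEVEL": "info"}
    fake_env = {}
    print(hydrate_home_channels(parsed, environ=fake_env))
```

Passing `environ` as a parameter lets the function be exercised against a plain dict without touching the real process environment.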
@alt-glitch added labels on Jun 7, 2026: type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), platform/dingtalk (DingTalk adapter), P2 (Medium: degraded but workaround exists)