fix: DingTalk platform adapter has multiple bugs preventing message processing#5038
Closed
cloorc wants to merge 1 commit into
Closed
fix: DingTalk platform adapter has multiple bugs preventing message processing#5038cloorc wants to merge 1 commit into
cloorc wants to merge 1 commit into
Conversation
…rocessing The DingTalk adapter (gateway/platforms/dingtalk.py) has six bugs that prevent it from receiving and responding to messages: 1. DingTalkStreamClient.start() is async but was wrapped in asyncio.to_thread(), which is for sync functions. The coroutine was never awaited, causing infinite reconnection loops. 2. ChatbotHandler.process() is async in the dingtalk-stream SDK, but the adapter defined it as a regular function. The SDK tried to await the returned tuple, causing "object tuple can not be used in await expression". 3. process() blocked the SDK thread with future.result(timeout=60). Agent responses take longer than 60s (LLM inference, tool calls), causing TimeoutError and preventing timely ACK. 4. _extract_text() used getattr(message, "text") which returns None on CallbackMessage. The SDK stores text in message.data["text"]["content"]. Messages were silently dropped as empty. 5. _on_message() used getattr() for all fields (conversation_id, sender_id, session_webhook, etc.) but CallbackMessage stores everything in message.data with camelCase keys. All fields resolved to defaults — session_webhook was never captured, replies impossible. 6. Authorization used encrypted senderId ($:LWCP_v1:$...) as user_id instead of readable senderStaffId (numeric corp employee ID), making allowlists impractical. Fixes: - Direct await on stream_client.start() - async def process() with fire-and-forget task dispatch - _extract_text() falls back to message.data["text"]["content"] - _get_field() helper with camelCase key mapping for CallbackMessage - Use senderStaffId as primary user_id for authorization Closes NousResearch#5037 Signed-off-by: cloorc <wittcnezh@foxmail.com>
Contributor
|
Closing as superseded by #11471 (#11471) which salvaged @kevinskysunny's minimal fix (#11257) and added a follow-up for the broken Thanks for the fix — a lot of contributors hit this SDK break at the same time. Your investigation helped confirm the root cause. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
The DingTalk platform adapter (
gateway/platforms/dingtalk.py) has several critical bugs that prevent it from receiving and responding to messages. Found while debugging on HarmonyOS 4 (DingTalk mobile client).Bugs Found
1.
DingTalkStreamClient.startwrapped incorrectly withasyncio.to_threadLine 135 (original):
DingTalkStreamClient.start()is an async coroutine.asyncio.to_thread()is designed for synchronous (blocking) functions — wrapping an async function creates a coroutine object that never gets awaited. This causes repeated reconnection loops with no useful error.Fix: Changed to
await self._stream_client.start()2.
process()method signature mismatch with SDK base classThe
dingtalk_stream.ChatbotHandler.process()is defined asasync def process(self, message: CallbackMessage)in the SDK. The adapter overrides it as a regulardef process(self, message)— when the SDK triesawait handler.process(message), it gets a plain tuple(STATUS_OK, "OK")and fails with:Fix: Changed to
async def process(self, message)and replaced blockingfuture.result(timeout=60)with fire-and-forgetfuture.add_done_callback()for error logging.3.
TimeoutErrorfrom blockingfuture.result(timeout=60)The
process()method blocked the dingtalk-stream SDK thread waiting for agent processing to complete within 60 seconds. Agent responses typically take longer (tool calls, LLM inference), causing aTimeoutErrorand preventing the ACK from being returned to the SDK promptly.Fix: Return ACK immediately, dispatch agent work as fire-and-forget background task.
4.
_extract_text()fails onCallbackMessage— messages silently droppedThe SDK sends messages as
CallbackMessagewith all fields insidemessage.datadict (e.g.,message.data['text'] = {'content': 'hello'}). The adapter usesgetattr(message, "text", None)which returnsNonesinceCallbackMessagehas notextattribute — onlydata,headers,spec_version,type,extensions.Result: text extraction returns empty string, message is silently skipped with DEBUG log
"Empty message, skipping".Fix: Added fallback to
message.data['text']['content']forCallbackMessageformat.5.
_on_message()fails to extract any fields fromCallbackMessageSame issue as #4 but for ALL fields:
conversation_id,sender_id,sender_nick,sender_staff_id,session_webhook,create_at,conversation_title. All usegetattr(message, "...", default)which returns defaults sinceCallbackMessagestores everything indatadict with camelCase keys (conversationId,senderId,sessionWebhook, etc.).Result:
session_webhookis never captured, replies cannot be sent. User IDs are empty, authorization fails.Fix: Added
_get_field()helper with_DATA_KEY_MAP(snake_case to camelCase) that falls back tomessage.data[key].6. Authorization uses unreadable encrypted
senderIdinstead ofsenderStaffIdDingTalk provides two user identifiers:
senderId: encrypted open ID like$:LWCP_v1:$qoM1+WxS0Q5F5iqeTKOz7Hge06B2HTXW(unreadable, varies per app)senderStaffId: numeric corp employee ID like22514138787330(human-readable, stable)The adapter used
senderIdasuser_idfor authorization, making it impractical to add users to allowlists.Fix: Use
senderStaffIdas primaryuser_id, keepsenderIdasuser_id_alt.Environment
clibranchReproduction
config.yamlhermes gateway runFixes Applied
All six bugs have been fixed locally. The changes ensure:
awaitCallbackMessagefields are properly extracted viadatadictHappy to submit a PR if the fixes look good.