fix: DingTalk platform adapter has multiple bugs preventing message processing #5037

@cloorc

Description

The DingTalk platform adapter (gateway/platforms/dingtalk.py) has several critical bugs that prevent it from receiving and responding to messages. I found them while debugging with the DingTalk mobile client on HarmonyOS 4.

Bugs Found

1. DingTalkStreamClient.start wrapped incorrectly with asyncio.to_thread

Line 135 (original):

await asyncio.to_thread(self._stream_client.start)

DingTalkStreamClient.start() is a coroutine function. asyncio.to_thread() is meant for synchronous (blocking) callables: given a coroutine function, it calls it in a worker thread, which only creates a coroutine object that is never awaited, so the client never actually starts. The result is a repeated reconnection loop with no useful error.

Fix: Changed to await self._stream_client.start()
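
A minimal sketch of the failure mode, independent of the SDK. FakeStreamClient is a hypothetical stand-in for any client whose start() is a coroutine function; it is not the real DingTalkStreamClient.

```python
import asyncio
import inspect


class FakeStreamClient:
    """Hypothetical stand-in for a client whose start() is a coroutine function."""

    def __init__(self) -> None:
        self.started = False

    async def start(self) -> None:
        self.started = True


async def start_in_thread(client: FakeStreamClient) -> bool:
    # to_thread() calls client.start() in a worker thread. For an async def,
    # that call only builds a coroutine object and hands it back.
    result = await asyncio.to_thread(client.start)
    is_coroutine = inspect.iscoroutine(result)
    result.close()  # suppress the "never awaited" RuntimeWarning in this demo
    return is_coroutine


async def start_directly(client: FakeStreamClient) -> None:
    # The fix: await the coroutine on the running event loop.
    await client.start()


broken = FakeStreamClient()
returned_coroutine = asyncio.run(start_in_thread(broken))

fixed = FakeStreamClient()
asyncio.run(start_directly(fixed))

print(returned_coroutine, broken.started, fixed.started)
```

With the to_thread() wrapper the body of start() never runs; awaiting it directly does.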

2. process() method signature mismatch with SDK base class

The SDK defines dingtalk_stream.ChatbotHandler.process() as async def process(self, message: CallbackMessage). The adapter overrides it as a regular def process(self, message), so when the SDK runs await handler.process(message) it receives a plain tuple (STATUS_OK, "OK") and fails with:

ERROR dingtalk_stream.client: error processing message: object tuple can't be used in 'await' expression

Fix: Changed to async def process(self, message) and replaced blocking future.result(timeout=60) with fire-and-forget future.add_done_callback() for error logging.
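
The mismatch can be reproduced without the SDK. In this sketch, BaseHandler, dispatch() and the STATUS_OK value are placeholders that only mimic the shape described above (an async base method that the caller awaits); they are not the SDK's actual code.

```python
import asyncio

STATUS_OK = 200  # placeholder value for illustration


class BaseHandler:
    async def process(self, message):
        return STATUS_OK, "OK"


class SyncHandler(BaseHandler):
    # Bug: a plain def overriding an async def.
    def process(self, message):
        return STATUS_OK, "OK"


class AsyncHandler(BaseHandler):
    # Fix: keep the override a coroutine function.
    async def process(self, message):
        return STATUS_OK, "OK"


async def dispatch(handler, message):
    # Mimics a caller that awaits the handler's process() result.
    return await handler.process(message)


def try_dispatch(handler):
    try:
        return asyncio.run(dispatch(handler, {"text": "hi"}))
    except TypeError as exc:
        return exc


sync_result = try_dispatch(SyncHandler())
async_result = try_dispatch(AsyncHandler())
print(repr(sync_result), async_result)
```

The sync override produces a TypeError at the await, while the async override returns the tuple as expected.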

3. TimeoutError from blocking future.result(timeout=60)

The process() method blocked the dingtalk-stream SDK thread waiting for agent processing to complete within 60 seconds. Agent responses typically take longer (tool calls, LLM inference), causing a TimeoutError and preventing the ACK from being returned to the SDK promptly.

ERROR gateway.platforms.dingtalk: [DingTalk] Error processing incoming message
TimeoutError

Fix: Return the ACK immediately and dispatch the agent work as a fire-and-forget background task.
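
A rough sketch of the ACK-first pattern. It assumes the agent work can be scheduled on the same event loop with asyncio.create_task(); the adapter itself attaches the callback to a future, so the scheduling call may differ. slow_agent(), failures and the STATUS_OK value are hypothetical.

```python
import asyncio
import logging

logger = logging.getLogger("dingtalk_sketch")
STATUS_OK = 200  # placeholder value for illustration

# Strong references keep background tasks from being garbage-collected early.
_background_tasks: set[asyncio.Task] = set()
failures: list[BaseException] = []


def _log_failure(task: asyncio.Task) -> None:
    _background_tasks.discard(task)
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        logger.error("Error processing incoming message", exc_info=exc)
        failures.append(exc)


async def slow_agent(message: str, results: list[str]) -> None:
    # Hypothetical agent call; stands in for LLM inference and tool calls.
    await asyncio.sleep(0.05)
    if message == "boom":
        raise RuntimeError("agent failed")
    results.append(f"reply to {message}")


async def process(message: str, results: list[str]) -> tuple[int, str]:
    task = asyncio.create_task(slow_agent(message, results))
    _background_tasks.add(task)
    task.add_done_callback(_log_failure)
    return STATUS_OK, "OK"  # ACK without waiting for the agent


async def demo():
    results: list[str] = []
    ack = await process("hello", results)
    pending_at_ack = len(results) == 0  # agent has not finished yet
    await process("boom", results)
    await asyncio.sleep(0.2)  # let the background tasks finish
    return ack, pending_at_ack, results


ack, pending_at_ack, results = asyncio.run(demo())
```

The done callback is the only place errors surface, so it has to log them; otherwise a failed agent run would vanish silently.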

4. _extract_text() fails on CallbackMessage — messages silently dropped

The SDK delivers messages as CallbackMessage objects with all fields inside the message.data dict (e.g., message.data['text'] = {'content': 'hello'}). The adapter uses getattr(message, "text", None), which returns None because CallbackMessage has no text attribute, only data, headers, spec_version, type, and extensions.

Result: text extraction returns an empty string, and the message is silently skipped with the DEBUG log "Empty message, skipping".

Fix: Added fallback to message.data['text']['content'] for CallbackMessage format.
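
One possible shape for the fallback, as a standalone function. extract_text() here is illustrative, not the adapter's actual _extract_text(), and the messages are faked with SimpleNamespace rather than real SDK objects.

```python
from types import SimpleNamespace
from typing import Any


def extract_text(message: Any) -> str:
    # Attribute-style messages expose .text directly.
    text = getattr(message, "text", None)
    if text is None:
        # CallbackMessage-style: everything lives in the data dict.
        data = getattr(message, "data", None)
        if isinstance(data, dict):
            text = data.get("text")
    if isinstance(text, dict):
        content = text.get("content")
    else:
        # Either an object with .content or already a plain string (or None).
        content = getattr(text, "content", text)
    return content.strip() if isinstance(content, str) else ""


callback_style = SimpleNamespace(data={"text": {"content": " hello "}})
attribute_style = SimpleNamespace(text=SimpleNamespace(content="hi there"))
empty = SimpleNamespace(data={})

print(extract_text(callback_style), extract_text(attribute_style), repr(extract_text(empty)))
```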

5. _on_message() fails to extract any fields from CallbackMessage

Same issue as bug 4, but for ALL fields: conversation_id, sender_id, sender_nick, sender_staff_id, session_webhook, create_at, conversation_title. Each is read with getattr(message, "...", default), which returns the default because CallbackMessage stores everything in the data dict under camelCase keys (conversationId, senderId, sessionWebhook, etc.).

Result: session_webhook is never captured, so replies cannot be sent. User IDs are empty, so authorization fails.

Fix: Added _get_field() helper with _DATA_KEY_MAP (snake_case to camelCase) that falls back to message.data[key].
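
A sketch of what such a helper might look like. The keys conversationId, senderId, senderStaffId and sessionWebhook are named in this report; senderNick, createAt and conversationTitle are assumed to follow the same camelCase pattern. get_field() is illustrative and may differ from the local _get_field().

```python
from types import SimpleNamespace
from typing import Any

# snake_case attribute name -> camelCase key in message.data
_DATA_KEY_MAP = {
    "conversation_id": "conversationId",
    "sender_id": "senderId",
    "sender_staff_id": "senderStaffId",
    "session_webhook": "sessionWebhook",
    "sender_nick": "senderNick",  # assumed key
    "create_at": "createAt",  # assumed key
    "conversation_title": "conversationTitle",  # assumed key
}


def get_field(message: Any, name: str, default: Any = None) -> Any:
    # Prefer a real attribute when the message object has one.
    value = getattr(message, name, None)
    if value is not None:
        return value
    # Otherwise fall back to the CallbackMessage data dict.
    data = getattr(message, "data", None)
    if isinstance(data, dict):
        value = data.get(_DATA_KEY_MAP.get(name, name))
        if value is not None:
            return value
    return default


msg = SimpleNamespace(
    data={
        "conversationId": "cid-example",
        "sessionWebhook": "https://example.invalid/webhook",
    }
)
webhook = get_field(msg, "session_webhook", "")
conversation = get_field(msg, "conversation_id", "")
missing = get_field(msg, "sender_nick", "unknown")
```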

6. Authorization uses unreadable encrypted senderId instead of senderStaffId

DingTalk provides two user identifiers:

  • senderId: encrypted open ID like $:LWCP_v1:$qoM1+WxS0Q5F5iqeTKOz7Hge06B2HTXW (unreadable, varies per app)
  • senderStaffId: numeric corp employee ID like 22514138787330 (human-readable, stable)

The adapter used senderId as user_id for authorization, making it impractical to add users to allowlists.

Fix: Use senderStaffId as primary user_id, keep senderId as user_id_alt.
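
The ID selection could look roughly like this. resolve_user_ids() is a hypothetical name, and falling back to senderId when senderStaffId is missing is an assumption added here, not something stated above.

```python
def resolve_user_ids(data: dict) -> tuple[str, str]:
    """Return (user_id, user_id_alt) from a message data dict."""
    sender_id = str(data.get("senderId") or "")
    staff_id = str(data.get("senderStaffId") or "")
    # Assumption: fall back to the encrypted ID if no staff ID is present.
    user_id = staff_id or sender_id
    return user_id, sender_id


allowlist = {"22514138787330"}

user_id, user_id_alt = resolve_user_ids(
    {
        "senderId": "$:LWCP_v1:$qoM1+WxS0Q5F5iqeTKOz7Hge06B2HTXW",
        "senderStaffId": "22514138787330",
    }
)
authorized = user_id in allowlist
```

With the staff ID as the primary key, an allowlist entry can be copied from an employee record instead of being fished out of logs.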

Environment

  • OS: HarmonyOS 4 (DingTalk mobile client sending messages)
  • Hermes: Current cli branch
  • Python: 3.11.15
  • dingtalk-stream SDK: Latest from pip
  • Platform: DingTalk Stream Mode (WebSocket)

Reproduction

  1. Configure DingTalk platform in config.yaml
  2. Start gateway: hermes gateway run
  3. Send message from DingTalk mobile client
  4. Observe: messages arrive but are silently dropped or time out

Fixes Applied

All six bugs have been fixed locally. The changes ensure:

  • Stream client starts correctly with await
  • SDK thread is not blocked by long-running agent processing
  • CallbackMessage fields are properly extracted via data dict
  • Human-readable staff IDs are used for authorization

Happy to submit a PR if the fixes look good.
