feat(dingtalk): add QR code scan authorization for setup wizard #9610

Closed
audanye-sudo wants to merge 1 commit into NousResearch:main from audanye-sudo:feat/dingtalk-qr-auth

@audanye-sudo

Summary

Adds a QR code scan authorization flow to the DingTalk setup wizard (hermes gateway setup), eliminating the need for users to manually navigate the DingTalk Developer Console to obtain Client ID and Client Secret. This dramatically simplifies DingTalk onboarding — from a multi-step developer console workflow to a single QR scan.

Motivation

Currently, configuring DingTalk requires users to:

  1. Log in to the DingTalk Developer Console (open.dingtalk.com)
  2. Create an application manually
  3. Enable the Robot capability
  4. Copy the Client ID and Client Secret
  5. Paste them into the setup wizard

This is error-prone (wrong credentials, missing robot capability) and intimidating for non-developer users. Other DingTalk ecosystem tools (e.g., dingtalk-openclaw-connector) already support device-flow registration — this PR brings parity.

How It Works

The implementation follows the DingTalk device-flow registration protocol (similar to the OAuth device flow):

┌─────────────────────────────────────────────────────────┐
│  1. POST /app/registration/init                         │
│     → nonce                                             │
│                                                         │
│  2. POST /app/registration/begin                        │
│     → device_code + verification_uri_complete           │
│                                                         │
│  3. Render QR code in terminal                          │
│     User scans with DingTalk app → authorizes           │
│                                                         │
│  4. POST /app/registration/poll (loop)                  │
│     → WAITING... WAITING... SUCCESS                     │
│     → client_id + client_secret                         │
│                                                         │
│  5. Auto-save to .env:                                  │
│     DINGTALK_CLIENT_ID=xxx                              │
│     DINGTALK_CLIENT_SECRET=xxx                          │
│     DINGTALK_ALLOW_ALL_USERS=true                       │
└─────────────────────────────────────────────────────────┘
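The flow in the diagram can be sketched roughly as below. This is a minimal illustration, not the code in hermes_cli/dingtalk_auth.py: the `post` transport, the request payload shapes, the `status` field name, and the fixed 2-second sleep are all assumptions. Only the endpoint paths and the `nonce`, `device_code`, `verification_uri_complete`, `client_id`, and `client_secret` names come from the diagram.

```python
import time
from typing import Callable, Dict

# Hypothetical transport: takes an endpoint path and a payload and returns the
# decoded JSON response. Real code would wrap an HTTP client here.
Post = Callable[[str, dict], dict]


def register_via_device_flow(post: Post, sleep=time.sleep,
                             max_polls: int = 3600) -> Dict[str, str]:
    """Run init -> begin -> poll and return the issued credentials."""
    # Step 1: obtain a nonce.
    nonce = post("/app/registration/init", {})["nonce"]

    # Step 2: exchange the nonce for a device code and a scannable URL.
    begin = post("/app/registration/begin", {"nonce": nonce})
    device_code = begin["device_code"]
    # Step 3 would render this URL as a QR code; the sketch just prints it.
    print("Scan or open:", begin["verification_uri_complete"])

    # Step 4: poll until the user authorizes in the DingTalk app.
    for _ in range(max_polls):
        result = post("/app/registration/poll", {"device_code": device_code})
        if result.get("status") == "SUCCESS":  # assumed field name
            return {
                "DINGTALK_CLIENT_ID": result["client_id"],
                "DINGTALK_CLIENT_SECRET": result["client_secret"],
            }
        sleep(2)  # assumed fixed interval, for the sketch only
    raise TimeoutError("authorization was not completed in time")
```

Injecting `post` and `sleep` keeps the sketch testable without network access; a real caller would pass a function that sends the request to the DingTalk API.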

User Experience

When running hermes gateway setup and selecting DingTalk:

  ─── 🔔 DingTalk Setup ───

  Choose setup method
  [0] QR Code Scan (Recommended, auto-obtain Client ID and Client Secret)
  [1] Manual Input (Client ID and Client Secret)
  > 0

  Initializing DingTalk device authorization...

  Please scan the QR code below with DingTalk to authorize:

    ▄▄▄▄▄▄▄ ▄▄▄▄▄ ▄▄▄▄▄▄▄
    █ ▄▄▄ █ █▀▄▀█ █ ▄▄▄ █
    █ ███ █ ▄▀█▀▄ █ ███ █
    ...

  Or open this link manually: https://oapi.dingtalk.com/...

  Waiting for QR scan authorization... (timeout: 2 hours)
  ......
  ✓ QR scan authorization successful!
  ✓ Client ID:     dingxxxxxxxx
  ✓ Client Secret: xxxxxxxx********
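The final lines of that session (masked secret, credentials written to .env) could be produced by helpers along these lines. Both `mask_secret` and `upsert_env` are hypothetical names for illustration, and the 8-visible-character masking is inferred from the sample output rather than confirmed by the PR.

```python
from pathlib import Path


def mask_secret(secret: str, visible: int = 8) -> str:
    """Show the first `visible` characters and star out the rest."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


def upsert_env(path: Path, values: dict) -> None:
    """Set KEY=value lines in a .env file, replacing existing keys in place."""
    lines = path.read_text().splitlines() if path.exists() else []
    remaining = dict(values)
    updated = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        if key in remaining:
            # Overwrite the existing assignment, keeping its position.
            updated.append(f"{key}={remaining.pop(key)}")
        else:
            updated.append(line)
    # Append any keys that were not already present.
    updated.extend(f"{k}={v}" for k, v in remaining.items())
    path.write_text("\n".join(updated) + "\n")
```

Replacing keys in place rather than appending means a reconfiguration run does not leave stale duplicate entries behind.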

Changes

| File | Change |
| --- | --- |
| hermes_cli/dingtalk_auth.py | New: device-flow registration module (init/begin/poll + QR rendering) |
| hermes_cli/gateway.py | DingTalk setup now offers QR scan vs manual input choice; wired into platform dispatch |
| gateway/config.py | Added DINGTALK_CLIENT_ID / DINGTALK_CLIENT_SECRET env var auto-detection (parity with Feishu/Telegram/Discord) |

Design Decisions

  1. QR library auto-install: If qrcode is not installed, the module tries uv pip install then pip install automatically. Falls back to showing the plain URL if install fails.
  2. Graceful degradation: If the QR auth module fails to import or the flow fails at any step, the wizard falls back to manual input — no dead ends.
  3. 2-hour timeout with polling: Matches the DingTalk API expires_in default. Polling interval respects the server-suggested value (minimum 2s).
  4. Transient error retry window: Network blips during polling get a 120s retry window before giving up — handles mobile network flakiness during QR scan.
  5. DINGTALK_ALLOW_ALL_USERS=true by default: Both QR and manual paths set this for convenience, since most users want all org members to be able to interact with the bot.
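Decisions 3 and 4 combine into a polling loop roughly like the one below. This is a sketch under assumptions: `poll_once` is a hypothetical callable returning a dict with a `status` key and raising `OSError` on network failure, and the retry window is measured from the first error in an unbroken run of failures. The PR may define these details differently.

```python
import time

MIN_INTERVAL = 2     # seconds; floor applied to the server-suggested interval
RETRY_WINDOW = 120   # seconds of consecutive transient errors tolerated


def poll_until_authorized(poll_once, interval, expires_in=7200,
                          clock=time.monotonic, sleep=time.sleep):
    """Call poll_once() until it reports SUCCESS, tolerating short outages."""
    interval = max(interval, MIN_INTERVAL)
    deadline = clock() + expires_in
    first_error_at = None
    while clock() < deadline:
        try:
            result = poll_once()
        except OSError:
            now = clock()
            if first_error_at is None:
                first_error_at = now          # start of an error streak
            elif now - first_error_at > RETRY_WINDOW:
                raise                         # outage outlasted the window
            sleep(interval)
            continue
        first_error_at = None  # any successful round-trip resets the window
        if result.get("status") == "SUCCESS":
            return result
        sleep(interval)
    raise TimeoutError("QR authorization expired")
```

The clock and sleep functions are parameters so the timing behavior can be exercised with a fake clock instead of waiting in real time.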

Commits

| Commit | Scope |
| --- | --- |
| feat(dingtalk): add QR code scan authorization for setup wizard | hermes_cli/dingtalk_auth.py, hermes_cli/gateway.py, gateway/config.py |

Test Plan

  • QR scan flow: init → begin → render QR → scan with DingTalk → poll succeeds → credentials saved to .env
  • Manual input fallback: selecting option [1] behaves identically to the previous flow
  • QR auth failure: network error during init → prints error, falls back to manual
  • QR auth timeout: waited past expiry → clean error message, falls back to manual
  • qrcode library missing: auto-installs via uv pip, renders QR successfully
  • qrcode install fails: shows plain URL link, user can open in browser
  • Reconfiguration: existing DINGTALK_CLIENT_ID detected → prompts "Reconfigure?" → respects choice
  • gateway/config.py env var detection: DINGTALK_CLIENT_ID + DINGTALK_CLIENT_SECRET in env → platform auto-enabled
  • DINGTALK_HOME_CHANNEL env var → HomeChannel correctly configured
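The last two test cases describe env-var detection. A minimal sketch of that check might look like the following; the `DingTalkConfig` dataclass and `detect_dingtalk` function are hypothetical and do not reflect the actual structures in gateway/config.py. Only the three environment variable names come from the PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class DingTalkConfig:  # hypothetical shape, for illustration only
    client_id: str
    client_secret: str
    home_channel: Optional[str] = None


def detect_dingtalk(env: Mapping[str, str] = os.environ) -> Optional[DingTalkConfig]:
    """Return a config when both credentials are set, otherwise None."""
    client_id = env.get("DINGTALK_CLIENT_ID", "").strip()
    client_secret = env.get("DINGTALK_CLIENT_SECRET", "").strip()
    if not (client_id and client_secret):
        return None  # platform stays disabled unless both values are present
    home = env.get("DINGTALK_HOME_CHANNEL") or None
    return DingTalkConfig(client_id, client_secret, home)
```

Requiring both variables avoids auto-enabling the platform with a half-configured credential pair.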

Risk Assessment

Low risk — purely additive feature:

  • Existing manual setup flow is untouched (option [1] calls _setup_standard_platform as before)
  • New module dingtalk_auth.py is only imported when user selects QR scan
  • gateway/config.py change adds env var detection that was previously missing (all other platforms already had this)
  • No new required dependencies (qrcode is optional, auto-installed only when needed)
  • No changes to the DingTalk runtime adapter — this only affects the one-time setup wizard
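The lazy import plus manual fallback described in these points could be structured as in this sketch. The `setup_dingtalk` wrapper and the `run_qr_flow` entry point are hypothetical names; only the module path hermes_cli.dingtalk_auth is taken from the PR.

```python
import importlib


def setup_dingtalk(choice, manual_setup, module_name="hermes_cli.dingtalk_auth"):
    """Try the QR flow when chosen; fall back to manual input on any failure."""
    if choice == 0:
        try:
            # Imported lazily so the manual path never loads the QR module.
            auth = importlib.import_module(module_name)
            return auth.run_qr_flow()  # hypothetical entry point
        except Exception as exc:
            # Covers import errors as well as failures inside the flow.
            print(f"QR authorization failed ({exc}); falling back to manual input.")
    return manual_setup()
```

Because the import happens inside the `try`, a missing or broken QR module takes the same fallback route as a failed scan.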

Dependencies: This feature is independent of the stream adapter fix PRs and can be merged in any order.

Implement DingTalk device-flow registration that allows users to configure
the DingTalk platform by scanning a QR code with their DingTalk app instead
of manually copying Client ID and Client Secret from the developer console.

New module hermes_cli/dingtalk_auth.py implements the 3-step registration
flow (init → begin → poll) with terminal QR code rendering. The setup wizard
in hermes_cli/gateway.py now offers QR scan as the recommended method with
manual input as fallback. Also adds DINGTALK env var auto-detection in
gateway/config.py for consistency with other platforms.

@meng93

meng93 commented Apr 14, 2026

Contributor

I need this func!

@audanye-sudo

Author

Last one of the DingTalk trilogy (#9608, #9609, this one). The first two fix blocking bugs; this one improves the setup experience.

The problem with current DingTalk setup

Right now, configuring DingTalk in hermes gateway setup requires users to:

  1. Go to the DingTalk Developer Console (open.dingtalk.com)
  2. Create a new application
  3. Enable the "Robot" capability
  4. Find and copy the Client ID
  5. Find and copy the Client Secret
  6. Paste them into the CLI

This is a six-step manual process through a Chinese-language developer portal. For non-Chinese speakers it's basically impossible without a tutorial. Even for Chinese users, it's tedious and error-prone: wrong credentials, forgetting to enable the robot capability, etc.

Compare this to how other DingTalk ecosystem tools handle it: they use a QR code scan flow (similar to OAuth device flow). You scan a code with your DingTalk app, tap "authorize," and you're done. Credentials are obtained automatically.

What this PR adds

A QR code scan option in the setup wizard:

Choose setup method
[0] QR Code Scan (Recommended)
[1] Manual Input (Client ID and Client Secret)

Option 0 renders a QR code right in the terminal. Scan it with DingTalk, approve, and the Client ID + Secret are automatically fetched and saved. The whole thing takes about 10 seconds vs. the 5-minute manual flow.
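For reference, terminal rendering with the optional qrcode package can be sketched as below. `render_qr` is a hypothetical helper, and this version skips the auto-install step and goes straight to the plain-URL fallback when the package is missing.

```python
import io


def render_qr(url: str) -> str:
    """Return an ASCII QR code for url, or the plain URL if qrcode is unavailable."""
    try:
        import qrcode  # optional third-party dependency
    except ImportError:
        return f"Open this link manually: {url}"
    qr = qrcode.QRCode(border=1)
    qr.add_data(url)
    qr.make(fit=True)  # pick the smallest QR version that fits the data
    buffer = io.StringIO()
    qr.print_ascii(out=buffer, invert=True)
    return buffer.getvalue()
```

Returning a string instead of printing directly lets the caller decide where the output goes and keeps the fallback path easy to check.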

If the QR flow fails for any reason (no internet, user cancels, library missing), it gracefully falls back to manual input. No dead ends.

Relationship to the other two PRs

This PR is independent of the bug fixes in #9608 and #9609. You can merge them in any order. But obviously, all three together give users the best experience: easy setup (this PR) + actually working message handling (#9608 + #9609).

@RuckVibeCodes RuckVibeCodes left a comment

[gus-first-pass] feat(dingtalk): add QR code scan authorization for setup wizard - Clear feature addition, no issues found.

@teknium1

Contributor

Closing: your QR auth PR ships the same three-step device flow as #8345 (meng93, submitted Apr 12, before this PR), which has now been merged via salvage PR #11574.

Credit goes to meng93 as the original author since their submission was first and the implementations are line-for-line identical. Thanks for your effort and the independent validation of the approach.
