
feat(wecom): add image and voice message support for callback adapter#8312

Closed
chqchshj wants to merge 2 commits into NousResearch:main from chqchshj:feat/wecom-image-voice-support

@chqchshj

Changes

wecom_callback.py

  • Image support: Download images via media/get API with PicUrl fallback, auto-detect format (PNG/JPG/WebP)
  • Voice support: Download AMR audio via media/get API, cache locally for STT processing
  • _build_event: Converted from sync to async to support media downloads
  • msg_type whitelist: Expanded from {text, event} to {text, event, image, voice}
  • Added _detect_image_ext helper for magic-byte format detection
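
For readers unfamiliar with magic-byte detection, here is a minimal sketch of what a helper like _detect_image_ext might do for the three formats listed above. The function name, the default extension, and the exact return values are assumptions for illustration; the PR's actual implementation is not shown on this page.

```python
def detect_image_ext(data: bytes, default: str = ".jpg") -> str:
    """Guess an image file extension from the leading bytes of the payload.

    Hypothetical stand-in for the PR's _detect_image_ext helper.
    """
    # PNG files start with a fixed 8-byte signature.
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    # JPEG files start with the SOI marker FF D8 followed by another FF.
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return ".webp"
    # Unknown signature: fall back to the caller-supplied default (an assumption).
    return default
```

Checking the payload rather than trusting a URL suffix or Content-Type header is useful here because a media download endpoint may not reliably report the image format.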

transcription_tools.py

  • Added .amr and .silk to SUPPORTED_FORMATS (WeChat native audio formats)
  • Added bidirectional STT fallback: if local is the primary provider and fails, Groq is tried; if Groq is the primary provider and fails, local is tried
  • Fallback is automatic and transparent; the provider is logged as, e.g., groq (fallback)
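
The fallback behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in transcription_tools.py: the function name, the provider registry, and the return shape are all hypothetical, and only the "groq (fallback)" label format comes from the description.

```python
from typing import Callable, Dict, Optional, Tuple

# A transcriber takes an audio file path and returns the transcript text.
Transcriber = Callable[[str], str]

# Each provider falls back to the other one (the bidirectional part).
FALLBACK = {"local": "groq", "groq": "local"}


def transcribe_with_fallback(
    path: str, primary: str, providers: Dict[str, Transcriber]
) -> Tuple[str, str]:
    """Try the primary provider, then its counterpart; return (text, label)."""
    last_error: Optional[Exception] = None
    for attempt, name in enumerate([primary, FALLBACK[primary]]):
        transcriber = providers.get(name)
        if transcriber is None:
            continue  # provider not configured, skip it
        try:
            text = transcriber(path)
        except Exception as exc:  # any provider failure triggers the fallback
            last_error = exc
            continue
        # Mark the provider so logs show when the fallback path was taken.
        label = name if attempt == 0 else f"{name} (fallback)"
        return text, label
    raise RuntimeError(f"all STT providers failed for {path}") from last_error
```

With a failing local transcriber and a working Groq one, calling this with primary="local" would return the Groq transcript labelled "groq (fallback)".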

Motivation

The WeCom callback adapter currently handles only text messages; image and voice messages are silently ignored. This PR adds image and voice support, with automatic STT transcription for voice messages.

chqchshj added 2 commits April 12, 2026 18:34
- Expand msg_type whitelist to include 'image' and 'voice'
- Convert _build_event to async to support media downloads
- Add _download_image: fetch via media/get API with PicUrl fallback
- Add _download_voice: fetch AMR audio via media/get API
- Add _detect_image_ext helper for magic-byte format detection
- Add .amr and .silk to SUPPORTED_FORMATS for WeChat audio
- Add bidirectional STT fallback: local <-> Groq
The _build_event method was changed to async but the tests were not
updated to use await, causing 'coroutine object has no attribute source'
failures in CI.
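
To illustrate the failure mode in the second commit message, here is a small self-contained sketch. FakeAdapter and its fields are hypothetical stand-ins, not the real adapter or the repository's test code; only the method name _build_event and the source attribute come from the commit message.

```python
import asyncio
from types import SimpleNamespace


class FakeAdapter:
    """Hypothetical stand-in for the callback adapter, for illustration only."""

    async def _build_event(self, fields: dict) -> SimpleNamespace:
        # The real method would download media here; this sketch only maps fields.
        return SimpleNamespace(
            source=fields.get("FromUserName"), text=fields.get("Content")
        )


def demo_unawaited_call() -> None:
    # Calling an async method without awaiting it returns a coroutine object,
    # which has no `source` attribute. That is the CI failure described above.
    coro = FakeAdapter()._build_event({"FromUserName": "user1"})
    assert asyncio.iscoroutine(coro)
    assert not hasattr(coro, "source")
    coro.close()  # avoid a "coroutine was never awaited" warning


def test_build_event_reads_source() -> None:
    # The fix: drive the coroutine to completion before inspecting the event.
    event = asyncio.run(
        FakeAdapter()._build_event({"FromUserName": "user1", "Content": "hi"})
    )
    assert event.source == "user1"
    assert event.text == "hi"
```

In a pytest suite that already uses an async plugin, the equivalent fix would be to make the test itself async and await the call; asyncio.run is used here only to keep the sketch dependency-free.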
@alt-glitch added the following labels on Apr 28, 2026: type/feature (New feature or request), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), platform/wecom (WeCom / WeChat Work adapter), tool/tts (Text-to-speech and transcription)
@chqchshj

Closing this out for now from my side to reduce stale/open Hermes Agent PRs. Thanks for the review and context.

@chqchshj chqchshj closed this May 16, 2026