feat(wecom): add voice message send support via media upload by chqchshj · Pull Request #12521 · NousResearch/hermes-agent

chqchshj · 2026-04-19T10:50:13Z

Summary

Add voice message send support for the WeCom (企业微信) callback adapter.

When the agent generates a TTS voice reply, the adapter now uploads the audio file to WeCom's temporary media store and sends it as a native voice message, instead of falling back to text.
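The upload-then-send flow can be sketched as two request shapes. This is a minimal illustration, not the code in this PR: the endpoint paths and the 2MB limit come from the description, while the API host, the query parameters, the JSON field names, and all function names are assumptions based on WeCom's public API and should be checked against its documentation.

```python
import os
from urllib.parse import urlencode

# Assumed public API host; the PR text only names the endpoint paths.
WECOM_API = "https://qyapi.weixin.qq.com"
VOICE_MAX_BYTES = 2 * 1024 * 1024  # 2MB voice limit mentioned in this PR


def upload_url(access_token: str, media_type: str = "voice") -> str:
    """URL for the multipart POST that uploads the audio file."""
    query = urlencode({"access_token": access_token, "type": media_type})
    return f"{WECOM_API}/cgi-bin/media/upload?{query}"


def within_voice_limit(path: str) -> bool:
    """True when the file is small enough to upload as a voice clip."""
    return os.path.getsize(path) <= VOICE_MAX_BYTES


def voice_message_payload(to_user: str, agent_id: int, media_id: str) -> dict:
    """JSON body for /cgi-bin/message/send, using the media_id from the upload."""
    return {
        "touser": to_user,
        "msgtype": "voice",
        "agentid": agent_id,
        "voice": {"media_id": media_id},
    }
```

The `media_id` returned by the upload is the only link between the two calls, so a failed upload leaves nothing to send and the adapter can fall back to text.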

Changes

**send_voice()**: Upload audio and send it as a WeCom voice message via /cgi-bin/message/send
**_upload_media()**: Upload files to the WeCom temporary media store via /cgi-bin/media/upload (respects the 2MB voice limit)
**_convert_to_amr()**: ffmpeg-based audio conversion to AMR format (the WeCom voice API requires AMR 8kHz mono)
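As a rough sketch of the AMR conversion step, the ffmpeg invocation could look like the following. The helper names are hypothetical and the exact flags used in this PR are not shown in the description. The `-b:a 12.2k` bitrate is an assumption (it is one of the AMR-NB modes), and the command only works if the local ffmpeg build includes an AMR-NB encoder.

```python
import shutil
from pathlib import Path
from typing import List


def needs_conversion(src: str) -> bool:
    """Files that are already .amr can be uploaded without conversion."""
    return Path(src).suffix.lower() != ".amr"


def ffmpeg_available() -> bool:
    """The conversion requires ffmpeg on PATH."""
    return shutil.which("ffmpeg") is not None


def amr_command(src: str, dst: str) -> List[str]:
    """ffmpeg arguments for 8kHz mono AMR output (bitrate is an assumption)."""
    return [
        "ffmpeg", "-y",      # overwrite the temporary output if it exists
        "-i", src,
        "-ar", "8000",       # 8kHz sample rate
        "-ac", "1",          # mono
        "-b:a", "12.2k",     # assumed AMR-NB bitrate mode
        dst,
    ]
```

Passing the arguments as a list to `subprocess.run` avoids shell quoting problems with file names produced by the TTS step.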

Design decisions

Graceful fallback: if voice upload or AMR conversion fails, falls back to the parent class send_voice (text-based reply)
Uses existing _resolve_app_for_chat for multi-app support
Auto-cleanup of temporary converted files
Requires ffmpeg on PATH for non-AMR audio files (logged warning if missing)
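The fallback and cleanup decisions above can be illustrated with a small pattern. This is a sketch under stated assumptions, not the adapter's code: `convert`, `send_native`, and `send_text_fallback` are hypothetical callables standing in for the AMR conversion, the upload-and-send path, and the parent class's text reply.

```python
import asyncio
import os
import tempfile
from typing import Awaitable, Callable


async def send_voice_with_fallback(
    audio_path: str,
    convert: Callable[[str], str],
    send_native: Callable[[str], Awaitable[None]],
    send_text_fallback: Callable[[], Awaitable[None]],
) -> str:
    """Try the native voice path; on any failure, fall back to text."""
    converted = None
    try:
        converted = convert(audio_path)
        await send_native(converted)
        return "voice"
    except Exception:
        await send_text_fallback()
        return "text"
    finally:
        # Remove the temporary converted file, but never the caller's original.
        if converted and converted != audio_path and os.path.exists(converted):
            os.remove(converted)
```

Putting the cleanup in `finally` means the temporary file is removed whether the voice message was sent or the text fallback was used.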

Test plan

All existing WeCom tests pass (54 passed, 3 skipped)
All CLI/quick_commands tests pass (124 passed)
Python import check passes

- Add send_voice() to upload audio and send as WeCom voice message
- Add _upload_media() for WeCom temporary media store upload
- Add _convert_to_amr() for ffmpeg-based audio conversion (WeCom requires AMR 8kHz mono)
- Graceful fallback to text when voice upload/convert fails
- Auto-cleanup of temporary converted files

- Handle incoming voice messages in _build_event (previously dropped)
- Download voice audio from WeCom media store via _download_voice_media
- Cache locally using cache_audio_from_bytes for STT transcription pipeline
- Set MessageType.VOICE + media_urls so gateway auto-transcribes
- Make _build_event async (requires media download)
- Update tests to use async/await

WeCom voice messages are in AMR format which most STT engines don't support. Convert to WAV (16kHz mono) via ffmpeg before caching for transcription.
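For the inbound direction, the AMR-to-WAV step might look like this. It is a minimal sketch: the function names are hypothetical, and returning the original path when ffmpeg is missing or fails is an assumption about the desired behavior, not something stated in the commit. Decoding also depends on the ffmpeg build supporting AMR input.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def wav_command(src: str, dst: str) -> List[str]:
    """ffmpeg arguments for 16kHz mono WAV output."""
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]


def amr_to_wav(src: str) -> str:
    """Convert src to WAV next to it; return src unchanged if that is not possible."""
    if shutil.which("ffmpeg") is None:
        return src  # assumed behavior: hand the original file to the STT step
    dst = str(Path(src).with_suffix(".wav"))
    result = subprocess.run(wav_command(src, dst), capture_output=True)
    return dst if result.returncode == 0 else src
```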

alt-glitch · 2026-04-23T11:10:05Z

Overlaps with #8312 which also adds voice message support for the WeCom callback adapter. Maintainers should decide which approach to merge.

chqchshj · 2026-05-16T01:04:44Z

Closing this out for now from my side to reduce stale/open Hermes Agent PRs. Thanks for the review and context.

chqchshj added 2 commits April 19, 2026 18:48

Julientalbot mentioned this pull request Apr 19, 2026

feat(xai): add x_search tool — search X via xAI Responses API #12528

Closed

chqchshj added 2 commits April 19, 2026 19:55

fix(wecom): convert AMR voice to WAV for STT compatibility

8a2554e

fix(attribution): add chqchshj@outlook.com to AUTHOR_MAP

489b840

alt-glitch added the type/feature, P2, platform/wecom, and comp/gateway labels Apr 23, 2026

chqchshj closed this May 16, 2026

chqchshj wants to merge 4 commits into NousResearch:main from chqchshj:feat/wecom-voice-send
