fix(discord): transcribe native voice notes #28918
Closed
helix4u wants to merge 1 commit into
Conversation
teknium1
added a commit
that referenced
this pull request
May 20, 2026
Sibling fix to PR #28918 (Discord voice notes). DingTalk's rich-text "voice" item type is its native voice-message format, but the adapter was routing it to MessageType.AUDIO — which gateway/run.py:7605 skips for STT. The docs claim every voice-capable platform auto-transcribes, so this brings DingTalk in line. Generic audio uploads (mapped to "file" by DINGTALK_TYPE_MAPPING) are unchanged — they were already classified as DOCUMENT, not AUDIO. Adds tests/gateway/test_dingtalk.py::TestExtractMedia covering both the voice path and the audio-passthrough invariant.
Contributor
Merged via #28993 — your commit was cherry-picked onto current main with your authorship preserved (commit 448a3f9, rebase-merge). Thanks @helix4u! Audit of the other gateway platforms turned up DingTalk with the same bug (rich-text |
6 tasks
EloquentBrush0x
added a commit
to EloquentBrush0x/hermes-agent
that referenced
this pull request
May 21, 2026
Feishu's "audio" message type is exclusively for in-app voice recordings, but _resolve_normalized_message_type was delegating to _resolve_media_message_type which maps audio/* MIME types to MessageType.AUDIO. gateway/run.py:7605 skips STT for AUDIO, so every voice note sent on Feishu was silently dropped instead of transcribed. Generic audio file uploads in Feishu travel through message_type="file" → preferred_message_type="document", never through the "audio" branch, so returning MessageType.VOICE here is unambiguous. Sibling fix to PR NousResearch#28922 (DingTalk) and PR NousResearch#28918 (Discord) which corrected the same AUDIO-vs-VOICE misclassification on those platforms. Update the existing test to assert MessageType.VOICE and rename it to reflect the invariant it actually guards.
What does this PR do?
Fixes Discord native voice-note handling so inbound voice notes are classified as MessageType.VOICE instead of generic MessageType.AUDIO.
Discord.py exposes native voice-note metadata through Attachment.is_voice_message(). Hermes was only checking content_type.startswith("audio/"), which routed Discord voice notes into the plain audio attachment path. The gateway intentionally skips automatic STT for MessageType.AUDIO, so native Discord voice notes were cached as files instead of being automatically transcribed.
This keeps ordinary audio uploads as MessageType.AUDIO and only marks native Discord voice-message attachments as MessageType.VOICE.
Related Issue
Support report: Discord voice notes stopped auto-transcribing after update.
Fixes #
Type of Change
Changes Made
gateway/platforms/discord.py: detect native Discord voice-note attachments with attachment.is_voice_message() before the generic audio/* branch.
tests/gateway/test_discord_attachment_download.py: add regression coverage proving native voice notes become MessageType.VOICE while ordinary audio uploads remain MessageType.AUDIO.
How to Test
MessageType.VOICE event for the attachment.
Targeted tests run locally:
python -m pytest tests/gateway/test_discord_attachment_download.py tests/gateway/test_telegram_audio_vs_voice.py -q: 19 passed
python -m pytest tests/gateway/test_stt_config.py -q: 6 passed
Full suite run locally:
scripts/run_tests.sh: 17 failed, 24561 passed, 54 skipped, 250 warnings in 646.20s
Full-suite failures observed:
tests/gateway/test_api_server.py::TestAdapterInit::test_default_config
tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E::test_blocking_approval_approve_once
tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E::test_blocking_approval_deny
tests/gateway/test_config.py::TestLoadGatewayConfig::test_bridges_quoted_false_platform_enabled_from_config_yaml
tests/gateway/test_discord_bot_filter.py::TestDiscordBotFilter::test_default_is_none
tests/gateway/test_restart_resume_pending.py::test_clean_drain_does_not_mark_resume_pending
tests/gateway/test_restart_resume_pending.py::test_drain_timeout_only_marks_still_running_sessions
tests/gateway/test_runner_startup_failures.py::test_start_gateway_replace_force_uses_terminate_pid
tests/hermes_cli/test_gateway_service.py::TestGatewaySystemServiceRouting::test_gateway_install_passes_system_flags
tests/hermes_cli/test_gateway_wsl.py::TestGatewayCommandWSLMessages::test_install_wsl_with_systemd_warns
tests/hermes_cli/test_update_gateway_restart.py::TestLaunchdPlistPath::test_plist_path_starts_with_venv_bin
tests/tools/test_file_operations.py::TestGitBaselineCheck::test_git_not_available_returns_none
tests/tools/test_file_operations.py::TestGitBaselineCheck::test_not_in_git_repo_returns_none
tests/tools/test_file_operations.py::TestGitBaselineCheck::test_clean_repo_returns_none
tests/tools/test_file_operations.py::TestGitBaselineCheck::test_dirty_repo_returns_warning
tests/tools/test_file_operations.py::TestGitBaselineCheck::test_write_file_includes_git_warning_when_dirty
tests/tools/test_tirith_security.py::TestDiskFailureMarker::test_cosign_missing_marker_clears_when_cosign_appears
The Discord voice-note regression tests added by this PR pass in the targeted run.
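For readers who want the ordering issue in one place, here is a minimal sketch of the classification this PR describes: check the native voice-note flag first, then fall back to the generic audio/* check. It is not the code in gateway/platforms/discord.py. The MessageType enum below is a stand-in, FakeAttachment is a hypothetical test double exposing only content_type and is_voice_message(), and classify_attachment and should_auto_transcribe are illustrative names. The DOCUMENT fallback is a simplification; the real adapter presumably handles other media types as well.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MessageType(Enum):
    # Stand-in for the gateway's enum; only the members needed here.
    VOICE = "voice"
    AUDIO = "audio"
    DOCUMENT = "document"


@dataclass
class FakeAttachment:
    # Hypothetical test double for a discord.py Attachment.
    content_type: Optional[str]
    voice: bool = False

    def is_voice_message(self) -> bool:
        return self.voice


def classify_attachment(attachment) -> MessageType:
    # Native voice notes must be detected before the generic audio/* branch,
    # because they also carry an audio/* content type.
    is_voice = getattr(attachment, "is_voice_message", None)
    if callable(is_voice) and is_voice():
        return MessageType.VOICE
    content_type = attachment.content_type or ""
    if content_type.startswith("audio/"):
        return MessageType.AUDIO
    # Simplified fallback, assumed for illustration only.
    return MessageType.DOCUMENT


def should_auto_transcribe(message_type: MessageType) -> bool:
    # Assumed shape of the gateway rule: STT runs for VOICE, not for AUDIO.
    return message_type is MessageType.VOICE


voice_note = FakeAttachment("audio/ogg", voice=True)
music_upload = FakeAttachment("audio/mpeg")
print(classify_attachment(voice_note), classify_attachment(music_upload))
```

With the check order reversed, both attachments would land in the audio/* branch and the voice note would never reach STT, which matches the regression reported here.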
Checklist
Code
fix(scope):, feat(scope):, etc.)
pytest tests/ -q and all tests pass
Documentation & Housekeeping
docs/, docstrings), or N/A
cli-config.yaml.example if I added/changed config keys, or N/A
CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A
For New Skills
N/A
Screenshots / Logs
Targeted test output:
tests/gateway/test_discord_attachment_download.py tests/gateway/test_telegram_audio_vs_voice.py: 19 passed
tests/gateway/test_stt_config.py: 6 passed
Full suite output summary:
scripts/run_tests.sh: 17 failed, 24561 passed, 54 skipped, 250 warnings in 646.20s