feat(qwen3-tts-cpp): normalize request language for flexible matching #10174

Merged
mudler merged 1 commit into master from worktree-feat+qwen3-tts-language
Jun 4, 2026

@localai-bot

Collaborator

What

The qwen3-tts.cpp backend already honored the request language field, but only via an exact-match lowercase two-letter code lookup in the C++ language_to_id table. Anything else (en-US, EN, en_US, english, ...) silently fell back to English, and the behavior was undocumented.
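To make the old behavior concrete, here is a minimal Go emulation of an exact-match lookup with an English fallback. It is only a sketch: the real table lives in C++ and maps codes to token IDs, which are not shown in this PR, so `supported` and `resolveExact` are hypothetical names and the four entries are just the codes mentioned in this description.

```go
package main

import "fmt"

// supported stands in for the keys of the C++ language_to_id table.
// Assumption: only codes named in this PR description are listed here;
// the real table and its token IDs are not reproduced.
var supported = map[string]bool{"en": true, "fr": true, "pt": true, "zh": true}

// resolveExact mimics an exact-match lookup: an empty value defaults to
// English, a known lowercase code is used as-is, and anything else falls
// back to English (fellBack reports that the fallback was taken).
func resolveExact(lang string) (code string, fellBack bool) {
	if lang == "" {
		return "en", false
	}
	if supported[lang] {
		return lang, false
	}
	return "en", true
}

func main() {
	for _, in := range []string{"fr", "FR", "fr-FR", "english", ""} {
		code, fellBack := resolveExact(in)
		fmt.Printf("%q -> %q (fallback: %v)\n", in, code, fellBack)
	}
}
```

Only the exact lowercase code survives; every other spelling of the same language ends up as English.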

This adds a small normalization step in the Go handler so the field accepts the forms callers actually send:

  • two-letter code, case-insensitive (fr, FR)
  • locale/region forms, region ignored (fr-FR, pt_BR, zh-Hans -> fr/pt/zh)
  • common English full names (english -> en, Portuguese -> pt)

Empty input still defaults to English (handled C++-side), and an unrecognized value is passed through in its normalized form, so the existing C++ "Unknown language ... defaulting to English" log still fires.
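The rules above can be sketched roughly as follows. This is not the code from the PR: the function name matches the description, but the body and the `languageAliases` table are assumptions, and only the two full-name aliases named here are listed (the real alias set is presumably larger).

```go
package main

import (
	"fmt"
	"strings"
)

// languageAliases maps lowercase English full names to two-letter codes.
// Assumption: only the aliases named in this description are included.
var languageAliases = map[string]string{
	"english":    "en",
	"portuguese": "pt",
}

// normalizeLanguage lowercases and trims the input, drops any region or
// script suffix after the first '-' or '_', and resolves full-name aliases.
// Empty input stays empty so the C++ side can apply its own default, and
// unknown values are returned in normalized form rather than rejected.
func normalizeLanguage(raw string) string {
	lang := strings.ToLower(strings.TrimSpace(raw))
	if lang == "" {
		return ""
	}
	if i := strings.IndexAny(lang, "-_"); i > 0 {
		lang = lang[:i]
	}
	if code, ok := languageAliases[lang]; ok {
		return code
	}
	return lang
}

func main() {
	inputs := []string{"fr", "FR", " fr-FR ", "pt_BR", "zh-Hans", "english", "Portuguese", "", "klingon"}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, normalizeLanguage(in))
	}
}
```

Returning unknown values unchanged, instead of mapping them to `en` in Go, keeps the fallback decision and its log message in one place on the C++ side.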

How

  • New normalizeLanguage() in backend/go/qwen3-tts-cpp/goqwen3ttscpp.go, applied in the TTS handler before the value reaches the C++ synthesize call.
  • The canonical codes deliberately match the existing C++ language_to_id table, so no C++ change and no new token IDs are needed.
  • Pure-Go Ginkgo/Gomega spec (language_test.go) covering codes, casing, whitespace, locale stripping, full-name aliases, empty, and unknown passthrough. No model/.so required, so it runs in CI.
  • New "Language" section in docs/content/features/text-to-audio.md documenting the field, the supported set, the accepted forms, and the default-on-unknown behavior.

Testing

  • go test ./backend/go/qwen3-tts-cpp/... -> green (19 specs)
  • go vet, gofmt clean

The C++ side is unchanged, so it was not rebuilt; verification covers the Go normalization layer.

The qwen3-tts.cpp backend honored the request `language` field only via exact lowercase two-letter codes in the C++ language_to_id table, silently defaulting to English for anything else (en-US, EN, english, ...).

Add normalizeLanguage() in the Go handler: lowercase + trim, strip the region/locale suffix (en-US, pt_BR, zh-Hans -> en/pt/zh), and resolve common English full names (english -> en). The canonical codes match the existing C++ table, so no C++ change is needed. Covered by a pure-Go Ginkgo spec. Also document the language field and accepted forms under the Qwen3-TTS docs.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Assisted-by: Claude:claude-opus-4-8 [Claude Code]
@mudler mudler merged commit 994063b into master Jun 4, 2026
63 of 64 checks passed
@mudler mudler deleted the worktree-feat+qwen3-tts-language branch June 4, 2026 15:26
@localai-bot localai-bot added the enhancement New feature or request label Jun 10, 2026