fix: handle non-ASCII filenames in Content-Disposition headers by martyniukyurii · Pull Request #79 · jamiepine/voicebox

martyniukyurii · 2026-02-17T08:59:02Z

Summary

Fixes export endpoints crashing with "'latin-1' codec can't encode characters" when the generated text or profile/story names contain non-ASCII characters (Cyrillic, Chinese, Arabic, etc.)
Adds a _safe_content_disposition() helper that builds standards-compliant headers with an ASCII-only filename fallback and an RFC 5987 filename*=UTF-8''... parameter for Unicode-capable clients
Affects 4 endpoints: export_profile, export_generation, export_generation_audio, export_story_audio

Root cause

Python 3's str.isalnum() returns True for Unicode alphanumeric characters, so Cyrillic, Chinese, and other non-ASCII letters pass through the filename filter. However, the ASGI stack (uvicorn/starlette) encodes HTTP response headers as latin-1, which cannot represent code points above U+00FF, so building the response fails with a UnicodeEncodeError.
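The failure can be reproduced outside the server with a few lines of Python. This is a minimal sketch: naive_sanitize is a hypothetical stand-in for the kind of isalnum()-based filter described above, not the code from the repository.

```python
# Hypothetical filter, assumed to resemble the original filename sanitizing.
def naive_sanitize(text: str) -> str:
    # str.isalnum() is Unicode-aware, so Cyrillic letters are kept.
    return "".join(c for c in text if c.isalnum() or c in " -_").strip()

name = naive_sanitize("Привет, мир!")  # punctuation dropped, letters kept

header_value = f'attachment; filename="{name}.wav"'

# Header values are encoded as latin-1; Cyrillic code points are above U+00FF.
try:
    header_value.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(name, encodable)
```

An ASCII-only input such as "hello world" passes through the same filter and encodes without error, which is why the bug only shows up with non-Latin text.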

Fix

Introduced _safe_content_disposition(disposition_type, filename) that:

Strips non-ASCII characters from the filename parameter (latin-1 safe fallback)
Adds a filename*=UTF-8''<percent-encoded> parameter per RFC 5987 so modern browsers can still display the original Unicode filename
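The two steps above could look roughly like the sketch below. It is not the merged implementation: the function name safe_content_disposition_sketch, the "download" fallback for an empty ASCII name, and the stripping of quotes and backslashes are all assumptions made for illustration.

```python
from urllib.parse import quote

def safe_content_disposition_sketch(disposition_type: str, filename: str) -> str:
    # Step 1: ASCII-only fallback, safe to encode as latin-1.
    ascii_name = filename.encode("ascii", "ignore").decode("ascii")
    # Assumption: also drop characters that would break the quoted-string.
    ascii_name = "".join(
        c for c in ascii_name if c not in '"\\' and c.isprintable()
    ).strip()
    if not ascii_name:
        ascii_name = "download"  # hypothetical default when nothing survives

    # Step 2: RFC 5987 extended parameter, UTF-8 bytes percent-encoded.
    encoded = quote(filename, safe="")

    return (
        f'{disposition_type}; filename="{ascii_name}"; '
        f"filename*=UTF-8''{encoded}"
    )

header = safe_content_disposition_sketch("attachment", "audio_Привет.wav")
header.encode("latin-1")  # no longer raises: the header is pure ASCII
print(header)
```

Clients that understand filename* can show the original name, while older clients fall back to the plain filename parameter.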

Test plan

Generate speech with non-Latin text (e.g. Russian, Chinese)
Verify "Export Audio" downloads successfully (was returning 500)
Verify "Export Package" downloads successfully (was returning 500)
Verify ASCII-only text exports still work as before

Fixes #68

The export endpoints (export-audio, export generation, export profile, export story) crash with `'latin-1' codec can't encode characters` when the generated text or profile/story name contains non-ASCII characters (e.g. Cyrillic, Chinese, Arabic).

Root cause: Python's `str.isalnum()` passes Unicode letters through to the filename, but HTTP headers are encoded as latin-1 by the ASGI server, which cannot represent characters outside the 0-255 range.

Fix: introduce `_safe_content_disposition()` helper that builds a standards-compliant header with an ASCII-only `filename` fallback and an RFC 5987 `filename*=UTF-8''...` parameter for Unicode-capable clients.

Fixes jamiepine#68

Co-authored-by: Cursor <cursoragent@cursor.com>

jamiepine merged commit cc298fe into jamiepine:main Feb 23, 2026

jamiepine merged 1 commit into jamiepine:main from martyniukyurii:fix/unicode-content-disposition