fix: handle non-ASCII filenames in Content-Disposition headers #79

Merged
jamiepine merged 1 commit into jamiepine:main from martyniukyurii:fix/unicode-content-disposition on Feb 23, 2026

@martyniukyurii

Summary

  • Fixes export endpoints crashing with `'latin-1' codec can't encode characters` when generated text or profile/story names contain non-ASCII characters (Cyrillic, Chinese, Arabic, etc.)
  • Adds a `_safe_content_disposition()` helper that builds standards-compliant headers with an ASCII-only `filename` fallback and an RFC 5987 `filename*=UTF-8''...` parameter for Unicode-capable clients
  • Affects 4 endpoints: `export_profile`, `export_generation`, `export_generation_audio`, `export_story_audio`

Root cause

Python 3's `str.isalnum()` returns True for Unicode alphanumeric characters, so Cyrillic, Chinese, and other non-ASCII letters pass through the filename filter. However, the ASGI stack (uvicorn/starlette) encodes HTTP response headers as latin-1, which cannot represent code points above U+00FF. Encoding the header therefore raises `UnicodeEncodeError`, and the export request fails with a 500 response.
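
A minimal reproduction of the failure described above. The `sanitize` function is a hypothetical stand-in for the filename filter, assumed to rely on `str.isalnum()`; it is not the project's actual code.

```python
def sanitize(name: str) -> str:
    # Hypothetical filter: keeps anything str.isalnum() accepts,
    # which includes Cyrillic, Chinese, and other Unicode letters.
    return "".join(c for c in name if c.isalnum() or c in "-_ ")


filename = sanitize("Привет мир!") + ".wav"
header = f'attachment; filename="{filename}"'

try:
    # Mirrors what happens when the header value is encoded as latin-1.
    header.encode("latin-1")
    failed = False
    error_message = ""
except UnicodeEncodeError as exc:
    failed = True
    error_message = str(exc)

print(filename)
print(failed, error_message)
```

The filter removes the `!` but keeps every Cyrillic letter, so the latin-1 encoding step is where the error surfaces.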

Fix

Introduced `_safe_content_disposition(disposition_type, filename)` that:

  1. Strips non-ASCII characters from the `filename` parameter (latin-1 safe fallback)
  2. Adds a `filename*=UTF-8''<percent-encoded>` parameter per RFC 5987 so modern browsers can still display the original Unicode filename
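
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: the function name `safe_content_disposition_sketch`, the removal of quotes and backslashes, and the `"download"` default for an empty fallback are all choices made for this example.

```python
import re
from urllib.parse import quote, unquote


def safe_content_disposition_sketch(disposition_type: str, filename: str) -> str:
    # Step 1: ASCII-only fallback. Characters outside ASCII are dropped.
    ascii_name = filename.encode("ascii", "ignore").decode("ascii")
    # Assumption: also drop quotes and backslashes so the quoted-string stays valid.
    ascii_name = re.sub(r'["\\]', "", ascii_name).strip()
    if not ascii_name:
        ascii_name = "download"  # Assumed default when nothing survives.

    # Step 2: percent-encode the UTF-8 bytes of the original name (RFC 5987 style).
    encoded = quote(filename, safe="")

    return (
        f'{disposition_type}; filename="{ascii_name}"; '
        f"filename*=UTF-8''{encoded}"
    )


header = safe_content_disposition_sketch("attachment", "story_Привет.wav")
header_bytes = header.encode("latin-1")  # No longer raises UnicodeEncodeError.
decoded_name = unquote(header.split("filename*=UTF-8''", 1)[1])
print(header)
```

Clients that understand `filename*` recover the original name by percent-decoding it, while older clients fall back to the ASCII `filename` value.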

Test plan

  • Generate speech with non-Latin text (e.g. Russian, Chinese)
  • Verify "Export Audio" downloads successfully (was returning 500)
  • Verify "Export Package" downloads successfully (was returning 500)
  • Verify ASCII-only text exports still work as before

Fixes #68

The export endpoints (export-audio, export generation, export profile,
export story) crash with `'latin-1' codec can't encode characters` when
the generated text or profile/story name contains non-ASCII characters
(e.g. Cyrillic, Chinese, Arabic).

Root cause: Python's `str.isalnum()` passes Unicode letters through to
the filename, but HTTP headers are encoded as latin-1 by the ASGI server,
which cannot represent characters outside the 0-255 range.

Fix: introduce `_safe_content_disposition()` helper that builds a
standards-compliant header with an ASCII-only `filename` fallback and an
RFC 5987 `filename*=UTF-8''...` parameter for Unicode-capable clients.

Fixes jamiepine#68

Co-authored-by: Cursor <cursoragent@cursor.com>
@jamiepine jamiepine merged commit cc298fe into jamiepine:main Feb 23, 2026

Development

Successfully merging this pull request may close these issues.

Export Audio and Export Package not working with Russian text on MacOS
