Skip to content

fix(windows): record download history on non-UTF-8 locales (GBK)#35

Merged
vanloctech merged 4 commits intovanloctech:mainfrom
TTAWDTT:fix/windows-history-not-recorded
Mar 2, 2026
Merged

fix(windows): record download history on non-UTF-8 locales (GBK)#35
vanloctech merged 4 commits intovanloctech:mainfrom
TTAWDTT:fix/windows-history-not-recorded

Conversation

@TTAWDTT
Copy link
Contributor

@TTAWDTT TTAWDTT commented Feb 28, 2026

Summary

On Windows with a non-UTF-8 system locale (e.g. Chinese/GBK, Japanese/Shift-JIS), downloaded files never appeared in the library. This PR fixes the root cause and adds a frontend improvement to ensure the library refreshes immediately after a download completes.

Root cause

When stdout is piped, yt-dlp encodes file paths in the system ANSI code page (e.g. GBK on Chinese Windows). Tokio's BufReader::lines() expects UTF-8 and returns Err(InvalidData) on non-UTF-8 bytes, which silently exits the while let Ok(Some(line)) reading loop. As a result, final_filepath is never captured, the if let Some(ref filepath) guard is skipped, and add_history_internal() is never called.

Additionally, GBK cannot represent certain Unicode characters that yt-dlp uses in filenames (e.g. U+29F8 Big Solidus, used to replace /), so even correct GBK decoding produces an incorrect file path that doesn't match the actual file on disk.

Fix (3-layer defense)

  1. --print-to-file as primary filepath source — yt-dlp's --print-to-file always writes UTF-8 regardless of the system locale. We write after_move:filepath to a temp file and read it back after the process exits. This is the most reliable approach.

  2. Raw byte reading instead of BufReader::lines() — Replace .lines() (which requires UTF-8) with .read_until(b'\n') + manual line-ending stripping. This ensures the reading loop never silently breaks on non-UTF-8 output.

  3. decode_process_output() with Win32 API — A new helper that converts raw process bytes to a Rust String using the MultiByteToWideChar Win32 API (falls back to from_utf8_lossy on non-Windows). This preserves CJK characters in progress messages and log output.

Other fixes in this PR

  • stderr race condition — Replaced task.abort() with tokio::time::timeout(5s) to allow the stderr reader to finish capturing file paths before the main task proceeds
  • [ExtractAudio] Destination: parsing — Capture audio file paths from stderr (yt-dlp sometimes prints these only to stderr)
  • Extended file extension checks — Added .flac and .wav to filepath capture patterns
  • Auto-refresh library — Frontend now listens for download-progress events and refreshes the history list when a download finishes, so new items appear immediately without page switching

Test plan

  • Download audio from Bilibili on Chinese Windows (GBK locale) — file appears in library
  • Verified --print-to-file temp file contains correct UTF-8 filepath
  • Verified file exists on disk with the captured filepath
  • cargo check passes
  • bun run tsc -b passes

🤖 Generated with Claude Code

@TTAWDTT TTAWDTT force-pushed the fix/windows-history-not-recorded branch 2 times, most recently from a216b6d to 2066d04 Compare February 28, 2026 14:35
@TTAWDTT TTAWDTT changed the title fix(windows): record download history and fix redownload path on Windows fix(windows): record download history on non-UTF-8 locales (GBK) Mar 1, 2026
@TTAWDTT TTAWDTT force-pushed the fix/windows-history-not-recorded branch from 15b5da6 to 52e914e Compare March 1, 2026 11:53
Copy link
Owner

@vanloctech vanloctech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Possible race when reading final filepath from --print-to-file temp file in handle_tokio_download.

Right now the code reads filepath_tmp and removes it before process.wait(). If yt-dlp flushes/writes after_move:filepath near process exit, this early read can miss the path, and then the file gets deleted before a second chance. That can still lead to successful download but missing history record.

Suggestion: move the temp-file read/remove block to after process.wait(), or re-check after wait when first read is empty.

TTAWDTT added 3 commits March 1, 2026 23:21
On Windows, yt-dlp routes --print after_move:filepath output to stderr
instead of stdout. The stderr task only parsed progress and logged lines,
never capturing the filepath, so final_filepath stayed None and
add_history_internal was never called, leaving the library empty
after every download.

Also fix redownload output path extraction which used lastIndexOf('/')
and returned an empty string on Windows backslash paths.

- download.rs: capture filepath from stderr as fallback in handle_tokio_download
- HistoryContext.tsx: use Math.max of lastIndexOf('/') and lastIndexOf('\')
  to handle both Unix and Windows path separators
On Chinese Windows (GBK code page), yt-dlp pipes file paths in the
system ANSI encoding.  Tokio's BufReader::lines() expects UTF-8 and
silently drops non-UTF-8 lines, so final_filepath is never captured
and no history record is created.

Three-layer fix:
1. Use --print-to-file (always UTF-8) as the primary filepath source
2. Replace BufReader::lines() with raw byte reading (read_until)
3. Add decode_process_output() using Win32 MultiByteToWideChar API
   to convert system ANSI bytes to UTF-8 as a fallback

Also fixes:
- stderr task.abort() replaced with tokio::time::timeout to avoid
  race condition losing filepath captured in stderr
- Capture [ExtractAudio] Destination: paths from stderr
- Add .flac/.wav to filepath extension checks
Listen for download-progress events and refresh the history list
when status becomes 'finished'.  Without this the user had to
manually switch pages to see newly downloaded items in the library.
@TTAWDTT TTAWDTT force-pushed the fix/windows-history-not-recorded branch from 52e914e to 7f131b7 Compare March 1, 2026 15:26
… file

Prevents a race condition where the temp file is read and deleted before
yt-dlp finishes writing the final filepath, which could result in a
successful download but missing history record.
@TTAWDTT
Copy link
Contributor Author

TTAWDTT commented Mar 1, 2026

P1: Possible race when reading final filepath from --print-to-file temp file in handle_tokio_download.

Right now the code reads filepath_tmp and removes it before process.wait(). If yt-dlp flushes/writes after_move:filepath near process exit, this early read can miss the path, and then the file gets deleted before a second chance. That can still lead to successful download but missing history record.目前代码在读取 filepath_tmp 之后会移除它,

Suggestion: move the temp-file read/remove block to after process.wait(), or re-check after wait when first read is empty.

Thanks for the review! The P1 race condition has been fixed in 4c2b440.

process.wait() is now called before reading the --print-to-file temp file, ensuring yt-dlp has fully exited and flushed the filepath before we attempt to read it. The sequence is now:

Wait for stderr task to finish (timeout 5s)
process.wait() — ensure yt-dlp fully exits
Read temp file
Clean up temp file
Fallback to stdout/stderr captures if temp file was empty
Also rebased onto latest main to resolve conflicts with the BackendError refactoring.

Copy link
Owner

@vanloctech vanloctech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-checked after latest commit (fix(windows): move process.wait() before reading --print-to-file temp file).

The previous P1 race is resolved: handle_tokio_download now waits for process exit before reading/removing filepath_tmp.

I don't see additional blocking issues in this patch.

@TTAWDTT
Copy link
Contributor Author

TTAWDTT commented Mar 2, 2026

Re-checked after latest commit (fix(windows): move process.wait() before reading --print-to-file temp file).

The previous P1 race is resolved: handle_tokio_download now waits for process exit before reading/removing filepath_tmp.

I don't see additional blocking issues in this patch.

Thanks for the review~

@vanloctech vanloctech merged commit 67fa9ef into vanloctech:main Mar 2, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants