fix(windows): record download history on non-UTF-8 locales (GBK)#35
Conversation
a216b6d to
2066d04
Compare
15b5da6 to
52e914e
Compare
vanloctech
left a comment
There was a problem hiding this comment.
P1: Possible race when reading final filepath from --print-to-file temp file in handle_tokio_download.
Right now the code reads filepath_tmp and removes it before process.wait(). If yt-dlp flushes/writes after_move:filepath near process exit, this early read can miss the path, and then the file gets deleted before a second chance. That can still lead to successful download but missing history record.
Suggestion: move the temp-file read/remove block to after process.wait(), or re-check after wait when first read is empty.
On Windows, yt-dlp routes --print after_move:filepath output to stderr
instead of stdout. The stderr task only parsed progress and logged lines,
never capturing the filepath, so final_filepath stayed None and
add_history_internal was never called, leaving the library empty
after every download.
Also fix redownload output path extraction which used lastIndexOf('/')
and returned an empty string on Windows backslash paths.
- download.rs: capture filepath from stderr as fallback in handle_tokio_download
- HistoryContext.tsx: use Math.max of lastIndexOf('/') and lastIndexOf('\')
to handle both Unix and Windows path separators
On Chinese Windows (GBK code page), yt-dlp pipes file paths in the system ANSI encoding. Tokio's BufReader::lines() expects UTF-8 and silently drops non-UTF-8 lines, so final_filepath is never captured and no history record is created. Three-layer fix: 1. Use --print-to-file (always UTF-8) as the primary filepath source 2. Replace BufReader::lines() with raw byte reading (read_until) 3. Add decode_process_output() using Win32 MultiByteToWideChar API to convert system ANSI bytes to UTF-8 as a fallback Also fixes: - stderr task.abort() replaced with tokio::time::timeout to avoid race condition losing filepath captured in stderr - Capture [ExtractAudio] Destination: paths from stderr - Add .flac/.wav to filepath extension checks
Listen for download-progress events and refresh the history list when status becomes 'finished'. Without this the user had to manually switch pages to see newly downloaded items in the library.
52e914e to
7f131b7
Compare
… file Prevents a race condition where the temp file is read and deleted before yt-dlp finishes writing the final filepath, which could result in a successful download but missing history record.
Thanks for the review! The P1 race condition has been fixed in 4c2b440. process.wait() is now called before reading the --print-to-file temp file, ensuring yt-dlp has fully exited and flushed the filepath before we attempt to read it. The sequence is now: Wait for stderr task to finish (timeout 5s) |
vanloctech
left a comment
There was a problem hiding this comment.
Re-checked after latest commit (fix(windows): move process.wait() before reading --print-to-file temp file).
The previous P1 race is resolved: handle_tokio_download now waits for process exit before reading/removing filepath_tmp.
I don't see additional blocking issues in this patch.
Thanks for the review~ |
Summary
On Windows with a non-UTF-8 system locale (e.g. Chinese/GBK, Japanese/Shift-JIS), downloaded files never appeared in the library. This PR fixes the root cause and adds a frontend improvement to ensure the library refreshes immediately after a download completes.
Root cause
When stdout is piped, yt-dlp encodes file paths in the system ANSI code page (e.g. GBK on Chinese Windows). Tokio's
BufReader::lines()expects UTF-8 and returnsErr(InvalidData)on non-UTF-8 bytes, which silently exits thewhile let Ok(Some(line))reading loop. As a result,final_filepathis never captured, theif let Some(ref filepath)guard is skipped, andadd_history_internal()is never called.Additionally, GBK cannot represent certain Unicode characters that yt-dlp uses in filenames (e.g.
⧸U+29F8 Big Solidus, used to replace/), so even correct GBK decoding produces an incorrect file path that doesn't match the actual file on disk.Fix (3-layer defense)
--print-to-fileas primary filepath source — yt-dlp's--print-to-filealways writes UTF-8 regardless of the system locale. We writeafter_move:filepathto a temp file and read it back after the process exits. This is the most reliable approach.Raw byte reading instead of
BufReader::lines()— Replace.lines()(which requires UTF-8) with.read_until(b'\n')+ manual line-ending stripping. This ensures the reading loop never silently breaks on non-UTF-8 output.decode_process_output()with Win32 API — A new helper that converts raw process bytes to a RustStringusing theMultiByteToWideCharWin32 API (falls back tofrom_utf8_lossyon non-Windows). This preserves CJK characters in progress messages and log output.Other fixes in this PR
task.abort()withtokio::time::timeout(5s)to allow the stderr reader to finish capturing file paths before the main task proceeds[ExtractAudio] Destination:parsing — Capture audio file paths from stderr (yt-dlp sometimes prints these only to stderr).flacand.wavto filepath capture patternsdownload-progressevents and refreshes the history list when a download finishes, so new items appear immediately without page switchingTest plan
--print-to-filetemp file contains correct UTF-8 filepathcargo checkpassesbun run tsc -bpasses🤖 Generated with Claude Code