Skip to content

Fix non-ASCII file path handling on Windows#318

Merged
spe-ciellt merged 1 commit intogerbv:developfrom
SourceParts:fix/nonascii-file-paths
Mar 3, 2026
Merged

Fix non-ASCII file path handling on Windows#318
spe-ciellt merged 1 commit intogerbv:developfrom
SourceParts:fix/nonascii-file-paths

Conversation

@rampageservices
Copy link
Copy Markdown
Contributor

@rampageservices rampageservices commented Mar 1, 2026

Summary

Tier 1 fix for non-ASCII file path handling on Windows.

  • Replace fopen() with g_fopen() in project.c (4 calls) and tooltable.c (1 call) so that files with non-ASCII characters in their paths (e.g. ö, ü, CJK) open correctly on Windows
  • Convert argv from system codepage to UTF-8 via g_locale_to_utf8() on Windows so CLI-supplied paths work with GLib file functions
  • Use g_filename_display_basename() instead of g_path_get_basename() for the window title so non-ASCII basenames render properly
  • Set explicit glib-2.0>=2.6 minimum in CMakeLists.txt to document the dependency on g_fopen() and g_filename_display_basename()

Closes #234

Details

On Windows, fopen() interprets filenames using the system codepage (e.g. Windows-1252), which cannot represent characters outside that codepage. GLib's g_fopen() internally converts UTF-8 paths to wide-char via _wfopen(), handling all Unicode filenames correctly.

Similarly, argv on Windows is encoded in the system codepage. Since GLib functions expect UTF-8, the fix converts each argv entry via g_locale_to_utf8() early in main(). This uses the conservative g_locale_to_utf8() (available in all GLib 2.x) rather than g_win32_get_command_line() (which requires GLib 2.40+).

Scope

This is a tier 1 fix covering the straightforward cases. Not touched (separate work):

  • export-image.c — would need a stream API refactor
  • scheme.c — embedded TinyScheme interpreter
  • lrealpath.c — Win32 path canonicalization
  • dxflib DXF export — third-party library

Test plan

  • Linux: verify files with UTF-8 names (e.g. Löten.gbr) open as before
  • Windows: open files with non-ASCII paths from both CLI and GUI file dialog
  • Existing test suite passes (94/104 pass; 10 are pre-existing image comparison mismatches)

Replace fopen() with g_fopen() in project.c (4 calls), tooltable.c
(1 call) so that filenames containing non-ASCII characters (e.g. ö, ü,
CJK) are opened correctly on Windows, where fopen() uses the system
codepage but g_fopen() converts UTF-8 to wide-char internally.

On Windows, convert argv from the system codepage to UTF-8 via
g_locale_to_utf8() so that CLI-supplied paths round-trip through GLib
file functions correctly.

Use g_filename_display_basename() instead of g_path_get_basename() for
the window title so that non-ASCII basenames render properly.

Set an explicit glib-2.0>=2.6 minimum in CMakeLists.txt to document the
dependency on g_fopen() and g_filename_display_basename().

Closes gerbv#234
@spe-ciellt spe-ciellt added the Windows This problem is reported on the Windows platform, which relies on third party debugging. label Mar 1, 2026
@spe-ciellt spe-ciellt added the fix Solution for a potential problem or omission. label Mar 3, 2026
@spe-ciellt
Copy link
Copy Markdown
Contributor

Minor Issues Found in fopen() patch

1. logToFileFilename still uses raw fopen()

main.c:950:

logFile = fopen(logToFileFilename, "w");

logToFileFilename is set from optarg at main.c:824 (the --log option), which comes from argv. After the PR's argv UTF-8 conversion those strings are in UTF-8, but fopen() on Windows still interprets them as system-codepage. A user who passes --log /path/with/ünïcödé/log.txt would get a silent open failure even with this PR applied. This should be g_fopen().

If this call were converted, the #include <glib/gstdio.h> added to main.c would be justified (currently the include is not needed — see issue 2).

2. #include <glib/gstdio.h> in main.c is premature

The PR adds <glib/gstdio.h> to main.c, but main.c does not call g_fopen() anywhere in the patched version. g_locale_to_utf8() is declared in <glib.h>, which is already pulled in transitively. The include is harmless but signals intent that wasn't followed through — see issue 1.

3. #ifdef G_OS_WIN32 vs. #ifdef WIN32 inconsistency

The new argv conversion block uses G_OS_WIN32:

#ifdef G_OS_WIN32
    for (i = 0; i < argc; i++) { … }
#endif

The surrounding code in main.c consistently uses WIN32:

# ifdef WIN32
    bind_textdomain_codeset(PACKAGE, "UTF-8");
# endif
#ifdef WIN32
    screenRenderInfo.renderType = GERBV_RENDER_TYPE_CAIRO_NORMAL;
#endif

Both macros resolve on Windows in this build (GLib defines G_OS_WIN32; WIN32 is compiler-defined), so there is no functional difference. Cosmetic inconsistency only — the new block should use #ifdef WIN32 to match the file's existing style.

4. argv mutation produces a Valgrind-visible leak (by design)

gchar *utf8_arg = g_locale_to_utf8(argv[i], -1, NULL, NULL, NULL);
if (utf8_arg) {
    argv[i] = utf8_arg;
}

The original argv[i] pointer (owned by the C runtime) is overwritten and cannot be freed. The new utf8_arg allocation is placed into argv[] and also never freed. This is intentional — these strings must live for the process lifetime — but Valgrind would report them as definite leaks. Since the G_OS_WIN32 guard means this block never runs on Linux (where the Valgrind tests run), there is no practical impact on the test suite. Worth noting in a code comment.

No show stoppers, I merge this with referencing this in an issue for future update.

@spe-ciellt spe-ciellt merged commit dc8fc84 into gerbv:develop Mar 3, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix Solution for a potential problem or omission. Windows This problem is reported on the Windows platform, which relies on third party debugging.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

(half-)fails to open file when path contains non-english character

2 participants