Skip to content

Enable the use of UTF-8 filenames under Windows#718

Merged
derselbst merged 3 commits intoFluidSynth:utf8from
getraid-gg:master
Dec 22, 2020
Merged

Enable the use of UTF-8 filenames under Windows#718
derselbst merged 3 commits intoFluidSynth:utf8from
getraid-gg:master

Conversation

@getraid-gg
Copy link
Copy Markdown
Contributor

While fopen (used through the macro FLUID_FOPEN) uses UTF-8 on *nix, it's restricted to ANSI on Windows. A change to enable using paths containing non-ANSI characters was suggested before in issue #128 but was rejected due to requiring large parts of both the public API and private implementation to be modified to accommodate Windows.

This PR instead changes the macro definition for FLUID_FOPEN from fopen to a new wrapper, fluid_fopen. This wrapper is defined in fluidsynth_priv.h and defined in fluid_sys.c (following the pattern of fluid_alloc). Under Windows, it converts the const char* UTF-8 parameters to Unicode wchar_t* strings using the Windows API function MultiByteToWideChar and opens the file using the Windows API-specific _wfopen. On all other platforms, it simply calls fopen.

The public API is unchanged. This solution will require Windows users of the API to convert UTF-16 strings to UTF-8 (which then get converted back into UTF-16 anyway), but that's still an improvement over only being able to use ANSI paths.

This PR also adds a new test, test_utf8_open, which tests FLUID_FOPEN directly and through fluid_is_soundfont and fluid_synth_sfload using a new soundfont file, sf2/■VintageDreamsWaves-v2■.sf2, which is just a copy of VintageDreamsWaves-v2.sf2 with Unicode characters in the filename.

@getraid-gg
Copy link
Copy Markdown
Contributor Author

I've been trying for hours to reproduce the CI builds' unit tests failing to load any files at all with fluid_fopen. The tests work in MSYS2/MinGW on my end so I'm going to try building it from scratch in a VM with a clean installation and see if it's just my dirty work environment that's inadvertently preventing MinGW from producing errors.

Copy link
Copy Markdown
Member

@derselbst derselbst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks surprisingly simple. Nice job, thanks! Two minor points:

  • Please add log messages FLUID_LOG(FLUID_ERR, "") if an error occurs. This should make it easier to diagnose, whether the CI failure is coming from any of the MultiByteToWideChar calls or from _wfopen itself.
  • MB_COMPOSITE - is that a good idea? (I really don't know)

@getraid-gg
Copy link
Copy Markdown
Contributor Author

getraid-gg commented Dec 19, 2020

* `MB_COMPOSITE` - is that a good idea? (I really don't know)

Well, this is embarrassing. I originally started using it because I saw its opposite flag marked deprecated when I looked at its comment through IntelliSense. It turns out that both are deprecated - all of the options are. Additionally, the documentation tells you to not use any flags for a bunch of codepages but doesn't list UTF-8 in that list but in a box right below, tells you to not use any flags (or use the flag that replaces invalid characters)... I still haven't reproduced the issue yet but I'll get at it and update my branch when it's ready.

@getraid-gg getraid-gg marked this pull request as draft December 19, 2020 21:07
@getraid-gg
Copy link
Copy Markdown
Contributor Author

getraid-gg commented Dec 21, 2020

I managed to reproduce the issue with file loading on a completely fresh MinGW-w64 installation and I believe it really was just using the wrong flags. I guess the MSYS and modern Windows C runtimes don't care but MinGW and Windows XP's CRTs are stricter about it.

I've also made it treat UTF-8 filenames more strictly on Windows (failing on malformed UTF-8 instead of replacing it with the replacement character �, which is a valid character in filenames) because that's the behavior I've observed on Linux.

@getraid-gg getraid-gg marked this pull request as ready for review December 21, 2020 22:23
@derselbst derselbst changed the base branch from master to utf8 December 22, 2020 09:33
Copy link
Copy Markdown
Member

@derselbst derselbst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you!

@derselbst derselbst merged commit c688b0d into FluidSynth:utf8 Dec 22, 2020
derselbst pushed a commit that referenced this pull request Dec 22, 2020
While `fopen` (used through the macro `FLUID_FOPEN`) uses UTF-8 on *nix, it's restricted to ANSI on Windows. A change to enable using paths containing non-ANSI characters was suggested before in issue #128 but was rejected due to requiring large parts of both the public API and private implementation to be modified to accommodate Windows.

This PR instead changes the macro definition for `FLUID_FOPEN` from `fopen` to a new wrapper, `fluid_fopen`. This wrapper is defined in `fluidsynth_priv.h` and defined in `fluid_sys.c` (following the pattern of `fluid_alloc`). Under Windows, it converts the `const char*` UTF-8 parameters to Unicode `wchar_t*` strings using the Windows API function `MultiByteToWideChar` and opens the file using the Windows API-specific `_wfopen`. On all other platforms, it simply calls `fopen`.

The public API is unchanged. This solution will require Windows users of the API to convert UTF-16 strings to UTF-8 (which then get converted back into UTF-16 anyway), but that's still an improvement over only being able to use ANSI paths.

This PR also adds a new test, `test_utf8_open`, which tests `FLUID_FOPEN` directly and through `fluid_is_soundfont` and `fluid_synth_sfload` using a new soundfont file, `sf2/■VintageDreamsWaves-v2■.sf2`, which is just a copy of `VintageDreamsWaves-v2.sf2` with Unicode characters in the filename.
@derselbst derselbst added this to the 2.2 milestone Jan 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants