Closed
Labels
bug (Something isn't working), fixed (Something works now, yay!), format (C++20/23 format)
Description
Per discussion on Discord.
Some multibyte character sets use 0x7b and 0x7d (the byte values of '{' and '}', respectively) as the second byte of a two-byte character. We need to be aware of this issue when implementing #30, and properly parse format strings containing such characters.
For example, the string "日本地図" (meaning "map of Japan" in Japanese) is encoded as "\x93\xfa\x96\x7b\x92\x6e\x90\x7d" in Shift JIS (code page 932), which contains both 0x7b and 0x7d. When running under code page 932, it shouldn't be parsed as "\x93\xfa\x96" + "{\x92n\x90}".
AFAIK, code pages 932, 936, 950, and 54936 contain such encodings.
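To make the problem concrete, here is a minimal sketch of a brace scan that skips trail bytes. It is not the STL's implementation. The names `is_cp932_lead_byte`, `find_brace_cp932`, and `self_check` are hypothetical, and the lead-byte ranges for code page 932 (0x81-0x9F and 0xE0-0xFC) are hard-coded as an assumption. A real implementation would presumably ask the OS about the active code page instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Assumption: lead-byte ranges for code page 932 are hard-coded for illustration.
constexpr bool is_cp932_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: returns the index of the first '{' or '}' that is a
// character of its own (not the trail byte of a two-byte character),
// or npos if there is none. The scan must start at a character boundary.
constexpr std::size_t find_brace_cp932(std::string_view s) {
    std::size_t pos = 0;
    while (pos < s.size()) {
        const auto b = static_cast<unsigned char>(s[pos]);
        if (is_cp932_lead_byte(b)) {
            pos += 2; // skip the lead byte together with its trail byte
        } else if (b == '{' || b == '}') {
            return pos;
        } else {
            ++pos;
        }
    }
    return std::string_view::npos;
}

// Checks the example string from this issue.
void self_check() {
    using namespace std::string_view_literals;
    constexpr auto map_of_japan = "\x93\xfa\x96\x7b\x92\x6e\x90\x7d"sv;

    // A byte-wise search is fooled by the trail byte 0x7b at index 3.
    assert(map_of_japan.find_first_of("{}") == 3);

    // The lead-byte-aware scan finds no braces at all.
    assert(find_brace_cp932(map_of_japan) == std::string_view::npos);

    // A real '{' after a two-byte character is still found.
    assert(find_brace_cp932("\x96\x7b{}"sv) == 2);
}
```

The byte-wise search reports a brace inside the second character, while the lead-byte-aware scan treats each two-byte character as a unit and only reports braces that stand alone.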
Command-line test case
D:\Temp>type format_sjis.cpp
#include <format>
#include <iostream>
#include <string_view>
using namespace std;
int main() {
    constexpr auto str = "\x93\xfa\x96\x7b\x92\x6e\x90\x7d"sv;
    cout << str << "\n";
    cout << format(str) << "\n";
}

D:\Temp>chcp 932
Active code page: 932
D:\Temp>cl /EHsc /W4 /WX /std:c++latest format_sjis.cpp
[...]
D:\Temp>.\format_sjis.exe
[...]
Expected behavior
Should print:
日本地図
日本地図