Skip to content

<format> locale fixes#1862

Merged
StephanTLavavej merged 3 commits intomicrosoft:mainfrom
barcharcraz:format-locale-fixes
Apr 22, 2021
Merged

<format> locale fixes#1862
StephanTLavavej merged 3 commits intomicrosoft:mainfrom
barcharcraz:format-locale-fixes

Conversation

@barcharcraz
Copy link
Contributor

  • Fixes <format>: Formatting functions use a wrong locale #1852
  • improves performance by only constructing locales if the format string actually has a locale sensitive format specifier in it. This gives a small but notable performance boost (for simple char* output I got ~130ns/call before and ~105ns/call after in my extremely unscientific benchmarks). I did verify that std::locale::locale no longer shows up in the profile after the change.

Note: there's an alternate, slightly more clever approach, where _Default_arg_formatter and basic_format_context store uninitialized storage for a locale, and it's again initialized on demand. The advantage of this is that in format strings with multiple locale sensitive format-specs the locale would only have to be constructed once, it also might be possible to adopt this method without breaking ABI (although it would be scary). The disadvantage here is the code gets more confusing and it will be much harder to break the dependency on if we want to do that in the future.

@barcharcraz barcharcraz added cxx20 C++20 feature format C++20/23 format labels Apr 19, 2021
@barcharcraz barcharcraz requested a review from a team as a code owner April 19, 2021 23:13
@bhardwajs
Copy link
Contributor

FYI @vitaut

Copy link
Member

@StephanTLavavej StephanTLavavej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have a few comments that I'll push changes for.

Copy link
Contributor

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as a clear improvement, but I think I'd like to investigate storing lazily-constructed locale and _Cvtvec in basic_format_context so we can guarantee at-most-one construction of each per format call.

@barcharcraz
Copy link
Contributor Author

barcharcraz commented Apr 20, 2021

Approving as a clear improvement, but I think I'd like to investigate storing lazily-constructed locale and _Cvtvec in basic_format_context so we can guarantee at-most-one construction of each per format call.

I don't think this effects the cvtvec at all, since that's populated once in the parsing functions, no matter how many format arguments you use.

@statementreply
Copy link
Contributor

statementreply commented Apr 21, 2021

I don't think this effects the cvtvec at all, since that's populated once in the parsing functions, no matter how many format arguments you use.

I observed that format("{:.2f}", numbers::pi) calls __std_get_cvt twice in non UTF-8 setting.

Besides, GetCPInfoExW is so much slower than core format code that I think it’s worth going further and investigate
catching the result across the program execution.

@StephanTLavavej StephanTLavavej self-assigned this Apr 21, 2021
@barcharcraz
Copy link
Contributor Author

I don't think this effects the cvtvec at all, since that's populated once in the parsing functions, no matter how many format arguments you use.

I observed that format("{:.2f}", numbers::pi) calls __std_get_cvt twice in non UTF-8 setting.

Besides, GetCPInfoExW is so much slower than core format code that I think it’s worth going further and investigate
catching the result across the program execution.

Yeah, I want to investigate faster options, however it's not ABI and users have an escape hatch with /utf-8 and the wide format. It's likely we'll make it faster in a future patch release, especially if the compiler grows the ability to just tell us what the execution charset is.

@StephanTLavavej StephanTLavavej added the performance Must go faster label Apr 21, 2021
@StephanTLavavej StephanTLavavej merged commit 4fae287 into microsoft:main Apr 22, 2021
@StephanTLavavej
Copy link
Member

Thanks for this combination bugfix/perf improvement! 🪲 🚀 😸

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cxx20 C++20 feature format C++20/23 format performance Must go faster

Projects

None yet

Development

Successfully merging this pull request may close these issues.

<format>: Formatting functions use a wrong locale

5 participants