Use division by 100 in `to_string` for integers by AlexGuteniev · Pull Request #5691 · microsoft/STL

AlexGuteniev · 2025-08-23T19:42:56Z

⚙️ Optimization

Resolves #3857. Divides by 100 instead of by 10, as proposed.

There's a similar place in to_chars, skipped for now.
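For readers unfamiliar with the technique, here is a minimal sketch of generating digits two at a time by dividing by 100. It is not the code in stl/inc/xmemory: `uint_to_buff_by_100`, `to_string_by_100`, and the `digit_pairs` table are hypothetical names made up for illustration. The only thing it borrows from the real `_UIntegral_to_buff` is the convention of writing backwards from the end of the buffer.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, NOT the STL's _UIntegral_to_buff.
// Writes the decimal digits of `value` backwards, finishing just before `end`,
// and returns a pointer to the first digit written.
inline char* uint_to_buff_by_100(char* end, std::uint64_t value) {
    // "00", "01", ..., "99" stored back to back: 100 pairs, 200 characters.
    static constexpr char digit_pairs[] =
        "0001020304050607080910111213141516171819"
        "2021222324252627282930313233343536373839"
        "4041424344454647484950515253545556575859"
        "6061626364656667686970717273747576777879"
        "8081828384858687888990919293949596979899";

    char* next = end;
    while (value >= 100) {
        // One division yields two digits instead of one.
        const auto pair = static_cast<unsigned>(value % 100) * 2;
        value /= 100;
        *--next = digit_pairs[pair + 1]; // ones digit
        *--next = digit_pairs[pair]; // tens digit
    }

    // At most two digits remain, so no tail loop is needed.
    if (value >= 10) {
        const auto pair = static_cast<unsigned>(value) * 2;
        *--next = digit_pairs[pair + 1];
        *--next = digit_pairs[pair];
    } else {
        *--next = static_cast<char>('0' + value);
    }
    return next;
}

// Hypothetical wrapper showing the buffer-then-copy shape of to_string.
inline std::string to_string_by_100(std::uint64_t value) {
    char buff[20]; // 2^64 - 1 has 20 decimal digits
    char* const end = buff + 20;
    char* const first = uint_to_buff_by_100(end, value);
    return std::string(first, end);
}
```

Calling `to_string_by_100(12345)` performs two divisions by 100 and one final single-digit store, where a divide-by-10 loop would need five iterations.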

🏁 Benchmark

Large and small numbers, like numbers naturally seen when counting things.
Generated via log-normal distribution, as @statementreply suggested.
Picked some arbitrary parameters, to approximately fit in the integer ranges.

Also benchmarked std::_UIntegral_to_buff separately to see how much the optimization helps on its own, avoiding the #1024 limitation.

⏱️ Benchmark results

i5-1235U P cores:

Benchmark	Before	After	Speedup
internal_integer_to_buff<uint8_t, 2.5, 1.5>	2.30 ns	3.42 ns	0.67
internal_integer_to_buff<uint16_t, 5.0, 3.0>	3.70 ns	2.64 ns	1.40
internal_integer_to_buff<uint32_t, 10.0, 6.0>	4.69 ns	2.86 ns	1.64
internal_integer_to_buff<uint64_t, 20.0, 12.0>	10.5 ns	5.29 ns	1.98
integer_to_string<uint8_t, 2.5, 1.5>	5.87 ns	5.44 ns	1.08
integer_to_string<uint16_t, 5.0, 3.0>	6.79 ns	6.32 ns	1.07
integer_to_string<uint32_t, 10.0, 6.0>	8.11 ns	7.28 ns	1.11
integer_to_string<uint64_t, 20.0, 12.0>	14.5 ns	14.2 ns	1.02
integer_to_string<int8_t, 2.5, 1.5>	6.64 ns	5.96 ns	1.11
integer_to_string<int16_t, 5.0, 3.0>	6.23 ns	5.88 ns	1.06
integer_to_string<int32_t, 10.0, 6.0>	7.58 ns	6.33 ns	1.20
integer_to_string<int64_t, 20.0, 12.0>	17.8 ns	18.8 ns	0.95

i5-1235U E cores:

Benchmark	Before	After	Speedup
internal_integer_to_buff<uint8_t, 2.5, 1.5>	4.14 ns	4.79 ns	0.86
internal_integer_to_buff<uint16_t, 5.0, 3.0>	8.08 ns	4.76 ns	1.70
internal_integer_to_buff<uint32_t, 10.0, 6.0>	11.4 ns	5.41 ns	2.11
internal_integer_to_buff<uint64_t, 20.0, 12.0>	23.8 ns	13.9 ns	1.71
integer_to_string<uint8_t, 2.5, 1.5>	17.2 ns	12.7 ns	1.35
integer_to_string<uint16_t, 5.0, 3.0>	17.1 ns	13.6 ns	1.26
integer_to_string<uint32_t, 10.0, 6.0>	18.3 ns	14.0 ns	1.31
integer_to_string<uint64_t, 20.0, 12.0>	36.6 ns	29.4 ns	1.24
integer_to_string<int8_t, 2.5, 1.5>	17.8 ns	12.0 ns	1.48
integer_to_string<int16_t, 5.0, 3.0>	20.0 ns	13.4 ns	1.49
integer_to_string<int32_t, 10.0, 6.0>	21.5 ns	15.1 ns	1.42
integer_to_string<int64_t, 20.0, 12.0>	39.7 ns	35.0 ns	1.13

🥉 Results interpretation

I'm not even sure if this is worth doing.

Allocating the string and copying the result there takes roughly half of the time, so the effect of micro-optimizing digit generation is small.

However, the internal function seems to show improvement. This looks like an indication that the #1024 improvement would help here. It could be that the performance is limited by failed store-to-load forwarding, as individual character stores are followed by a bulk memcpy; in this case, the improvement may be somewhat negated by a longer stall.

stl/inc/xmemory

The benchmark shows a minor speedup:

| Benchmark | 9 digits | 8 digits | Speedup for 8 digits |
|---|---|---|---|
| `internal_integer_to_buff<char, uint64_t, 20.0, 12.0>` | 9.46 ns | 8.47 ns | 1.12 |
| `internal_integer_to_buff<wchar_t, uint64_t, 20.0, 12.0>` | 8.40 ns | 8.13 ns | 1.03 |
| `integer_to_string<uint64_t, 20.0, 12.0>` | 19.7 ns | 18.4 ns | 1.07 |
| `integer_to_string<int64_t, 20.0, 12.0>` | 20.9 ns | 19.6 ns | 1.07 |

This helps a little more:

| Benchmark | 4 loop | special | Speedup |
|---|---|---|---|
| `internal_integer_to_buff<char, uint64_t, 20.0, 12.0>` | 8.49 ns | 7.97 ns | 1.07 |
| `internal_integer_to_buff<wchar_t, uint64_t, 20.0, 12.0>` | 8.15 ns | 7.79 ns | 1.05 |
| `integer_to_string<uint64_t, 20.0, 12.0>` | 18.5 ns | 18.0 ns | 1.03 |
| `integer_to_string<int64_t, 20.0, 12.0>` | 19.6 ns | 19.3 ns | 1.02 |

StephanTLavavej · 2026-01-07T15:33:01Z

Thanks! I pushed major changes, please double-check.

AlexGuteniev · 2026-01-07T15:49:23Z

Looks good. 3708bbf might change codegen, but I assume it is fine, as you have benchmarked (at least for wchar_t and on x86)

StephanTLavavej · 2026-01-07T15:53:51Z

Yeah, I checked and the differences between your codegen and mine appeared to be lost in the noise.

StephanTLavavej · 2026-01-07T15:54:57Z

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej · 2026-01-07T22:38:43Z

I've pushed commits to fix a bug found by internal AI code review. The lognormal_distribution was generating out-of-range values up to 4% of the time, triggering UB when we static_cast from double to integer. I've replaced this with a retry loop, so that we preserve the distribution's behavior, without generating the maximum value unusually often (which would happen if we just used std::clamp()). This carefully uses floor() to actually generate the maximum value some of the time. Needless to say, my code was written without any AI assistance.
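As a rough illustration of the retry-loop idea described above, here is a minimal sketch. It is not the code in benchmarks/src/integer_to_string.cpp: `generate_lognormal` and its parameters are hypothetical names, and the range check against 2^digits (one past the maximum, which is exactly representable as a `double`) is a choice made for this sketch rather than the comparison used in the PR.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Hypothetical generator of benchmark inputs; names and parameters are illustrative only.
// T is assumed to be an integer type; m and s are the log-normal parameters.
template <class T>
std::vector<T> generate_lognormal(std::size_t count, double m, double s, std::uint64_t seed) {
    std::mt19937_64 gen{seed};
    std::lognormal_distribution<double> dist{m, s};

    // One past the largest value of T. This is a power of two, so it is
    // exactly representable as a double (assumption made for this sketch).
    const double limit = std::ldexp(1.0, std::numeric_limits<T>::digits);

    std::vector<T> result;
    result.reserve(count);
    while (result.size() < count) {
        // floor() maps every sample in [max, max + 1) to max,
        // so the maximum value is still generated some of the time.
        const double dbl = std::floor(dist(gen));
        if (dbl < limit) {
            // In range, so the double-to-integer cast is well-defined.
            result.push_back(static_cast<T>(dbl));
        }
        // Out of range (including infinity): discard the sample and retry,
        // instead of clamping, which would pile samples up on the maximum.
    }
    return result;
}
```

Discarding and redrawing keeps the shape of the distribution below the cutoff, at the cost of a few extra draws when the parameters put noticeable mass beyond the integer range.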

MattStephanson · 2026-01-08T05:58:08Z

benchmarks/src/integer_to_string.cpp

+            constexpr auto max_val = static_cast<double>(numeric_limits<T>::max());
+            if (dbl <= max_val) {


As pointed out by @statementreply on Discord, for uint64_t this rounds to $2^{64}$, so to be rigorous you would need to either special-case that or make the test dbl < max_val.

Thanks, I can fix this in a followup.
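The rounding issue can be checked directly. The sketch below assumes an IEEE 754 binary64 `double` with the default round-to-nearest mode; the helper names `max_rounds_up_to_2_pow_64`, `accepted_by_le`, and `accepted_by_lt` are hypothetical and exist only to compare the two forms of the test.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// 2^64 - 1 needs 64 significant bits, but a double holds only 53,
// so the conversion rounds up to exactly 2^64.
inline bool max_rounds_up_to_2_pow_64() {
    const double max_val = static_cast<double>(std::numeric_limits<std::uint64_t>::max());
    return max_val == std::ldexp(1.0, 64);
}

// The check as written in the quoted diff: dbl <= max_val.
template <class T>
bool accepted_by_le(double dbl) {
    const double max_val = static_cast<double>(std::numeric_limits<T>::max());
    return dbl <= max_val;
}

// The stricter alternative suggested in the comment: dbl < max_val.
template <class T>
bool accepted_by_lt(double dbl) {
    const double max_val = static_cast<double>(std::numeric_limits<T>::max());
    return dbl < max_val;
}
```

For `uint64_t`, `accepted_by_le` lets 2^64 through, which is out of range for the cast, while `accepted_by_lt` rejects it. For narrower types whose maximum converts exactly, the strict form would also reject the maximum itself, which is presumably why special-casing is mentioned as the other option.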

StephanTLavavej · 2026-01-08T16:52:46Z

💯 💯 💯

AlexGuteniev requested a review from a team as a code owner August 23, 2025 19:42

github-project-automation bot added this to STL Code Reviews Aug 23, 2025

github-project-automation bot moved this to Initial Review in STL Code Reviews Aug 23, 2025

StephanTLavavej added the "performance" (Must go faster) and "decision needed" (We need to choose something before working on this) labels Aug 24, 2025

StephanTLavavej self-assigned this Aug 24, 2025

AlexGuteniev force-pushed the integers branch 2 times, most recently from 672f1db to 7ea6121 Compare August 25, 2025 13:09

AlexGuteniev added 4 commits August 25, 2025 18:55

benchmark

86958ab

100 branch in to_string impl

8b6f863

sort better

84b3d1a

hack around some linker issue in C++14

08b83da

AlexGuteniev force-pushed the integers branch from 7ea6121 to 08b83da Compare August 25, 2025 15:56

AlexGuteniev added 3 commits September 15, 2025 20:38

Merge branch 'microsoft:main' into integers

068234b

Merge branch 'microsoft:main' into integers

5de9ae1

Merge branch 'microsoft:main' into integers

74815fc

StephanTLavavej removed their assignment Nov 14, 2025

Merge branch 'microsoft:main' into integers

986ba2a

AlexGuteniev added 6 commits November 28, 2025 19:39

Merge branch 'main' into integers

6966faf

unshare table

1ffa338

unrevert merge

f422b45

format

b39a620

eliminate the tail loop, we need at most one iteration

77d37ff

size

c1efce4

StephanTLavavej removed the "decision needed" (We need to choose something before working on this) label Dec 7, 2025

AlexGuteniev and others added 11 commits December 19, 2025 18:16

Merge branch 'microsoft:main' into integers

ad8e94f

Use a multi-dim array with a constructor.

edc2042

Transform control flow to be simpler. No additional branches.

3708bbf

Use mt19937_64.

6c29273

Use 20 chars, comment why.

333e3f6

Fix major bug: _UIntegral_to_buff takes the END of the buffer.

395c8f6

Adjust header inclusions.

65138c1

Benchmark wchar_t.

33fef86

Add additional correctness tests for every length.

4b706e5

StephanTLavavej reviewed Jan 7, 2026

StephanTLavavej approved these changes Jan 7, 2026

StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Jan 7, 2026

StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Jan 7, 2026

StephanTLavavej added 2 commits January 7, 2026 13:57

You want me to give him the CLAMPS, boss?

77ab0a7

Clamp a while. Clamp forever!

74064c9

MattStephanson reviewed Jan 8, 2026

StephanTLavavej merged commit 641410d into microsoft:main Jan 8, 2026
45 checks passed

github-project-automation bot moved this from Merging to Done in STL Code Reviews Jan 8, 2026

AlexGuteniev deleted the integers branch January 8, 2026 16:56

AlexGuteniev added a commit to AlexGuteniev/STL that referenced this pull request Jan 11, 2026

Apply microsoft#5691 to another place

89c1d4b

StephanTLavavej removed this from STL Code Reviews Mar 4, 2026
