Version 7.7.0 (release candidate) #863

lemire · 2025-11-21T05:53:29Z

We are preparing a breaking change following @erikcorry's recent PR on the new functions (utf8_length_from_utf16_with_replacement).

Compared to Erik's version, I made the documentation more explicit since this function will behave differently from the rest of the library.

…860) The next step after utf8_length_from_utf16_with_replacement is almost always to allocate a UTF-8 buffer and then convert the string. Sadly, we have to insert a third pass, to_well_formed_utf16, which converts the unpaired surrogates. Since surrogates are relatively rare, and the _with_replacement functions have already scanned the input, we could skip that third pass if we were given this information along with the UTF-8 length. In my measurements on Ice Lake, this doesn't slow down utf8_length_from_utf16_with_replacement at all.
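To make the idea concrete, here is a minimal scalar sketch of a length computation that also reports whether any unpaired surrogate was seen. It is an illustration only, not the simdutf implementation or its API: the struct `length_and_surrogates` and the function `scalar_utf8_length_with_replacement` are hypothetical names, and the sketch assumes each unpaired surrogate is counted as U+FFFD (3 bytes in UTF-8).

```cpp
#include <cstddef>

// Hypothetical result type for this sketch (not the simdutf API).
struct length_and_surrogates {
  std::size_t utf8_length;      // bytes needed, counting 3 bytes (U+FFFD) per unpaired surrogate
  bool has_unpaired_surrogate;  // true if at least one unpaired surrogate was seen
};

// Scalar illustration: one pass over the UTF-16 input yields both the UTF-8
// length and the information needed to decide whether a fix-up pass is required.
inline length_and_surrogates
scalar_utf8_length_with_replacement(const char16_t *in, std::size_t len) {
  length_and_surrogates r{0, false};
  for (std::size_t i = 0; i < len; ++i) {
    const char16_t c = in[i];
    if (c < 0x80) {
      r.utf8_length += 1;  // ASCII
    } else if (c < 0x800) {
      r.utf8_length += 2;  // two-byte sequence
    } else if (c < 0xD800 || c > 0xDFFF) {
      r.utf8_length += 3;  // non-surrogate code unit in the BMP
    } else if (c <= 0xDBFF && i + 1 < len && in[i + 1] >= 0xDC00 &&
               in[i + 1] <= 0xDFFF) {
      r.utf8_length += 4;  // valid high/low surrogate pair
      ++i;                 // consume the low surrogate as well
    } else {
      r.utf8_length += 3;  // unpaired surrogate, counted as U+FFFD
      r.has_unpaired_surrogate = true;
    }
  }
  return r;
}
```

A caller could allocate `utf8_length` bytes and run the fix-up pass only when `has_unpaired_surrogate` is true, going straight to the conversion otherwise.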

erikcorry and others added 7 commits November 20, 2025 21:09

lint

8f711f1

better documentation.

0880d11

version bump.

95f520e

[no-ci] minor simplification

9affc1d

correct macro name. (!!!)

5983821

removing silly space

78e9d4b

lemire merged commit 94fb52e into master Nov 22, 2025
70 checks passed

3 participants
