Releases: simdutf/simdutf

Version 8.0.0

13 Jan 08:30
7f3757f

Major changes

The major change in this release is that most simdutf functions are now constexpr, i.e., they can be executed at compile time. For example, you can validate that a string is proper UTF-8 at compile time:

static_assert(simdutf::validate_utf8(s));

The constexpr interface requires C++23. (You can still use simdutf with C++11.)
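
To illustrate what compile-time validation means without depending on the library, here is a minimal, self-contained sketch. The function is_valid_utf8 is a hypothetical scalar checker written for this example (it is not simdutf's implementation, and it only assumes C++17); with simdutf 8.0.0 and C++23 you would call simdutf::validate_utf8 instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical scalar UTF-8 checker, NOT simdutf's implementation.
// Being constexpr, it can be evaluated inside a static_assert.
constexpr bool is_valid_utf8(std::string_view s) {
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) { ++i; continue; }  // ASCII byte
    std::size_t n = 0;       // number of continuation bytes expected
    std::uint32_t cp = 0;    // decoded code point
    std::uint32_t min = 0;   // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0) { n = 1; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { n = 2; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { n = 3; cp = b0 & 0x07; min = 0x10000; }
    else return false;  // stray continuation byte or invalid lead byte
    if (s.size() - i <= n) return false;  // truncated sequence
    for (std::size_t k = 1; k <= n; ++k) {
      const unsigned char b = static_cast<unsigned char>(s[i + k]);
      if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min) return false;                      // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate code point
    i += n + 1;
  }
  return true;
}

// Checked by the compiler; a malformed literal would fail the build.
static_assert(is_valid_utf8("hello"));
static_assert(is_valid_utf8("caf\xC3\xA9"));
static_assert(!is_valid_utf8("\xC0\x80"));  // overlong encoding is rejected
```

The benefit of the constexpr interface is the same as in this sketch: an invalid string literal becomes a build error rather than a runtime failure.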

Another major change is the introduction of a C API. You can now easily call simdutf from C (although you still need to link against a C++ library, either statically or at runtime). This C API should make it easier to write wrappers to simdutf from other programming languages. We now include a C header as part of our releases.

What's Changed

Because we fixed a couple of bugs, including a potential buffer overflow in convert_utf16_to_utf8_safe, we recommend that all users of the library update to 8.0.0. There are no breaking changes.

Full Changelog: v7.7.1...v8.0.0

Version 7.7.1

20 Dec 16:52

What's Changed

  • Do not use include inside our namespaces by @lemire in #870
  • add simdutf constexpr more thoroughly by @lemire in #864
  • optimize utf16 validation on icelake by @anonrig in #873
  • optimize utf32 validation on icelake by @anonrig in #872
  • Fix aarch64 constexpr build error by @pauldreik in #875
  • introduce cmake option SIMDUTF_FAST_TESTS by @pauldreik in #876
  • better documentation for maximal_binary_length_from_base64 by @lemire in #871
  • Treat C++20 char8_t as byte-like by @leezaj in #877
  • Include validate_utf16le_as_ascii inside UTF16 and ASCII features by @leezaj in #878
  • Improving the performance of validate_ascii by @lemire in #879 (credit to @ChALkeR for raising the issue)

Full Changelog: v7.7.0...v7.7.1

Version 7.7.0

22 Nov 03:31
94fb52e

⚠️ This version introduces a breaking change in utf8_length_from_utf16_with_replacement. cc @anonrig. We allow the breaking change on the assumption that nobody has had time to adopt our new function; if anyone has, the patch is trivial. Introducing such breaking changes is against the practice of simdutf, so this is an exception.

What's Changed

  • Return more information from utf8_length_from_utf16_with_replacement by @erikcorry in #860

Full Changelog: v7.6.0...v7.7.0

Version 7.6.0

18 Nov 18:19

Full Changelog: v7.5.0...v7.6.0

Version 7.5.0

16 Oct 19:26

What's Changed

  • Implement rvv validate_utf16_as_ascii function by @tantei3 in #836
  • Enable SIMD generic validate_utf16_as_ascii for lsx + lasx + ppc64 by @tantei3 in #837
  • Implement to_well_formed_utf16 for rvv by @tantei3 in #838
  • Typo fix: parem to param by @jasseeeem in #841
  • utf16fix_block_rvv: improve mask shift by @camel-cdr in #842
  • converting binary data to base64 with lines by @lemire in #840

Full Changelog: v7.4.0...v7.5.0

Version 7.4.0

24 Aug 00:24

What's Changed

  • improving support for legacy GCC and validate_utf16_as_ascii by @lemire in #833. This fixes both #832 and #831.

The new feature of this minor release is that we can check whether a UTF-16 string is 'ASCII', meaning that it can be converted to ASCII without any loss. This was requested by @trflynn89 of the Ladybird project.

/**
 * Validate that the UTF-16 string is ASCII.
 * A UTF-16 sequence is considered an ASCII sequence
 * if it could be converted to an ASCII string losslessly.
 *
 * Overridden by each implementation.
 *
 * @param buf the UTF-16 string to validate.
 * @param len the length of the string in 16-bit code units (char16_t).
 * @return true if and only if the string is valid ASCII.
 */
simdutf_warn_unused bool validate_utf16_as_ascii(const char16_t *buf,
                                                 size_t len) noexcept;
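
In scalar terms, the check described above amounts to verifying that every 16-bit code unit is below 0x80. The following is a minimal reference sketch of that behavior; utf16_is_ascii_scalar is a hypothetical name used only for illustration, it is not the SIMD implementation shipped in simdutf, and it assumes that len counts char16_t code units.

```cpp
#include <cstddef>

// Hypothetical scalar reference, NOT simdutf's SIMD implementation.
// Assumption: len is the number of char16_t code units in buf.
constexpr bool utf16_is_ascii_scalar(const char16_t *buf,
                                     std::size_t len) noexcept {
  for (std::size_t i = 0; i < len; ++i) {
    // Any code unit above 0x7F cannot be represented in ASCII,
    // so the conversion would be lossy.
    if (buf[i] > 0x7F) {
      return false;
    }
  }
  return true;
}
```

A sketch like this can serve as a baseline when testing that the optimized function returns the same answer on a given input.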

Full Changelog: v7.3.6...v7.4.0

Version 7.3.6

14 Aug 18:26

What's Changed

This patch should only concern users of the trim_partial_utf16 function.

Full Changelog: v7.3.5...v7.3.6

Version 7.3.5

09 Aug 03:49

What's Changed

  • Improving the performance of simdutf::find and adding a benchmark for simdutf::find.

Version 7.3.4

01 Aug 04:11

What's Changed

  • fixing Issue 824 by @lemire in #825. We are fixing a minor issue (defined as standard compliance with the TC39 base64 proposal). When using last_chunk_handling::stop_before_partial, we would sometimes consume too many or too few characters, the difference being due to ignorable characters. Thus, for a string made entirely of spaces, we might report consuming no characters, whereas the proposed standard wants us to consume all characters. Conversely, when stopping before a partial chunk, we would consume all characters up to that point, whereas the proposed standard wants us to consume no trailing ignorable characters. Thanks to @syg for the report. Ping to @trflynn89, @Constellation, and @anonrig. We added corresponding tests.
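
The consumption rule described in this entry can be sketched with a simplified model. The helper consumed_stop_before_partial below is hypothetical (it is not a simdutf function) and assumes the input holds only base64 alphabet characters and ignorable ASCII whitespace, with no '=' padding and no invalid characters; it only reports how many input characters would be consumed.

```cpp
#include <cstddef>
#include <string_view>

// ASCII whitespace treated as ignorable in this simplified model.
constexpr bool is_ignorable(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

// Hypothetical helper, NOT a simdutf function. Assumes the input contains
// only base64 alphabet characters and ignorable whitespace (no '=' padding).
// Returns the number of input characters consumed when stopping before a
// partial chunk.
constexpr std::size_t consumed_stop_before_partial(std::string_view in) {
  std::size_t chunk = 0;  // significant characters in the current chunk
  std::size_t read = 0;   // index just past the last complete 4-char chunk
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (is_ignorable(in[i])) continue;
    if (++chunk == 4) {
      chunk = 0;
      read = i + 1;
    }
  }
  // No partial chunk pending: consume everything, trailing whitespace
  // included. Otherwise stop right after the last complete chunk, so the
  // ignorable characters that follow it are not consumed.
  return chunk == 0 ? in.size() : read;
}

static_assert(consumed_stop_before_partial("    ") == 4);      // all spaces
static_assert(consumed_stop_before_partial("AAAA  BB") == 4);  // partial tail
```

Under this model, an all-whitespace input is consumed entirely, while the whitespace between a complete chunk and a trailing partial chunk is left unconsumed, which matches the two cases the entry describes.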

Full Changelog: v7.3.3...v7.3.4

Version 7.3.3

13 Jul 14:54

What's Changed

This patch release fixes minor documentation issues, an issue with UTF-8 BOM detection and an issue with stop_before_partial last chunk handling. Thanks to @syg for the report. Ping to @trflynn89, @Constellation, and @anonrig.

Full Changelog: v7.3.2...v7.3.3