@lemire lemire commented Jul 12, 2025

Fixes #821

The issue was reported by Shu-yu Guo (@syg) with feedback from Kevin Gibbons (@bakkot).

The fix in this commit is quite simple and involves adding a missing test that we probably should have had.

It also adds more tests (in addition to what @syg provided) and fixes some typos in the documentation: somehow, I was using 'taylor' instead of 'tailor'.

I am deliberately keeping @syg's commit so that they get credit.

This is related to WebKit PR WebKit/WebKit#47926

syg and others added 2 commits July 11, 2025 16:08
"base64 decoding with stop_before_partial returning
OUTPUT_BUFFER_IS_TOO_SMALL for illegal padded chunk"

The issue was reported by Shu-yu Guo.

The fix in this commit is quite simple and involves adding a missing
test.

It also adds more tests and fixes some typos in the documentation.
@lemire lemire changed the title from "Issue821" to "Issue 821" Jul 12, 2025
@lemire lemire requested a review from Copilot July 12, 2025 23:51
Copilot AI left a comment

Pull Request Overview

This PR addresses issue #821 by adding missing tests for padded-base64 handling, strengthening error checks in decoding implementations, and correcting typos in documentation.

  • Introduce a std::string overload of add_simple_spaces and add TC39-based tests for illegal padded chunks.
  • Add guards in scalar and Ice Lake implementations to reject invalid padding lengths.
  • Fix spelling (“taylor” → “tailor”) and expand the README with detailed WHATWG forgiving-base64 documentation.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/base64_tests.cpp | Added string version of add_simple_spaces, new padded-chunk tests, and removed outdated test |
| src/scalar/base64.h | Added check to reject cases where idx + padding_characters > 4 |
| src/icelake/icelake_base64.inl.cpp | Duplicated the invalid-padding-length guard for Ice Lake backend |
| include/simdutf/implementation.h | Fixed “taylor” → “tailor” typos in documentation comments |
| README.md | Expanded WHATWG forgiving-base64 section; corrected grammar and typos |
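
To make the `idx + padding_characters > 4` condition concrete, here is a minimal standalone sketch. It is not the code from src/scalar/base64.h: the function name `padding_overflows_block` is hypothetical, and it assumes that `idx` means the number of data characters left in the final, incomplete 4-character block and that the input contains no whitespace.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of simdutf.
// Assumption: `input` is base64 text with whitespace already removed.
// Returns true when the data characters of the final partial block plus the
// trailing '=' characters cannot fit in a single 4-character block.
inline bool padding_overflows_block(std::string_view input) {
  // Count the trailing '=' characters.
  std::size_t padding_characters = 0;
  while (padding_characters < input.size() &&
         input[input.size() - 1 - padding_characters] == '=') {
    ++padding_characters;
  }
  // Assumption: idx is the number of data characters that do not form a
  // complete 4-character block.
  const std::size_t data_length = input.size() - padding_characters;
  const std::size_t idx = data_length % 4;
  // The condition named in the review summary.
  return idx + padding_characters > 4;
}
```

Under these assumptions, `"AAA=="` is flagged (3 + 2 = 5) while `"AA=="` and `"AAA="` are not (both sum to 4). Inputs that pass this one condition can still be invalid for other reasons, so a real decoder presumably needs further checks.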
Comments suppressed due to low confidence (1)

README.md:1848

  • [nitpick] Grammar issue: change 'We also converting from' to 'We also convert from' or 'We also support converting from'.
We also converting from [WHATWG forgiving-base64](https://infra.spec.whatwg.org/#forgiving-base64-decode) to binary, and back. In particular, you can convert base64 inputs which contain ASCII spaces (' ', '\t', '\n', '\r', '\f') to binary. We also support the base64 URL encoding alternative. These functions are part of the Node.js JavaScript runtime: in particular `atob` in Node.js relies on simdutf.
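
The set of ASCII spaces listed in the quoted README sentence can be illustrated with a small standalone sketch. The helper names `is_forgiving_base64_space` and `strip_ascii_spaces` are hypothetical and are not simdutf functions. The sketch copies the input for clarity and makes no claim about how the library itself skips these characters.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: true for the five ASCII spaces named in the README
// text (' ', '\t', '\n', '\r', '\f').
inline bool is_forgiving_base64_space(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

// Hypothetical helper: returns a copy of `input` with those characters
// removed, which is the form a strict base64 decoder would expect.
inline std::string strip_ascii_spaces(std::string_view input) {
  std::string out;
  out.reserve(input.size());
  for (char c : input) {
    if (!is_forgiving_base64_space(c)) {
      out.push_back(c);
    }
  }
  return out;
}
```

For example, `strip_ascii_spaces("SGVs\nbG8g \tV29y\r\nbGQ=")` yields `"SGVsbG8gV29ybGQ="`.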

lemire and others added 4 commits July 12, 2025 19:52
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
ASSERT_EQUAL(back, expected);
}

// https://github.com/tc39/test262

Thanks. I will fix in a pre-release commit.

@lemire lemire merged commit f5acdb0 into master Jul 13, 2025
71 checks passed
Successfully merging this pull request may close these issues.

base64 decoding with stop_before_partial returning OUTPUT_BUFFER_IS_TOO_SMALL for illegal padded chunk
