Skip to content

Conversation

@OwenSanzas
Copy link
Contributor

Summary

  • Move bounds check BEFORE write for 1-byte ASCII cases to prevent heap-buffer-overflow
  • Fix two locations with the same "write-before-check" pattern (lines 123 and 141)
  • The 2-byte, 3-byte, and 4-byte cases already correctly check before writing

Root Cause

The _safe function designed to prevent buffer overflows had a bug where it writes to the buffer BEFORE checking if there is sufficient space:

// BUG: write FIRST, check AFTER
*utf8_output++ = char(word);
pos++;
if (check_output && size_t(end - utf8_output) == 0) {  // Too late!

Fix

Move the check before the write:

// FIX: check FIRST, write AFTER
if (check_output && size_t(end - utf8_output) < 1) {
  return full_result(error_code::OUTPUT_BUFFER_TOO_SMALL, pos, utf8_output - start);
}
*utf8_output++ = char(word);
pos++;

Test Plan

  • Verified fix with AddressSanitizer
  • Original crash input no longer triggers overflow
  • Normal conversion still works correctly

Fixes: #911

Move bounds check BEFORE write for 1-byte ASCII cases to prevent
heap-buffer-overflow when output buffer is exactly full.

The bug was in two locations where the code writes to the output buffer
before checking if there is sufficient space (write-before-check pattern).
This is inconsistent with the 2-byte, 3-byte, and 4-byte cases which
correctly check before writing.

Fixes: simdutf#911
@lemire
Copy link
Member

lemire commented Jan 12, 2026

Verification of the issue at #913

@lemire
Copy link
Member

lemire commented Jan 12, 2026

@OwenSanzas Can you sync with our main branch to verify the fix ?

@pauldreik
Copy link
Collaborator

I tried this PR on the newly written fuzzer just merged seconds ago. It works!

@pauldreik
Copy link
Collaborator

The newly added tests for this also stopped failing with this patch applied. I suggest we merge it.

@lemire lemire merged commit c172256 into simdutf:master Jan 12, 2026
84 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

# Heap buffer overflow in convert_utf16_to_utf8_safe

3 participants