gh-141336: Fix Unicode escape assertion failure in error handler #141344

Open
mohsinm-dev wants to merge 2 commits into python:main from mohsinm-dev:fix-unicode-escape-assertion-gh-141336

@mohsinm-dev mohsinm-dev commented Nov 10, 2025

Contributor

Fixes an assertion failure in Unicode escape decoding when a custom error handler returns a single-character replacement and rewinds the input position.

Root Cause

The issue occurs in unicode_decode_call_errorhandler_writer() when:

  1. The error handler returns a replacement string of length 1 (replen == 1)
  2. The error handler rewinds the input position (increasing the remaining input bytes)

In that case the original code did not account for the replacement length in its buffer calculations, which led to assertion failures in debug builds.
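
For context, a decode error handler returns a (replacement, position) tuple, and the position tells the decoder where to resume. The sketch below registers a handler that returns a one-character replacement and resumes right after the bad bytes. It deliberately does not rewind, so it does not exercise the failing path described above. The handler name and the registered name "demo.single_char" are hypothetical and chosen for illustration only.

```python
import codecs


def single_char_handler(exc):
    # Hypothetical handler for illustration; it only deals with decode errors.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # Return a replacement of length 1 and resume right after the bad bytes.
    # A rewinding handler would return a position smaller than exc.end.
    return ("?", exc.end)


codecs.register_error("demo.single_char", single_char_handler)

# b"ab\\x" ends with a truncated \x escape, which the decoder reports
# to the registered handler instead of raising.
result = b"ab\\x".decode("unicode_escape", errors="demo.single_char")
print(result)
```

The case this PR targets differs from the sketch in one respect: the handler rewinds, so more input remains to be decoded after the replacement has been written.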

Fix

  • Always include full replacement length in writer.min_length calculations
  • Change condition from replen > 1 to replen > 0
  • Add the full replen instead of replen - 1 to maintain the buffer invariant
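
The arithmetic in the list above can be modelled in a few lines of Python. This is only a sketch of the reservation rule as the bullets describe it, not CPython's C source, and the function names reserve_before and reserve_after are hypothetical.

```python
def reserve_before(replen: int) -> int:
    """Slots added to min_length for the replacement under the old rule."""
    # Old condition: replen > 1, adding replen - 1.
    return replen - 1 if replen > 1 else 0


def reserve_after(replen: int) -> int:
    """Slots added to min_length for the replacement under the new rule."""
    # New condition: replen > 0, adding the full replen.
    return replen if replen > 0 else 0


# Compare both rules for a few replacement lengths.
for replen in (0, 1, 2, 5):
    print(replen, reserve_before(replen), reserve_after(replen))
```

In this model a replacement of length 1 reserves no extra space under the old rule and one slot under the new rule, which is the case the PR description singles out.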

Testing

  • Reproduction case now works without assertion failure
  • All codec callback tests pass (43/43)
  • Specific mutating decode handler test passes
  • No regressions detected in broader codec tests

Fixes #141336

When unicode_decode_call_errorhandler_writer() processes a replacement
string of length 1 and the error handler rewinds the input position
(increasing remaining input), the original code failed to account for
the replacement length in writer.min_length calculations.

This resulted in assertion failures when writer capacity was insufficient
after writing the replacement: assert(end - s <= writer.size - writer.pos).

The fix ensures the full replacement length is always included in
writer.min_length, regardless of replacement size, maintaining the
buffer capacity invariant for all decoders using this error helper.

@serhiy-storchaka self-requested a review November 10, 2025 14:07

github-actions Bot commented May 2, 2026

This PR is stale because it has been open for 30 days with no activity.

@github-actions Bot added the stale label (Stale PR or inactive for long period of time.) May 2, 2026
Labels

awaiting review, stale (Stale PR or inactive for long period of time.)

Development

Successfully merging this pull request may close these issues.

Assertion failure in Objects/unicodeobject.c _PyUnicode_DecodeUnicodeEscapeInternal2: Assertion 'end - s <= writer.size - writer.pos' failed
