
Fix reference/memory leaks in decode_definite_long_string #290

Merged
agronholm merged 2 commits into agronholm:master from killiancowan82:fix-decode-long-string-refcount-leak on Mar 21, 2026

@killiancowan82
Contributor

Fix two reference/memory leaks in the C extension's long string decoder.

Changes

  1. Missing Py_DECREF(ret) before reassignment after PyUnicode_Concat, leaking all intermediate Unicode objects during chunked decoding (decode_definite_long_string).

  2. Missing PyMem_Free(buffer) on the success path, potentially leaking the scratch buffer used for UTF-8 boundary handling (decode_definite_long_string).
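
To make the two moving parts above concrete, here is a minimal Python model of chunked UTF-8 decoding. It is only a sketch: the function name, the chunk size, and the `carry` variable are hypothetical and do not mirror the C source, which this page does not show. `carry` plays the role of the scratch buffer that holds a multi-byte sequence split across a chunk boundary, and each `result + piece` produces a new string object, which is the step where C code has to release the previous object itself.

```python
def decode_in_chunks(payload: bytes, chunk_size: int = 4) -> str:
    """Decode UTF-8 in fixed-size chunks (illustrative model, not the C code)."""
    result = ""
    carry = b""  # bytes of an incomplete sequence left over from the last chunk
    for offset in range(0, len(payload), chunk_size):
        chunk = carry + payload[offset:offset + chunk_size]
        try:
            piece = chunk.decode("utf-8")
            carry = b""
        except UnicodeDecodeError as exc:
            # Only tolerate an error that runs to the end of the chunk, which
            # is what a sequence cut off by the chunk boundary looks like.
            if exc.end != len(chunk):
                raise
            piece = chunk[:exc.start].decode("utf-8")
            carry = chunk[exc.start:]
        # Concatenation yields a new string. Python drops the old one
        # automatically; a C extension must release it explicitly.
        result = result + piece
    if carry:
        # Leftover bytes at the very end mean the input was invalid or
        # truncated; decoding them raises UnicodeDecodeError.
        carry.decode("utf-8")
    return result
```

In Python the interpreter handles both cleanups, so the model cannot leak; it is meant only to show where the intermediate strings and the boundary buffer come from.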

Checklist

If this is a user-facing code change, like a bugfix or a new feature, please ensure that
you've fulfilled the following conditions (where applicable):

  • [✅] You've added tests (in tests/) which would fail without your patch
    • Not in tests/, but added applicable inputs to scripts/ref_leak_test.py and ran it before and after the changes:
    • Before:
+--------------------+------------------+-------------------------+-------------------------+
| Test               | Encoding         | Decoding                | Round-trip              |
+--------------------+------------------+-------------------------+-------------------------+
| None               | -                | -                       | -                       |
| 10e0               | -                | -                       | -                       |
| 10e12              | -                | -                       | -                       |
| 10e29              | -                | -                       | -                       |
| -10e0              | -                | -                       | -                       |
| -10e12             | -                | -                       | -                       |
| -10e29             | -                | -                       | -                       |
| float1             | -                | -                       | -                       |
| float2             | -                | -                       | -                       |
| str                | -                | -                       | -                       |
| bigstr             | -                | -                       | -                       |
| bytes              | -                | -                       | -                       |
| bigbytes           | -                | -                       | -                       |
| datetime           | -                | -                       | -                       |
| decimal            | -                | -                       | -                       |
| fraction           | -                | -                       | -                       |
| intlist            | -                | -                       | -                       |
| bigintlist         | -                | -                       | -                       |
| strlist            | -                | -                       | -                       |
| bigstrlist         | 254 bytes (/145) | 238 bytes (/78)         | 302 bytes (/50)         |
| dict               | -                | -                       | -                       |
| bigdict            | 254 bytes (/118) | 238 bytes (/109)        | 302 bytes (/39)         |
| set                | -                | -                       | -                       |
| bigset             | -                | -                       | 302 bytes (/248)        |
| bigdictlist        | 254 bytes (/18)  | 5350 bytes (/19)        | 5350 bytes (/9)         |
| objectdict         | -                | -                       | -                       |
| objectdictlist     | 312 bytes (/246) | 21134 bytes (/216)      | 21192 bytes (/112)      |
| tag                | -                | -                       | -                       |
| nestedtag          | -                | -                       | -                       |
| longstr_128k       | -                | 196099388 bytes (/2990) | 129924187 bytes (/1981) |
| longstr_multi_utf8 | -                | 533751762 bytes (/1357) | 431878838 bytes (/1098) |
+--------------------+------------------+-------------------------+-------------------------+
    • After:
+--------------------+------------------+--------------------+-------------------+
| Test               | Encoding         | Decoding           | Round-trip        |
+--------------------+------------------+--------------------+-------------------+
| None               | -                | -                  | -                 |
| 10e0               | -                | -                  | -                 |
| 10e12              | -                | -                  | -                 |
| 10e29              | -                | -                  | -                 |
| -10e0              | -                | -                  | -                 |
| -10e12             | -                | -                  | -                 |
| -10e29             | -                | -                  | -                 |
| float1             | -                | -                  | -                 |
| float2             | -                | -                  | -                 |
| str                | -                | -                  | -                 |
| bigstr             | -                | -                  | -                 |
| bytes              | -                | -                  | -                 |
| bigbytes           | -                | -                  | -                 |
| datetime           | -                | -                  | -                 |
| decimal            | -                | -                  | -                 |
| fraction           | -                | -                  | -                 |
| intlist            | -                | -                  | -                 |
| bigintlist         | -                | -                  | -                 |
| strlist            | -                | -                  | -                 |
| bigstrlist         | 254 bytes (/146) | 294 bytes (/51)    | 358 bytes (/37)   |
| dict               | -                | -                  | -                 |
| bigdict            | 254 bytes (/118) | 238 bytes (/62)    | 302 bytes (/40)   |
| set                | -                | -                  | -                 |
| bigset             | -                | -                  | 302 bytes (/176)  |
| bigdictlist        | 254 bytes (/18)  | 5350 bytes (/13)   | 5350 bytes (/7)   |
| objectdict         | -                | -                  | -                 |
| objectdictlist     | 312 bytes (/236) | 21190 bytes (/147) | 21248 bytes (/91) |
| tag                | -                | -                  | -                 |
| nestedtag          | -                | -                  | -                 |
| longstr_128k       | -                | -                  | -                 |
| longstr_multi_utf8 | -                | -                  | -                 |
+--------------------+------------------+--------------------+-------------------+
  • [N/A] You've updated the documentation (in docs/), in case of behavior changes or new
    features
  • [✅] You've added a new changelog entry (in docs/versionhistory.rst).
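
A before/after comparison like the one in the tables above can be approximated with the standard library's `tracemalloc`. The sketch below is a generic approach and is not taken from scripts/ref_leak_test.py, whose measurement method is not shown on this page; `measure_growth`, `leaky`, and `clean` are hypothetical names, and the two stand-in functions exist only to keep the example self-contained.

```python
import gc
import tracemalloc


def measure_growth(func, iterations=200, warmup=20):
    """Return the net bytes still allocated after calling func() repeatedly."""
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        for _ in range(warmup):  # let caches and lazy initialisation settle
            func()
        gc.collect()
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()
        gc.collect()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        if started_here:
            tracemalloc.stop()
    return after - before


_retained = []


def leaky():
    # Stand-in for a leak: every call keeps 1000 more bytes alive.
    _retained.append(bytes(1000))


def clean():
    # Stand-in for a well-behaved call: the object is released immediately.
    bytes(1000)
```

With cbor2 installed, `func` could be something like `lambda: cbor2.loads(encoded)` for a long string payload. A result that keeps growing with `iterations` suggests a leak, while a result near zero suggests the call releases what it allocates.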

If this is a trivial change, like a typo fix or a code reformatting, then you can ignore
these instructions.

Updating the changelog

If there are no entries after the last release, use **UNRELEASED** as the version.
If, say, your patch fixes issue #123, the entry should look like this:

- Fix big bad boo-boo in the encoder
  (`#123 <https://github.com/agronholm/cbor2/issues/123>`_; PR by @yourgithubaccount)

If there's no issue linked, just link to your pull request instead by updating the
changelog after you've created the PR.

Two reference/memory leaks in the C extension's long string decoder:

1. Missing Py_DECREF(ret) before reassignment after PyUnicode_Concat,
   leaking all intermediate Unicode objects during chunked decoding.

2. Missing PyMem_Free(buffer) on the success path, leaking the scratch
   buffer used for UTF-8 boundary handling.
@coveralls

coveralls commented Mar 21, 2026

Coverage Status

coverage: 94.55% (remained the same)
when pulling 73a6452 on killiancowan82:fix-decode-long-string-refcount-leak
into a8d92dc on agronholm:master.

Owner

@agronholm left a comment

LGTM, thanks!

@agronholm merged commit 54c8ed5 into agronholm:master on Mar 21, 2026
14 checks passed