SEC: Prevent infinite loop from circular xref /Prev references #3655
Merged
stefan6419846 merged 2 commits into py-pdf:main on Feb 22, 2026
Malformed PDFs can contain circular /Prev references in the xref chain (e.g., xref A -> /Prev -> xref B -> /Prev -> xref A). This causes _read_xref_tables_and_trailers() to loop forever, spamming "Overwriting cache for N M" warnings on every iteration as the same objects are re-parsed and re-cached indefinitely.

Fix: Track visited xref offsets in a set. If a startxref value has already been visited, log a warning and break the loop.

Closes py-pdf#3654
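The guard described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual pypdf code: `walk_xref_chain`, the `prev_of` mapping (a stand-in for parsing one xref section and reading its trailer's /Prev), and the warning text are all assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)


def walk_xref_chain(startxref, prev_of):
    """Follow /Prev pointers from startxref, stopping on a repeated offset.

    prev_of is a hypothetical mapping from an xref offset to the /Prev
    offset found in its trailer; a missing key means the chain ends there.
    """
    visited_xref_offsets = set()
    order = []
    while startxref is not None:
        if startxref in visited_xref_offsets:
            # Illustrative message; the wording in pypdf may differ.
            logger.warning("Circular xref chain detected at offset %d, stopping.", startxref)
            break
        visited_xref_offsets.add(startxref)
        order.append(startxref)
        startxref = prev_of.get(startxref)
    return order


# xref A (offset 100) -> /Prev -> xref B (offset 40) -> /Prev -> xref A
print(walk_xref_chain(100, {100: 40, 40: 100}))  # [100, 40]
```

Each offset is processed at most once, so the loop is bounded by the number of distinct xref offsets even when the chain is circular.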
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #3655 | +/- |
|---|---|---|---|
| Coverage | 97.35% | 97.35% | |
| Files | 55 | 55 | |
| Lines | 9916 | 9921 | +5 |
| Branches | 1814 | 1815 | +1 |
| Hits | 9654 | 9659 | +5 |
| Misses | 152 | 152 | |
| Partials | 110 | 110 | |
Add a dedicated test that constructs a synthetic PDF with a self-referencing /Prev trailer entry to verify the circular xref chain detection works correctly and doesn't hang.
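One way such a synthetic file could be built is sketched below. The test actually added in this PR is not shown in the conversation, so the helper name `build_self_referencing_pdf` and the object layout are assumptions; the sketch only constructs the bytes and checks that the trailer's /Prev points at the file's own xref table.

```python
def build_self_referencing_pdf():
    """Return (pdf_bytes, xref_offset) for a tiny PDF whose trailer /Prev
    points back at its own xref table (hypothetical test fixture)."""
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    body = b"%PDF-1.4\n"
    offsets = []
    for obj in objects:
        offsets.append(len(body))  # byte offset where this object starts
        body += obj

    xref_offset = len(body)
    # Each xref entry is exactly 20 bytes: offset, generation, flag, EOL.
    xref = b"xref\n0 3\n0000000000 65535 f \n"
    for off in offsets:
        xref += b"%010d 00000 n \n" % off

    # /Prev deliberately equals the offset of this same xref table.
    trailer = b"trailer\n<< /Size 3 /Root 1 0 R /Prev %d >>\n" % xref_offset
    tail = b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return body + xref + trailer + tail, xref_offset


pdf_bytes, xref_offset = build_self_referencing_pdf()
assert pdf_bytes[xref_offset:xref_offset + 4] == b"xref"
```

With pypdf installed, the bytes could be wrapped in `io.BytesIO` and passed to `pypdf.PdfReader`. With the guard from this PR, that call is expected to return instead of hanging.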
stefan6419846 approved these changes Feb 22, 2026
stefan6419846 added a commit that referenced this pull request Feb 22, 2026
## What's new

### Security (SEC)
- Prevent infinite loop from circular xref /Prev references (#3655) by @rampageservices

### Bug Fixes (BUG)
- Fix wrong LUT size error (#3651) by @stefan6419846
- Fix handling of page boxes defined on `/Pages` (#3650) by @stefan6419846

[Full Changelog](6.7.1...6.7.2)
Summary
Fixes #3654
A crafted PDF with circular `/Prev` references in its cross-reference table causes `_read_xref_tables_and_trailers()` to loop indefinitely, hanging the process. This is a denial-of-service vulnerability (CWE-835) for any application that parses untrusted PDFs.
Root Cause
In `_reader.py`, the `while startxref is not None` loop at line ~874 follows `/Prev` pointers to walk the xref chain but has no guard against circular references. If `/Prev` points back to an already-visited offset, the loop never terminates.
Fix
Added a `visited_xref_offsets` set that tracks every xref offset before processing. If a previously-seen offset is encountered, the loop logs a warning and breaks, consistent with the existing circular-reference guard pattern used elsewhere in pypdf (e.g., outlines traversal after CVE-2026-24688).
Security Note
This is the same vulnerability class as CVE-2026-24688 and GHSA-hm9v-vj3r-r55m. A GitHub Security Advisory / CVE assignment may be appropriate.
Test
Discovered in production while processing LCSC datasheets. The affected PDF triggers "Overwriting cache for X Y" log spam in an infinite loop during xref parsing.