BUG: resolve UnboundLocalError for xobjs in _get_image#3684

Merged
stefan6419846 merged 6 commits into py-pdf:main from Yuki9814:fix/xobjs-unbound-in-get-image
Mar 17, 2026

Conversation

@Yuki9814
Contributor

Description

Fixes a potential UnboundLocalError in the _get_image() method of pypdf/_page.py.

The Bug

In pypdf/_page.py, the _get_image() method uses a try-except block to retrieve XObject resources:

try:
    xobjs = cast(
        DictionaryObject, cast(DictionaryObject, obj[PG.RESOURCES])[RES.XOBJECT]
    )
except KeyError:
    if not (id[0] == "~" and id[-1] == "~"):
        raise
# Later...
imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id]))  # BUG: xobjs may be unbound!

If a KeyError is raised (no XObject resources) and the id starts and ends with ~ (indicating an inline image), the exception is silently swallowed and execution continues with xobjs never assigned. Any later code path that reads xobjs, such as the non-inline lookup shown above, then fails with UnboundLocalError: local variable 'xobjs' referenced before assignment.
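
The failure mode can be reproduced without pypdf. The sketch below is a simplified stand-in, not the actual _get_image code: lookup_image, error_name, and the plain dicts used as resources are hypothetical names chosen for illustration, and the inline-image branch of the real method is deliberately left out so that the unassigned read is reachable.

```python
def lookup_image(resources: dict, image_id: str) -> object:
    """Simplified stand-in for the buggy pattern (hypothetical, not pypdf code)."""
    try:
        xobjs = resources["/XObject"]
    except KeyError:
        # Only inline-style ids such as "~0~" swallow the error.
        if not (image_id[0] == "~" and image_id[-1] == "~"):
            raise
    # This sketch has no inline-image branch, so a swallowed KeyError leads
    # straight to a read of the name that was never assigned.
    return xobjs[image_id]


def error_name(resources: dict, image_id: str) -> str:
    """Return the name of the exception raised by lookup_image, if any."""
    try:
        lookup_image(resources, image_id)
    except Exception as exc:
        return type(exc).__name__
    return "no error"
```

With empty resources, an inline-style id reaches the unassigned read and reports UnboundLocalError, while a regular id such as "/Im0" re-raises the original KeyError.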

The Fix

Initialize xobjs = None before the try block and add explicit checks before using it:

xobjs: Optional[DictionaryObject] = None
try:
    xobjs = cast(...)
except KeyError:
    if not (id[0] == "~" and id[-1] == "~"):
        raise

# Later...
if xobjs is None:
    raise KeyError(f"Cannot access image object {id} without XObject resources")
imgd = _xobj_to_image(cast(DictionaryObject, xobjs[id]))

This ensures that if xobjs is None, we raise a clear KeyError instead of crashing with an UnboundLocalError.
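
As a rough illustration of the same guard outside pypdf, the sketch below applies the initialize-then-check pattern to a plain dict. The names lookup_image_fixed and error_message are hypothetical, and the dict-based resources are an assumption made to keep the example self-contained. Only the error message text is taken from the description above.

```python
from typing import Optional


def lookup_image_fixed(resources: dict, image_id: str) -> object:
    """Hypothetical stand-in showing the initialize-then-check pattern."""
    xobjs: Optional[dict] = None  # always bound, even if the lookup fails
    try:
        xobjs = resources["/XObject"]
    except KeyError:
        if not (image_id[0] == "~" and image_id[-1] == "~"):
            raise
    if xobjs is None:
        # Explicit, descriptive error instead of an UnboundLocalError.
        raise KeyError(
            f"Cannot access image object {image_id} without XObject resources"
        )
    return xobjs[image_id]


def error_message(resources: dict, image_id: str) -> Optional[str]:
    """Return the KeyError payload raised by lookup_image_fixed, or None."""
    try:
        lookup_image_fixed(resources, image_id)
    except KeyError as exc:
        return exc.args[0]
    return None
```

In this sketch a regular id with missing resources still surfaces the raw "/XObject" key from the re-raised exception. Only the swallowed inline-style case reaches the new check.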

Testing

The fix maintains backward compatibility:

  • Inline images (id like ~0~) continue to work without XObject resources
  • Non-inline images properly raise KeyError if XObject resources are missing

The xobjs variable was used outside the try-except block that defined it.
If a KeyError was caught and the id started/ended with '~' (inline images),
the code would continue but xobjs would remain undefined, causing
UnboundLocalError when trying to access non-inline images.

Initialize xobjs to None and check before using it.
@codecov

codecov bot commented Mar 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.42%. Comparing base (8f1f4aa) to head (3fc27dd).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3684      +/-   ##
==========================================
+ Coverage   97.41%   97.42%   +0.01%     
==========================================
  Files          55       55              
  Lines        9989     9990       +1     
  Branches     1833     1835       +2     
==========================================
+ Hits         9731     9733       +2     
+ Misses        150      149       -1     
  Partials      108      108              

@Yuki9814 Yuki9814 changed the title from "fix: Resolve UnboundLocalError for xobjs in _get_image method" to "BUG: resolve UnboundLocalError for xobjs in _get_image" Mar 15, 2026
Collaborator

@stefan6419846 stefan6419846 left a comment

Thanks for the report. Please revert all unrelated changes and add a corresponding test to allow me to continue with the review.

@Yuki9814
Contributor Author

Reverted the unrelated formatting changes and kept the _get_image fix minimal. I also added a regression test in tests/test_images.py covering inline image access when /XObject resources are missing.

@Yuki9814
Contributor Author

All requested changes should now be addressed: the unrelated formatting changes were reverted, a regression test was added, and the full CI is now green. If you have time, I'd appreciate another look.

@Yuki9814
Contributor Author

Addressed the latest review notes: I removed the new # pragma: no cover markers from the _get_image change and simplified the regression test to use object() instead of mock.sentinel. I also reran python -m pytest tests/test_images.py -k "inline_image_without_xobject_resources" -q locally, and it passed.

@stefan6419846
Collaborator

Could you please add tests for the error cases as well?

@Yuki9814
Contributor Author

Added the requested error-case coverage as well:

  • inline image lookup now has a test for the missing-inline-images path
  • regular XObject lookup now has a test for missing /XObject resources

I also tightened the _get_image error handling so the non-inline path now raises the intended KeyError message instead of leaking the raw missing-dictionary key, and reran the targeted tests plus ruff locally.
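
The three cases discussed in this thread (inline access without /XObject resources, missing inline images, and missing /XObject resources) can be sketched as plain assertion-based checks. This is not the test code added to tests/test_images.py: get_image, raised_message, and the test function names are hypothetical, resources are modelled as plain dicts, and the "No inline image can be found" message is a placeholder assumed for illustration.

```python
from typing import Optional


def get_image(resources: dict, inline_images: Optional[dict], image_id: str) -> object:
    """Hypothetical stand-in with both lookup paths (not the pypdf implementation)."""
    if image_id[0] == "~" and image_id[-1] == "~":
        if inline_images is None:
            # Placeholder message, assumed for this sketch.
            raise KeyError("No inline image can be found")
        return inline_images[image_id]
    xobjs = resources.get("/XObject")
    if xobjs is None:
        raise KeyError(
            f"Cannot access image object {image_id} without XObject resources"
        )
    return xobjs[image_id]


def raised_message(*args) -> Optional[str]:
    """Return the KeyError payload raised by get_image, or None."""
    try:
        get_image(*args)
    except KeyError as exc:
        return exc.args[0]
    return None


def test_inline_image_without_xobject_resources() -> None:
    marker = object()  # plain sentinel object standing in for an image
    assert get_image({}, {"~0~": marker}, "~0~") is marker


def test_missing_inline_images() -> None:
    assert raised_message({}, None, "~0~") == "No inline image can be found"


def test_missing_xobject_resources() -> None:
    assert raised_message({}, None, "/Im0") == (
        "Cannot access image object /Im0 without XObject resources"
    )
```

Checking the message via exc.args[0] rather than str(exc) avoids the extra quotes that KeyError adds to its string form.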

Collaborator

@stefan6419846 stefan6419846 left a comment

Thanks.

@stefan6419846 stefan6419846 merged commit 04b0a38 into py-pdf:main Mar 17, 2026
18 checks passed
stefan6419846 added a commit that referenced this pull request Mar 23, 2026
## What's new

### Security (SEC)
- Avoid infinite loop in read_from_stream for broken files (#3693) by @stefan6419846

### Robustness (ROB)
- Resolve UnboundLocalError for xobjs in _get_image (#3684) by @Yuki9814

[Full Changelog](6.9.1...6.9.2)
2 participants