Page merge: 'NullObject' object has no attribute 'get_data' #2157

@stefan6419846

Applying the fix from #2150 to the existing code and then merging a page into a PDF on which remove_text() and remove_images() have been called raises an error.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.14.21-150400.24.81-default-x86_64-with-glibc2.31

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==3.15.5, crypt_provider=('pycryptodome', '3.18.0'), PIL=10.0.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader, PdfWriter

watermark = PdfReader("watermark.pdf").pages[0]

pdf_file = PdfWriter(clone_from="file.pdf")
for page in pdf_file.pages:
    page.merge_page(watermark, over=True)

The watermark is https://github.com/py-pdf/pypdf/files/12428857/watermark.pdf; the cleaned file is abc.pdf.

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "/home/stefan/temp/pdf/run1.py", line 7, in <module>
    page.merge_page(watermark, over=True)
  File "/home/stefan/temp/venv/lib/python3.9/site-packages/pypdf/_page.py", line 1044, in merge_page
    self._merge_page(page2, over=over, expand=expand)
  File "/home/stefan/temp/venv/lib/python3.9/site-packages/pypdf/_page.py", line 1124, in _merge_page
    original_content = self.get_contents()
  File "/home/stefan/temp/venv/lib/python3.9/site-packages/pypdf/_page.py", line 955, in get_contents
    return ContentStream(self[PG.CONTENTS].get_object(), pdf)
  File "/home/stefan/temp/venv/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 1021, in __init__
    stream_data = stream.get_data()
AttributeError: 'NullObject' object has no attribute 'get_data'
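
Reading the traceback, the page's /Contents entry appears to resolve to a NullObject in the cleaned file, and get_data() is then called on it unconditionally. The following is only a minimal sketch of that failure mode and of one possible guard. The classes and both read_contents_* functions are hypothetical stand-ins written for illustration; they are not pypdf's actual implementation or the eventual fix.

```python
from typing import Optional, Union


class NullObject:
    """Hypothetical stand-in for a PDF null value: it carries no stream data."""


class StreamObject:
    """Hypothetical stand-in for a content stream holding raw bytes."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def get_data(self) -> bytes:
        return self._data


Contents = Optional[Union[NullObject, StreamObject]]


def read_contents_unguarded(contents: Contents) -> bytes:
    # Mirrors the failing pattern: assumes the object always has get_data().
    return contents.get_data()  # AttributeError if contents is a NullObject


def read_contents_guarded(contents: Contents) -> bytes:
    # Assumption: a missing or null /Contents is treated as an empty stream.
    if contents is None or isinstance(contents, NullObject):
        return b""
    return contents.get_data()


if __name__ == "__main__":
    print(read_contents_guarded(StreamObject(b"q Q")))
    print(read_contents_guarded(NullObject()))
    try:
        read_contents_unguarded(NullObject())
    except AttributeError as exc:
        print("unguarded access failed:", exc)
```

Whether an empty stream is the right fallback for merge_page is a design question for the maintainers; the sketch only shows that checking for the null case before calling get_data() avoids the AttributeError.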
