writer.clone_document_from_reader(reader) issue #1673

@KpLBaTMaN

Description

When trying to clone certain PDF documents with writer.clone_document_from_reader(reader), the call fails with "AttributeError: 'TextStringObject' object has no attribute '_clone'". The failure is reproducible: the same documents raise the same AttributeError every time.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
# Windows-10-10.0.19045-SP0

$ python -c "import pypdf;print(pypdf.__version__)"
# 3.4.1

Code + PDF

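```python
import os

from pypdf import PdfReader, PdfWriter

# One of the failing documents listed in this report
in_file_path = "budgeting-loan-form-sf500.pdf"

# Read the file in non-strict mode
reader = PdfReader(in_file_path, strict=False)
file_name = os.path.basename(in_file_path)  # file name of the PDF

# Create a writer and clone the document into it
writer = PdfWriter()
writer.clone_document_from_reader(reader)  # fails here with AttributeError
```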
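Until the clone itself works, a possible stopgap might be to fall back to copying pages only. The sketch below is an assumption, not a confirmed fix: `clone_with_fallback` and `writer_factory` are hypothetical names, and the fallback relies on `PdfWriter.append_pages_from_reader`, which copies pages but not necessarily document-level data such as form fields, so it may not be good enough for form PDFs like the ones in this report.

```python
def clone_with_fallback(reader, writer_factory):
    """Try a full document clone; on AttributeError, copy pages only.

    `writer_factory` is any zero-argument callable returning a fresh
    writer (for pypdf, the PdfWriter class itself).
    Returns (writer, mode), where mode is "cloned" or "pages-only".
    """
    writer = writer_factory()
    try:
        writer.clone_document_from_reader(reader)
        return writer, "cloned"
    except AttributeError:
        # The failed clone may have left the writer half populated,
        # so start again from a fresh one.
        writer = writer_factory()
        writer.append_pages_from_reader(reader)
        return writer, "pages-only"


# Intended use with pypdf (not executed here):
#   from pypdf import PdfReader, PdfWriter
#   reader = PdfReader(in_file_path, strict=False)
#   writer, mode = clone_with_fallback(reader, PdfWriter)
```

Returning the mode alongside the writer lets the caller log which documents needed the fallback.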

A few examples of the PDF documents that fail when trying to clone.

budgeting-loan-form-sf500.pdf
industrial-injuries-disablement-claim-form-bi100a_1.pdf

Traceback

This is the complete Traceback I see:

pypdf\generic\_data_structures.py:233, in DictionaryObject._clone(self, src, pdf_dest, force_duplicate, ignore_fields)
231 cur_obj = None
232 for (s, c) in objs:
--> 233 c._clone(s, pdf_dest, force_duplicate, ignore_fields + [k])
235 for k, v in src.items():
236 if k not in ignore_fields:

AttributeError: 'TextStringObject' object has no attribute '_clone'
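My reading of the traceback is that the loop calls `_clone` on every collected object, and one of them is a string-like object that does not define it. The sketch below reproduces that failure mode without pypdf. `Cloneable`, `PlainText` and `clone_all` are hypothetical stand-ins, and the `hasattr` guard only illustrates the idea; it is not how pypdf handles or fixes this.

```python
class Cloneable:
    """Stand-in for an object that supports cloning."""

    def __init__(self, value):
        self.value = value

    def _clone(self, src):
        self.value = src.value
        return self


class PlainText(str):
    """Stand-in for a string-like object that defines no `_clone`."""


def clone_all(pairs):
    """Call `_clone` on each (source, copy) pair that supports it.

    Returns the type names of the copies that had to be skipped.
    """
    skipped = []
    for src, copy in pairs:
        if hasattr(copy, "_clone"):
            copy._clone(src)
        else:
            skipped.append(type(copy).__name__)
    return skipped


pairs = [(Cloneable(1), Cloneable(0)), (PlainText("a"), PlainText("b"))]

# Unguarded loop: fails on the string-like object, as in the traceback.
try:
    for src, copy in pairs:
        copy._clone(src)
    message = None
except AttributeError as exc:
    message = str(exc)

# Guarded loop: clones what it can and reports what it skipped.
skipped = clone_all(pairs)
```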


Labels

is-robustness-issue: From a users perspective, this is about robustness
