Checkboxes, update_page_form_field_values, encoding and German characters. #2021

@kubaPod

Description

I am filling a form that has text and checkbox fields. Text values may contain German characters.
If a checkbox is selected programmatically, subsequent fields show corrupted text.
This might not be a pypdf fault; see the viewer comparison in the caveats below.

Caveats:

  • If the corrupted field gets focus, the displayed text is correct (it reverts to the incorrect text when focus is lost).

  • The issue is reproduced in Adobe Acrobat Reader.

  • The Chrome viewer shows the correct characters.

  • The Firefox viewer shows incorrect characters (even in the first field) regardless of the checkbox.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Windows-10-10.0.19045-SP0

$ python -c "import pypdf;print(pypdf.__version__)"
3.12.2

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfWriter

writer = PdfWriter()
writer.append("./visa-form.pdf")

# Fill two text fields; uncomment the checkbox entry to trigger the corruption.
writer.update_page_form_field_values(
    writer.pages[0],
    {
        '1 Surname': 'Zürich',
        # 'Check Box 9.1': '/Ja',  # skip or not
        '9': 'Zürich',
    },
)

# The with block closes the file, so no explicit close() is needed.
with open("./zurich-test.pdf", "wb") as output_stream:
    writer.write(output_stream)
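
Because the text displays correctly while the field has focus, one possible check is whether the value stored in the output file is intact and only the rendered appearance is wrong. The sketch below is not part of the original report. It reads the stored values back with pypdf's PdfReader.get_fields() and compares them with what was written. The helper names mismatched_fields and read_stored_values are hypothetical, and the sketch assumes the keys returned by get_fields() match the names passed to update_page_form_field_values, which may not hold for nested fields.

```python
def mismatched_fields(expected, stored):
    """Return {name: (expected, stored)} for every field whose stored value differs."""
    return {
        name: (value, stored.get(name))
        for name, value in expected.items()
        if stored.get(name) != value
    }


def read_stored_values(pdf_path):
    """Map each form field name to its stored /V entry (None if the field has no value)."""
    # Imported lazily so the comparison helper above can be used without pypdf.
    from pypdf import PdfReader

    # get_fields() returns None when the document has no form fields.
    fields = PdfReader(pdf_path).get_fields() or {}
    return {name: field.get("/V") for name, field in fields.items()}


# Usage (assumes the output file written by the script above exists):
# stored = read_stored_values("./zurich-test.pdf")
# print(mismatched_fields({"1 Surname": "Zürich", "9": "Zürich"}, stored))
```

An empty result would suggest that the stored values are fine and the difference between viewers comes from how the field appearance is rendered; a non-empty result would point at the written values themselves.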

visa-form.pdf
zurich-test-no-checkbox.pdf
zurich-test-with-checkbox.pdf
