update_page_form_field_values wrong encoding #2035

@coolbombom

Description

Hi

When using update_page_form_field_values, special characters (æ, ø, å in Danish) are not encoded correctly. ø becomes ø
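To illustrate the kind of garbling meant here, a minimal sketch in plain Python (no pypdf involved). It assumes the common case where UTF-8 bytes are read back as Latin-1; whether that is what actually happens inside update_page_form_field_values has not been confirmed.

```python
# Illustration only: reproduces a typical mis-encoding without pypdf.
# The UTF-8 -> Latin-1 mix-up is an assumption, not a confirmed diagnosis.
original = "test æ ø å"

# Encode as UTF-8, then (wrongly) decode the same bytes as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# Reversing the mistake recovers the original text.
recovered = garbled.encode("latin-1").decode("utf-8")
assert recovered == original
```

If the output PDF shows two-character sequences in place of each of æ, ø, å, a mix-up of this kind is a plausible cause.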

As a test, I took the update_page_form_field_values function from PyPDF2 version 3.0.0 (https://pypdf2.readthedocs.io/en/3.0.0/_modules/PyPDF2/_writer.html#PdfWriter.update_page_form_field_values) and used it in place of the one in pypdf 3.13.0. With the old function, æ, ø, å were written correctly, so the problem appears to be in the new update_page_form_field_values function.

The old update_page_form_field_values function has other flaws, though, such as not updating form fields that are already filled. See my issue here: #2034

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.2.12-arch1-1-x86_64-with-glibc2.37

$ python -c "import pypdf;print(pypdf.__version__)"
3.13.0

Code

from pypdf import PdfWriter  # the report is against pypdf 3.13.0

dst_file = "test_output.pdf"
writer = PdfWriter()
writer.append("test.pdf")
# Field value containing the Danish characters that come out garbled
form_fields = {"Text Box 1": "test æ ø å"}
for page in writer.pages:
    writer.update_page_form_field_values(page, form_fields)
with open(dst_file, "wb") as output_stream:
    writer.write(output_stream)
writer.close()

test.pdf

Traceback

no traceback

Labels

workflow-forms (from a user's perspective, forms is the affected feature/workflow)