BlackIs1 not supported for CCITTFaxDecode filter #3193

@GiovanniNova

Description

When encoding or decoding 1-bit (Group 4/CCITT) TIFF images, black can be represented by either 0 or 1, depending on the /BlackIs1 entry in /DecodeParms.
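To make the two interpretations concrete, here is a minimal pure-Python sketch. It is not pypdf code: `bits_to_gray` is a hypothetical helper, and it assumes the reading of the flag described above (with /BlackIs1 false a 0 bit is black, with /BlackIs1 true a 1 bit is black).

```python
def bits_to_gray(packed: bytes, width: int, black_is_1: bool = False) -> list[int]:
    """Expand one packed 1-bit row into gray values (0 = black, 255 = white).

    Hypothetical helper for illustration only; bits are read MSB first.
    """
    pixels = []
    for x in range(width):
        bit = (packed[x // 8] >> (7 - x % 8)) & 1
        # Which bit value counts as black depends on the flag.
        is_black = (bit == 1) if black_is_1 else (bit == 0)
        pixels.append(0 if is_black else 255)
    return pixels


row = bytes([0b10100000])  # first four bits: 1, 0, 1, 0
print(bits_to_gray(row, 4, black_is_1=False))  # [255, 0, 255, 0]
print(bits_to_gray(row, 4, black_is_1=True))   # [0, 255, 0, 255]
```

The same packed data yields inverted images depending on the flag, which matches the difference seen between the two PDFs.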

Code + PDF

Simply changing /BlackIs1 from false to true in imagemagick-CCITTFaxDecode.pdf is enough to observe this behaviour.

from pypdf import PdfReader


def extract_images(path):
    """Write every image on the first page of the PDF at `path` to disk."""
    reader = PdfReader(path)
    page = reader.pages[0]

    for count, image_file_object in enumerate(page.images):
        # Output name: <pdf path><index><image name>
        with open(f"{path}{count}{image_file_object.name}", "wb") as fp:
            fp.write(image_file_object.data)


# Standard: black background with a white smiley face.
extract_images('imagemagick-CCITTFaxDecode.pdf')

# Browsers and other PDF viewers display a white background with a black
# smiley face, but the extracted image is the same as for the original PDF.
extract_images('imagemagick-CCITTFaxDecode_BlackIs1-true.pdf')

I suppose the fix should go into '_get_imagemode', but /DecodeParms entries such as /BlackIs1 might be specific to the /CCITTFaxDecode filter, in which case it would probably go into '_get_mode_and_invert_color' or even '_xobj_to_image' instead.
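Wherever the fix ends up, it first needs to read the flag. The sketch below shows one way to do that, and it is not pypdf's implementation. `ccitt_black_is_1` is a hypothetical helper that works on plain dict/list stand-ins for the image XObject dictionary. It assumes that /Filter and /DecodeParms may each be a single value or parallel arrays, and that a wrapped boolean exposes its Python value through a `value` attribute. Indirect references would have to be resolved before calling it.

```python
from collections.abc import Mapping


def ccitt_black_is_1(xobj: Mapping) -> bool:
    """Return the /BlackIs1 flag of the /CCITTFaxDecode filter (default False).

    Hypothetical helper; `xobj` is a dict-like image XObject dictionary.
    """
    filters = xobj.get("/Filter")
    parms = xobj.get("/DecodeParms")

    # Normalise the single-value form to the array form.
    if not isinstance(filters, (list, tuple)):
        filters = [filters]
    if not isinstance(parms, (list, tuple)):
        parms = [parms]

    for name, parm in zip(filters, parms):
        if name == "/CCITTFaxDecode" and isinstance(parm, Mapping):
            flag = parm.get("/BlackIs1", False)
            # Assumption: wrapped booleans carry the real value in `.value`.
            return bool(getattr(flag, "value", flag))
    return False


image = {
    "/Filter": "/CCITTFaxDecode",
    "/DecodeParms": {"/K": -1, "/Columns": 64, "/BlackIs1": True},
}
print(ccitt_black_is_1(image))  # True
```

Because the lookup is keyed on the /CCITTFaxDecode filter name, images using other filters always get the default, so the helper could be called from any of the three functions mentioned above.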

Metadata

Labels

is-bug: From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF
workflow-images: From a user's perspective, image handling is the affected feature/workflow
