See #1269 for further details; this reports another issue I've come across.
Environment
Which environment were you using when you encountered the problem?
$ python -m platform
Linux-5.4.0-122-generic-x86_64-with-glibc2.29
$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.10.3
Code + PDF
This is a minimal, complete example that shows the issue:
import PyPDF2

with open("Segmentation & Activation Lab.pdf", "rb") as f:
    pdfreader = PyPDF2.PdfFileReader(f, strict=False)
    full_content = " ".join([page.extractText() for page in pdfreader.pages])
PDF used above: Segmentation & Activation Lab.pdf
Traceback
This is the complete Traceback I see:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1538, in extractText
    return self.extract_text()
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1510, in extract_text
    return self._extract_text(
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1146, in _extract_text
    cmaps[f] = build_char_map(f, space_width, obj)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_cmap.py", line 21, in build_char_map
    encoding, space_code = parse_encoding(ft, space_code)
  File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_cmap.py", line 124, in parse_encoding
    enc: Union(str, DictionaryObject) = ft["/Encoding"].get_object() # type: ignore
AttributeError: 'NoneType' object has no attribute 'get_object'
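The traceback suggests that `ft["/Encoding"]` resolves to `None` for at least one font on the page. As a possible stopgap on the caller's side (not a fix for the library itself), extraction could be guarded per page so that one failing page does not abort the whole document. The sketch below is only an illustration: `extract_text_tolerant` is a hypothetical helper name, and it assumes each page object exposes `extract_text()`, as the traceback shows for PyPDF2 2.10.3.

```python
from typing import Iterable, List, Tuple


def extract_text_tolerant(pages: Iterable) -> Tuple[str, List[int]]:
    """Join the text of all pages, skipping pages whose extraction fails.

    Hypothetical helper: returns the joined text plus the zero-based
    indices of the pages that raised AttributeError during extraction.
    """
    texts: List[str] = []
    failed: List[int] = []
    for index, page in enumerate(pages):
        try:
            texts.append(page.extract_text())
        except AttributeError:
            # This is the error type shown in the traceback above;
            # record the page index and continue with the next page.
            failed.append(index)
    return " ".join(texts), failed


# Possible usage with the reader from the example above (assumption:
# PyPDF2 is installed and the file is still open while pages are read):
#
#     with open("Segmentation & Activation Lab.pdf", "rb") as f:
#         reader = PyPDF2.PdfFileReader(f, strict=False)
#         full_content, failed_pages = extract_text_tolerant(reader.pages)
```

Catching only `AttributeError` keeps unrelated failures visible, and the returned list of failed indices shows which pages lost their text.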
The PDF can be read using a normal PDF viewer and the PDF even comes from Adobe.
Another example: