
python 3 support #1

Merged
knowah merged 4 commits into py-pdf:master from kushal-kumaran:master on Jul 31, 2012


@kushal-kumaran
Contributor

Hi,

I have changes that let a single source tree build both python 2 and python 3 packages (the latter using 2to3). There are also changes to fix a couple of issues:

  • The spec says only the first 16 bytes of the encryption dictionary's U entry must be compared when the security handler is revision 3 or greater. I have some PDF files (Producer: [iECCM Version 6.0.0] on Windows Vista) where decryption fails if you compare the entirety of U and real_U in the _authenticateUserPassword function.
  • I had to change the readObjectHeader function to skip over comments. This was needed to get the number of pages in one of my PDF files (same producer as in the previous issue). From my limited understanding of the Cross-Reference Table section, this file seems not to conform to the spec (the byte offset was pointing to the start of a comment before the actual object start), but I'm not sure.
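For readers who want to see the shape of the two fixes, here is a minimal sketch. It is not the code from this pull request: the function names `user_key_matches` and `skip_whitespace_and_comments` are made up for illustration, and the real changes live in `_authenticateUserPassword` and `readObjectHeader`.

```python
import io


def user_key_matches(computed_u, stored_u, revision):
    """Compare a computed U value with the one stored in the file.

    Assumption for this sketch: for security handler revision 3 or
    greater only the first 16 bytes are compared; for revision 2 the
    whole value is compared.
    """
    if revision >= 3:
        return computed_u[:16] == stored_u[:16]
    return computed_u == stored_u


def skip_whitespace_and_comments(stream):
    """Advance a seekable binary stream past PDF whitespace and % comments."""
    whitespace = b"\x00\t\n\x0c\r "
    while True:
        byte = stream.read(1)
        if not byte:
            return  # end of data
        if byte in whitespace:
            continue
        if byte == b"%":
            # A comment runs to the end of the line.
            while byte and byte not in b"\r\n":
                byte = stream.read(1)
            continue
        # First byte of a real token: step back so the caller can read it.
        stream.seek(-1, io.SEEK_CUR)
        return
```

With a helper like the second one, a reader that lands on a comment just before `12 0 obj` can still parse the object number that follows.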

I've tested these changes with a script that writes out decrypted versions of some encrypted PDF files, using both python2 and python3. If you guys are interested, you can test it out with your collection.

@claird
Contributor

claird commented Jul 17, 2012

Marvelous! We'll be in touch. Many thanks.

@claird
Contributor

claird commented Jul 17, 2012

I like your work very much. We definitely want to integrate it.
We're all tied up, perhaps to the end of July. We'll stay in
touch with you, though.

@kushal-kumaran
Contributor Author

There's no rush. I'll continue testing and tweaking the code meanwhile.

@knowah
Contributor

knowah commented Jul 21, 2012

Hi kushal-kumaran, I have one issue that maybe you can resolve (I might have done something wrong). I cloned your changes and am testing them for compatibility with Python 2.7 and 3.2. It works fine on 2.7, but after I run the setup.py script on 3.2, I am unable to use PyPDF2. Here's the output (I tried importing just one class and the whole library to no avail):

[knowah@THINK-NJK PyPDF2]$ python
Python 3.2.3 (default, Apr 23 2012, 23:35:30) 
[GCC 4.7.0 20120414 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from PyPDF2 import PdfFileReader
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "PyPDF2/__init__.py", line 1, in <module>
    from pdf import PdfFileReader, PdfFileWriter
ImportError: No module named pdf
>>> import PyPDF2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "PyPDF2/__init__.py", line 1, in <module>
    from pdf import PdfFileReader, PdfFileWriter
ImportError: No module named pdf

I don't really have experience with Python 3, so I may be at fault here. To install I just ran python setup.py install. Do you know why it would be unable to load the pdf module?

@claird
Contributor

claird commented Jul 22, 2012

I probably won't have time to get involved until mid-week.
Feel free to contact the guy yourself; he's at
reply@reply.github.com

@kushal-kumaran
Contributor Author

@knowah is it possible you ran setup.py with python2 instead of python3? When running under python3, setup.py invokes 2to3, which would have changed the import statement in __init__.py from this:

from pdf import PdfFileReader, PdfFileWriter

to this:

from .pdf import PdfFileReader, PdfFileWriter

The leading dot in .pdf indicates that the import is a package-relative import (PEP 328).
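To see the difference on Python 3 without touching PyPDF2 itself, something like the following sketch can help. It builds two throwaway packages in a temporary directory, one with each form of the import line, and tries to import them. The package names `demo_implicit` and `demo_explicit` and the helper `try_import` are invented for this example, and it assumes no unrelated top-level module named `pdf` is installed.

```python
import importlib
import os
import sys
import tempfile


def try_import(init_line, package_name):
    """Create a throwaway package whose __init__.py holds init_line, then import it.

    Returns "ok" on success, or the ImportError message on failure.
    """
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, package_name)
        os.mkdir(pkg_dir)
        # A sibling module named pdf, standing in for PyPDF2/pdf.py.
        with open(os.path.join(pkg_dir, "pdf.py"), "w") as handle:
            handle.write("class PdfFileReader(object):\n    pass\n")
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
            handle.write(init_line + "\n")
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(package_name)
            return "ok"
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(root)
            sys.modules.pop(package_name, None)
            sys.modules.pop(package_name + ".pdf", None)


implicit = try_import("from pdf import PdfFileReader", "demo_implicit")
explicit = try_import("from .pdf import PdfFileReader", "demo_explicit")
print("implicit:", implicit)
print("explicit:", explicit)
```

On Python 3 the first form should fail with an ImportError naming `pdf`, as in the traceback above, while the dotted form imports cleanly.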

The conversion of implicit relative imports to explicit relative imports is one of the conversions done by 2to3. So the problem is likely one of three things:

  • setup.py was run using python2
  • at run time, python is somehow picking up the wrong copy of PyPDF2
  • setup.py was run using python3, but something went wrong (seems unlikely, though, if there were no error messages)
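A quick way to narrow down which of these applies is to ask the interpreter which version it is and where it would load the package from. The sketch below uses only the standard library; `describe_import` is a name made up for this example.

```python
import importlib.util
import sys


def describe_import(package_name):
    """Report the running Python version and the file a package would load from."""
    spec = importlib.util.find_spec(package_name)
    return {
        "python": "%d.%d" % sys.version_info[:2],
        # origin is None when the package cannot be found at all
        "origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # "PyPDF2" is the package from this thread; any top-level name works.
    print(describe_import("PyPDF2"))
```

If the reported origin is inside the source checkout rather than the interpreter's site-packages directory, the unconverted files are the ones being imported.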

@kushal-kumaran
Contributor Author

@knowah I ran into exactly the same issue myself yesterday. Just make sure you are not running python from the directory that contains the PyPDF2 source. During the build, 2to3 runs and converts the sources, but the converted files go to the build directory and the original files are left alone. If you start python from the source directory, it looks at the PyPDF2 package in the current directory first and gets the unconverted versions of the files.
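The shadowing effect can be reproduced in isolation. An interactive python session puts the current directory at the front of `sys.path`, so a package directory sitting there wins over an installed copy. The sketch below mimics that by placing a temporary directory first on the path; `resolve_with_first_entry`, `demo` and the package name `shadowdemo_pkg` are hypothetical names used only here.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def resolve_with_first_entry(package_name, first_entry):
    """Return the file a package resolves to when first_entry leads sys.path."""
    saved_path = list(sys.path)
    sys.path.insert(0, first_entry)
    importlib.invalidate_caches()
    try:
        spec = importlib.util.find_spec(package_name)
        return spec.origin if spec is not None else None
    finally:
        # Restore the original search path.
        sys.path[:] = saved_path
        importlib.invalidate_caches()


def demo():
    """Check that a package in the leading path entry is the one that gets found."""
    with tempfile.TemporaryDirectory() as source_dir:
        pkg_dir = os.path.join(source_dir, "shadowdemo_pkg")
        os.mkdir(pkg_dir)
        # An unconverted __init__.py, like the one in the source checkout.
        # find_spec only locates the file; it does not execute it.
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
            handle.write("from pdf import PdfFileReader\n")
        origin = resolve_with_first_entry("shadowdemo_pkg", source_dir)
        return os.path.realpath(origin).startswith(os.path.realpath(source_dir))
```

Changing to any other directory before starting python removes the source checkout from the front of the search path, which is why the installed, converted copy is found from there.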

knowah added a commit that referenced this pull request Jul 31, 2012
knowah merged commit ad44feb into py-pdf:master Jul 31, 2012
polyglot-jones pushed a commit to polyglot-jones/PyPDF2 that referenced this pull request Aug 11, 2020
The unit tests are currently failing on all Python versions.

Closes py-pdf#1
Fixes py-pdf#36
vashek referenced this pull request in vashek/PyPDF2 May 12, 2021