Will hang on invalid PDFs #77

While doing some testing, I noticed that PyPDF2 will hang if it encounters an invalid PDF. For example, the `skipOverComment` function:

``` python
def skipOverComment(stream):
    tok = stream.read(1)
    stream.seek(-1, 1)
    if tok == b_('%'):
        while tok not in (b_('\n'), b_('\r')):
            tok = stream.read(1)
```

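This will hang indefinitely if the stream ends before the comment's newline: at end of file `stream.read(1)` returns an empty string, which never matches `\n` or `\r`, so the loop never exits.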
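For illustration, here is a minimal sketch of how this one loop could guard against end of file. `skip_over_comment` is a hypothetical standalone variant, not the library's code, and it uses plain bytes literals in place of the `b_` helper so that it runs on its own:

``` python
import io


def skip_over_comment(stream):
    """Skip a '%' comment, stopping at end of line or end of stream."""
    tok = stream.read(1)
    if not tok:
        # Empty stream or already at EOF: nothing to skip.
        return
    stream.seek(-1, 1)
    if tok == b"%":
        while tok not in (b"\n", b"\r"):
            tok = stream.read(1)
            if not tok:
                # EOF inside the comment: stop instead of looping forever.
                break


# A comment with no trailing newline no longer hangs.
truncated = io.BytesIO(b"% no newline")
skip_over_comment(truncated)
```

This only covers a single function; other read loops in the parser may need the same check, which is what the proposals below try to address more generally.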

I would propose three courses of action:

1) Wrap the stream in a class which will raise an exception after a certain number of consecutive empty reads, e.g.:

``` python
class SafeStream(object):
    """Wrap a stream and raise after too many consecutive empty reads."""

    def __init__(self, stream):
        self.stream = stream
        self.seek = stream.seek
        self.tell = stream.tell
        self._empty_reads = 0

    def read(self, *args):
        res = self.stream.read(*args)
        if not res:
            # Empty result (b"" or ""), so the stream is probably at EOF.
            self._empty_reads += 1
            if self._empty_reads > 1000:
                raise Exception("too many empty reads")
        else:
            self._empty_reads = 0
        return res
```

2) Add a script to the repo for automating fuzz testing

3) Fix the bugs as the script from step (2) finds them

What do you think? Would you be open to patches for those?
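To give an idea of what the fuzzing script from (2) might look like, here is a rough sketch. All of the names (`mutate`, `run_with_timeout`, `fuzz`) are hypothetical, and the parser is passed in as a plain callable so the harness does not depend on any particular PyPDF2 entry point. Hang detection uses a daemon thread with a join timeout, which is a simplification: a hung worker keeps running in the background, so a real script would probably run each sample in a subprocess instead.

``` python
import random
import threading


def mutate(data, rng, flips=8):
    """Return a copy of data with a few random bytes overwritten."""
    buf = bytearray(data)
    for _ in range(min(flips, len(buf))):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def run_with_timeout(parse, data, timeout):
    """Run parse(data) in a daemon thread; return 'ok', 'error' or 'hang'."""
    outcome = []

    def target():
        try:
            parse(data)
            outcome.append("ok")
        except Exception:
            # Exceptions are acceptable for invalid input; hangs are not.
            outcome.append("error")

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    # No recorded outcome means the parser was still running at the timeout.
    return outcome[0] if outcome else "hang"


def fuzz(parse, seed_data, iterations=100, timeout=2.0, seed=0):
    """Mutate seed_data repeatedly and collect the inputs that hang parse."""
    rng = random.Random(seed)
    hangs = []
    for _ in range(iterations):
        sample = mutate(seed_data, rng)
        if run_with_timeout(parse, sample, timeout) == "hang":
            hangs.append(sample)
    return hangs
```

With PyPDF2, `parse` could be something along the lines of `lambda data: PdfFileReader(io.BytesIO(data))`, with `seed_data` read from a known-good PDF. Any inputs returned by `fuzz` would then become regression cases for step (3).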

