extractText() doesn't work on Chinese PDF #252

In my tests, text is extracted without problems from PDFs with purely English content.
For a page with Chinese content, however, nothing readable is extracted.

I suspect the problem is caused by the encoding.
I tried replacing the `u_` function at the following location with the version below:
https://github.com/mstamy2/PyPDF2/blob/master/PyPDF2/utils.py#L246

``` python
def u_(s):
    if sys.version_info[0] < 3:
        return unicode(s, encoding='utf-8')
    else:
        return s
```

But this change did not help: the extracted Chinese text is still unreadable.
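One possible explanation, offered as an assumption rather than a confirmed diagnosis of this PDF: Chinese text in a PDF is often stored as 2-byte glyph codes that only become characters through the font's ToUnicode mapping, so decoding the raw bytes as UTF-8 cannot recover the text. The sketch below illustrates the idea with plain Python. The `TO_UNICODE` table, the glyph codes, and the `decode_with_cmap` helper are all hypothetical and are not part of PyPDF2.

``` python
# Hypothetical ToUnicode-style table: 2-byte glyph code -> Unicode character.
# The codes are invented for illustration; a real PDF defines its own
# mapping inside the font's ToUnicode CMap.
TO_UNICODE = {
    0x0001: u"\u4e2d",  # the character "zhong"
    0x0002: u"\u6587",  # the character "wen"
}


def decode_with_cmap(raw, table):
    """Decode big-endian 2-byte glyph codes using a code -> character table.

    Codes missing from the table become U+FFFD (replacement character).
    """
    data = bytearray(raw)  # indexing yields ints on both Python 2 and 3
    chars = []
    for i in range(0, len(data) - 1, 2):
        code = (data[i] << 8) | data[i + 1]
        chars.append(table.get(code, u"\ufffd"))
    return u"".join(chars)


# Bytes as they might appear in a string operand of a page content stream
# (assumed sample data, not taken from a real file).
raw = b"\x00\x01\x00\x02"

# Decoding as UTF-8 succeeds but yields only control characters.
naive = raw.decode("utf-8")

# Looking each 2-byte code up in the mapping yields the intended text.
mapped = decode_with_cmap(raw, TO_UNICODE)
```

Here `naive` holds four control characters while `mapped` holds the two Chinese characters. If this assumption applies to the file in question, a fix would have to read the font's mapping, not change the codec used in `u_`.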

My environment:
- Python 2.7.10
- OS X El Capitan
- PyPDF2 version 1.25.1

Thank you.

