Encoding issue in extract_text() #235

I need to read <a href="http://www.bepid.ifce.edu.br/resultado_prova_selecao.pdf">this PDF</a>.
However, the text is not extracted correctly.

```
from PyPDF2 import PdfFileReader

# Read the first page and extract its text
with open('myfile.pdf', 'rb') as f:
    reader = PdfFileReader(f)
    content = reader.getPage(0).extractText()

print(content)
```

This prints:

```
Resultado da Prova de Sele“‰o...
```

But I expected:

```
Resultado da Prova de Seleção...
```

According to <a href="http://stackoverflow.com/questions/33664665/error-in-the-coding-of-the-characters-in-reading-a-pdf/33667470">this answer on Stack Overflow</a>, the problem is in PyPDF.
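
A possible explanation, which I have not verified against this particular PDF: the garbled output matches what you get when Mac Roman bytes are decoded as PDFDocEncoding. In Mac Roman, `ç` is byte `0x8D` and `ã` is byte `0x8B`; in PDFDocEncoding those same bytes map to `“` and `‰`. The sketch below reproduces the garbling and reverses it. `PDFDOC_PARTIAL`, `decode_as_pdfdoc` and `repair` are hypothetical names for illustration, not part of PyPDF2, and the table covers only the two bytes seen here.

```python
# Partial PDFDocEncoding table: only the two bytes involved in this example.
# Assumption: the page's font uses Mac Roman and the bytes were decoded
# as PDFDocEncoding instead.
PDFDOC_PARTIAL = {0x8B: "\u2030", 0x8D: "\u201c"}  # per mille sign, left double quote


def decode_as_pdfdoc(data: bytes) -> str:
    """Decode bytes using the partial table above; other bytes pass through as-is."""
    return "".join(PDFDOC_PARTIAL.get(b, chr(b)) for b in data)


def repair(text: str) -> str:
    """Undo the wrong decoding: map characters back to bytes, then decode as Mac Roman.

    Only handles ASCII plus the characters listed in PDFDOC_PARTIAL.
    """
    reverse = {char: byte for byte, char in PDFDOC_PARTIAL.items()}
    raw = bytes(reverse.get(ch, ord(ch)) for ch in text)
    return raw.decode("mac_roman")


raw = "Seleção".encode("mac_roman")   # the bytes as stored under Mac Roman
garbled = decode_as_pdfdoc(raw)       # what a wrong decoding produces
print(garbled)
print(repair(garbled))
```

If this assumption holds, a real fix would belong in the library's handling of the font's encoding rather than in post-processing like this.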

