Cannot `extract_text` from weasyprint generated PDF #242

Generating a PDF with the following code and reading it back with PyPDF2 returns no text from `extract_text`.

``` python
"""
PyPDF2==2.1.0
WeasyPrint==55.0
"""

from io import BytesIO
from PyPDF2 import PdfReader

# Create example
from weasyprint import HTML
stream = BytesIO()
HTML(string="""
<html>
<body>
<div>Hello World</div>
</body>
</html>
""").write_pdf(stream)
stream.seek(0)

# Try to read "Hello World"
reader = PdfReader(stream)
print(reader.pages[0].extract_text())
```

In Kozea/WeasyPrint/issues/290, @liZe points out that other tools are able to extract the text from WeasyPrint output.
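
For anyone who wants to cross-check that locally, here is a minimal sketch. It assumes Poppler's `pdftotext` command-line tool is installed, which is my choice for the comparison and is not mentioned in the linked issue. The helper name `extract_with_cli` is hypothetical. It returns `None` when the tool is not on `PATH`, so it is safe to run either way.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def extract_with_cli(pdf_bytes: bytes, tool: str = "pdftotext") -> Optional[str]:
    """Return the text an external CLI extracts, or None if the tool is missing.

    Hypothetical helper for comparison only; assumes a pdftotext-style
    interface: `<tool> <input.pdf> -` writes the text to stdout.
    """
    exe = shutil.which(tool)
    if exe is None:
        # Tool not installed: nothing to compare against.
        return None

    with tempfile.TemporaryDirectory() as tmp:
        pdf_path = Path(tmp) / "input.pdf"
        pdf_path.write_bytes(pdf_bytes)
        # "-" as the output file asks pdftotext to print to stdout.
        result = subprocess.run(
            [exe, str(pdf_path), "-"],
            capture_output=True,
            text=True,
            check=False,
        )
    return result.stdout
```

With the `stream` from the example above, `extract_with_cli(stream.getvalue())` can be printed next to `reader.pages[0].extract_text()` to see whether the text is present in the file and only PyPDF2 fails to find it.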

