Shutdown of Apache Tika Corpora #3035

@stefan6419846

Windows tests started failing because the Apache Tika Corpora site was taken offline a few hours ago: https://lists.apache.org/thread/l53lct6hjojwlhsfwcnzgtj5b1kpyo0h

Example error:

FAILED tests/test_page.py::test_extract_text[https://corpora.tika.apache.org/base/docs/govdocs1/932/932446.pdf-tika-932446.pdf] - urllib.error.URLError: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it>

We have to review all affected URLs and find suitable alternatives.
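As a starting point for that review, here is a rough sketch of how the affected URLs could be collected. It assumes the URLs appear as plain string literals in `.py` files under a directory such as `tests`. The helper name `find_affected_urls` is made up for illustration and is not part of the project.

```python
import re
from pathlib import Path

# Host taken from the failing test URL in the error above.
OFFLINE_HOST = "corpora.tika.apache.org"

# Match http(s) URLs on that host, stopping at whitespace, quotes,
# closing brackets, or commas (assumes URLs are plain string literals).
URL_PATTERN = re.compile(
    r"https?://" + re.escape(OFFLINE_HOST) + r"[^\s\"'\]\)>,]*"
)


def find_affected_urls(root):
    """Map each URL on the offline host to the sorted list of files mentioning it."""
    hits = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for url in URL_PATTERN.findall(text):
            hits.setdefault(url, set()).add(str(path))
    return {url: sorted(files) for url, files in hits.items()}
```

Calling `find_affected_urls("tests")` from the repository root should return a dictionary with one entry per URL, which can then be checked one by one for a replacement source.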

Metadata

    Labels

    is-maintenance: Anything that is just internal: Simplifying code, syntax changes, updating docs, speed improvements
    nf-testing: Non-functional change: Testing
