Welcome to pdf2docx

pdf2docx is a Python library that extracts data from PDF files with PyMuPDF, parses the page layout with rule-based methods, and generates docx files with python-docx.

pdf2docx is hosted on GitHub and registered on PyPI.

USER GUIDE

Installation
    Install from PyPI
    Install from source code remotely
    Install from source code locally
    Uninstall
Quickstart
    Convert PDF
    Extract table
    Command Line Interface
    Graphic User Interface
Technical Documentation
License and Copyright
    MIT License

API DOCUMENTATION

pdf2docx
    pdf2docx package

Indices and tables

Index
Module Index

This documentation covers all versions up to 0.5.12.