Author:
None
License:
WizardDocx — Copyright (C) 2024–2025 Mattia Rubino
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public ...
Summary:
Text extraction from Microsoft Word files. Parses Word documents natively and can optionally run local OCR with Tesseract on embedded images or scanned pages. Supports page selection and bytes input. Legacy .doc files are supported for reading only, and OCR is not available for them.
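To illustrate what native parsing from bytes input can look like, here is a minimal sketch that uses only the Python standard library. It is not WizardDocx's API: the helper name docx_bytes_to_text is hypothetical, and the sketch assumes a well-formed .docx whose body text lives in word/document.xml. It handles neither OCR, page selection, nor legacy .doc files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_bytes_to_text(data: bytes) -> str:
    """Hypothetical helper: return paragraph text from a .docx given as bytes."""
    # A .docx file is a ZIP archive; the main body is stored as XML.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Join the text runs (w:t elements) that make up one paragraph.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Reading a file from disk would then be a matter of passing its contents, for example docx_bytes_to_text(open("report.docx", "rb").read()), where report.docx is a placeholder file name.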
Latest version:
1.0.0
Required dependencies:
pip | pytesseract
Downloads last day:
0
Downloads last week:
6
Downloads last month:
23