Author:
None
License:
WizardExtract — Copyright (C) 2024–2025 Mattia Rubino
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Publ...
Summary:
Text extraction from PDFs, Word files, spreadsheets, and images. Local OCR with Tesseract and optional Azure Document Intelligence for text, tables, and key–value pairs. Includes page/sheet selection and a hybrid PDF mode.
Latest version:
1.0.1
Required dependencies:
openpyxl | pip | pymupdf | pytesseract | xlrd
Optional dependencies:
azure-ai-documentintelligence | azure-core
Downloads last day:
0
Downloads last week:
4
Downloads last month:
12