wizardextract

Author: None
License: WizardExtract — Copyright (C) 2024–2025 Mattia Rubino. This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Publ...
Summary: Text extraction from PDFs, Word files, spreadsheets, and images. Local OCR with Tesseract and optional Azure Document Intelligence for text, tables, and key–value pairs. Includes page/sheet selection and a hybrid PDF mode.
Latest version: 1.0.1
Required dependencies: openpyxl | pip | pymupdf | pytesseract | xlrd
Optional dependencies: azure-ai-documentintelligence | azure-core

Downloads last day: 0
Downloads last week: 11
Downloads last month: 35