Author:
Shorya Sethia
Summary:
A comprehensive PDF processing toolkit that converts PDFs to Markdown, with AI-powered image and table analysis. It supports local files and URLs, preserves document structure, extracts high-quality images, detects tables using ML models, and generates detailed content descriptions using multiple LLM providers, including OpenAI GPT-4o, Google Gemini, Anthropic Claude, Groq, OpenRouter, and LiteLLM.
Latest version:
4.0.2
Required dependencies:
beautifulsoup4 | docling | docling_core | google-genai | numpy | openai | openpyxl | packaging | pandas | pillow | protobuf | pymupdf | python-dotenv | requests | tqdm
Optional dependencies:
anthropic | groq | litellm | ollama
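Because the four provider packages above are optional, it can help to check which of them are present in the current environment before selecting a provider. The following is a minimal sketch using only the Python standard library. It assumes each distribution's import name matches its listed name, and the function name `available_optional_providers` is hypothetical, not part of this package.

```python
from importlib.util import find_spec

# Assumption: import names match the optional distribution names listed above.
OPTIONAL_PROVIDERS = ("anthropic", "groq", "litellm", "ollama")


def available_optional_providers(names=OPTIONAL_PROVIDERS):
    """Map each module name to True if it can be located in this environment."""
    # find_spec returns None when a top-level module cannot be found.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_optional_providers().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

This only locates the modules without importing them, so it does not trigger any provider-side setup or require API keys.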
Downloads last day:
10
Downloads last week:
81
Downloads last month:
408