Author:
None
Summary:
Convert PDF, DOCX, HTML, and TXT files, or web pages by URL, to clean, LLM-optimized Markdown with YAML frontmatter.
Latest version:
1.1.1
Required dependencies:
beautifulsoup4 | defusedxml | lxml | mammoth | markdownify | pillow | pymupdf | pymupdf4llm | tomli | trafilatura | urllib3
Optional dependencies:
docling | pytest | pytest-snapshot | pyyaml | reportlab | ruff
Downloads last day:
29
Downloads last week:
409
Downloads last month:
1,175