Author:
None
License:
Apache-2.0
Summary:
Convert raw documents into AI-understandable context with intelligent text extraction, table detection, and semantic chunking
Latest version:
0.2.5
Required dependencies:
beautifulsoup4 | chardet | docx2pdf | langchain-text-splitters | olefile | openpyxl | pdf2image | pdfminer-six | pdfplumber | pi-heif | pyhwp | pymupdf | pytesseract | python-docx | python-pptx | striprtf | xlrd
Optional dependencies:
cachetools | langchain | langchain-anthropic | langchain-aws | langchain-community | langchain-core | langchain-google-genai | langchain-openai | langgraph | langsmith | orjson | pandas | psutil | pydantic | pydantic-settings | python-dotenv | python-multipart
Downloads last day:
11
Downloads last week:
50
Downloads last month:
140