Author: Unstructured Technologies
License: Apache-2.0
Summary: A library that prepares raw documents for downstream ML tasks.
Latest version: 0.15.13
Required dependencies: backoff | beautifulsoup4 | chardet | dataclasses-json | emoji | filetype | langdetect | lxml | nltk | numpy | psutil | python-iso639 | python-magic | python-oxmsg | rapidfuzz | requests | tabulate | tqdm | typing-extensions | unstructured-client | wrapt
Optional dependencies: adlfs | astrapy | atlassian-python-api | azure-search-documents | boto3 | boxfs | bs4 | chromadb | clarifai | confluent-kafka | databricks-sdk | deltalake | discord-py | dropboxdrivefs | effdet | elasticsearch | fsspec | gcsfs | google-api-python-client | google-cloud-vision | htmlbuilder | hubspot-api-client | importlib-metadata | langchain | langchain-community | langchain-google-vertexai | langchain-huggingface | langchain-openai | langchain-voyageai | langdetect | markdown | mixedbread-ai | msal | networkx | notion-client | office365-rest-python-client | onnx | openai | openpyxl | opensearch-py | paddlepaddle | pandas | paramiko | pdf2image | pdfminer.six | pi-heif | pikepdf | pinecone-client | praw | psycopg2-binary | pyairtable | pygithub | pymongo | pypandoc | pypdf | python-docx | python-gitlab | python-pptx | qdrant-client | s3fs | sacremoses | sentencepiece | simple-salesforce | singlestoredb | slack-sdk | tenacity | tiktoken | torch | transformers | typer | unstructured-inference | unstructured.paddleocr | unstructured.pytesseract | urllib3 | weaviate-client | wikipedia | xlrd
Downloads last day: 59,729
Downloads last week: 343,393
Downloads last month: 1,510,278