PyPI Stats

unstructured


Author: Unstructured Technologies
License: Apache-2.0
Summary: A library that prepares raw documents for downstream ML tasks.
Latest version: 0.14.2
Required dependencies: backoff | beautifulsoup4 | chardet | dataclasses-json | emoji | filetype | langdetect | lxml | nltk | numpy | python-iso639 | python-magic | rapidfuzz | requests | tabulate | typing-extensions | unstructured-client | wrapt
Optional dependencies: adlfs | astrapy | atlassian-python-api | azure-search-documents | boto3 | boxfs | bs4 | chromadb | clarifai | databricks-sdk | deltalake | discord-py | dropboxdrivefs | effdet | elasticsearch | fsspec | gcsfs | google-api-python-client | google-cloud-vision | htmlbuilder | hubspot-api-client | huggingface | importlib-metadata | langchain | langchain-community | langchain-google-vertexai | langdetect | markdown | msal | msg-parser | networkx | notion-client | office365-rest-python-client | onnx | openai | openpyxl | opensearch-py | pandas | paramiko | pdf2image | pdfminer.six | pikepdf | pillow-heif | pinecone-client | praw | psycopg2-binary | pyairtable | pygithub | pymongo | pypandoc | pypdf | pytesseract | python-docx | python-gitlab | python-pptx | qdrant-client | s3fs | sacremoses | sentence-transformers | sentencepiece | simple-salesforce | slack-sdk | tiktoken | torch | transformers | typer | unstructured-inference | unstructured.paddleocr | unstructured.pytesseract | urllib3 | weaviate-client | wikipedia | xlrd

Downloads last day: 53,153
Downloads last week: 298,466
Downloads last month: 1,171,970
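
To compare the three figures above on a common scale, they can be converted to average downloads per day. The sketch below is illustrative only: it assumes the "last week" and "last month" windows cover 7 and 30 days respectively (the page does not state the window lengths), and the names `downloads`, `WINDOW_DAYS`, and `daily_average` are hypothetical.

```python
# Download counts copied from the listing above.
downloads = {
    "last_day": 53_153,
    "last_week": 298_466,
    "last_month": 1_171_970,
}

# Assumed window lengths in days; these are not stated on the page.
WINDOW_DAYS = {"last_day": 1, "last_week": 7, "last_month": 30}


def daily_average(counts, window_days):
    """Return the average downloads per day for each reporting period."""
    return {period: counts[period] / window_days[period] for period in counts}


averages = daily_average(downloads, WINDOW_DAYS)
for period, avg in averages.items():
    print(f"{period}: {avg:,.0f} downloads/day")
```

Under these assumptions, the most recent day is above both the weekly average (about 42,638 per day) and the monthly average (about 39,066 per day).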