PyPI Stats

unstructured


Author: Unstructured Technologies
License: Apache-2.0
Summary: A library that prepares raw documents for downstream ML tasks.
Latest version: 0.16.0
Required dependencies: backoff | beautifulsoup4 | chardet | dataclasses-json | emoji | filetype | langdetect | lxml | nltk | numpy | psutil | python-iso639 | python-magic | python-oxmsg | rapidfuzz | requests | tabulate | tqdm | typing-extensions | unstructured-client | wrapt
Optional dependencies: effdet | google-cloud-vision | langdetect | markdown | networkx | onnx | openpyxl | paddlepaddle | pandas | pdf2image | pdfminer.six | pi-heif | pikepdf | pypandoc | pypdf | python-docx | python-pptx | sacremoses | sentencepiece | torch | transformers | unstructured-inference | unstructured.paddleocr | unstructured.pytesseract | xlrd

Downloads last day: 66,362
Downloads last week: 372,862
Downloads last month: 1,565,038
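
To put the three figures on a common scale, the following is a minimal sketch that converts the weekly and monthly totals into daily averages. The window lengths (7 and 30 days) are assumptions, since the listing does not state them, and the variable names are illustrative only.

```python
# Download counts copied from the listing above.
downloads_last_day = 66_362
downloads_last_week = 372_862
downloads_last_month = 1_565_038

# Assumption: "last week" spans 7 days and "last month" spans 30 days.
WEEK_DAYS = 7
MONTH_DAYS = 30

# Average downloads per day over each window.
weekly_daily_avg = downloads_last_week / WEEK_DAYS
monthly_daily_avg = downloads_last_month / MONTH_DAYS

print(f"Daily average over last week:  {weekly_daily_avg:,.0f}")
print(f"Daily average over last month: {monthly_daily_avg:,.0f}")
print(f"Last day vs. weekly average:   {downloads_last_day / weekly_daily_avg:.2f}x")
```

Comparing the last-day count against these averages gives a rough sense of whether the most recent day was above or below the typical daily volume.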