PyPI Stats

datatrove


Author: not listed
License: Apache-2.0
Summary: HuggingFace library to process and filter large amounts of webdata
Latest version: 0.9.0
Required dependencies: dill | fsspec | huggingface-hub | humanize | loguru | multiprocess | numpy | tqdm
Optional dependencies: aiofiles | aiosqlite | bitsandbytes | botok | datasets | datatrove | fasteners | fasttext-numpy2-wheel | faust-cchardet | flask | ftfy | httpx | indic-nlp-library | inscriptis | jieba | khmer-nltk | kiwipiepy | laonlp | lighteval | moto | nltk | numpy | orjson | pandas | pyahocorasick | pyarrow | pyidaungsu-numpy2 | pytest | pytest-rerunfailures | pytest-timeout | pytest-xdist | pythainlp | python-magic | pyvi | pyyaml | ray | regex | rich | ruff | s3fs | sglang | spacy | stanza | tensorflow | tldextract | tokenizers | trafilatura | transformers | typer | urduhack | vllm | warcio | xxhash | zstandard
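
As a rough illustration of how the required dependency list above can be used, the sketch below checks which of those distributions are installed in the current environment. It relies only on the standard-library importlib.metadata module; the REQUIRED list is copied from the listing above, and the helper name installed_versions is a hypothetical name chosen for this example, not part of datatrove.

```python
from importlib.metadata import PackageNotFoundError, version

# Required dependencies as listed on this page for datatrove 0.9.0.
REQUIRED = [
    "dill",
    "fsspec",
    "huggingface-hub",
    "humanize",
    "loguru",
    "multiprocess",
    "numpy",
    "tqdm",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only inspects the local
    environment and does not contact PyPI.
    """
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'not installed'}")
```

The check reports only presence and version. It does not verify that an installed version satisfies any version constraint, since this listing does not state any.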

Downloads last day: 1,475
Downloads last week: 9,487
Downloads last month: 51,967