PyPI Stats

invisible-rabbit

Author: Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Shrimai Prabhumoye, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ryan Wolf
Summary: Scalable Data Preprocessing Tool for Training Large Language Models
Latest version: 0.5.0
Required dependencies: awscli | beautifulsoup4 | charset-normalizer | comment-parser | crossfit | dask | dask-mpi | distributed | fasttext | ftfy | in-place | jieba | justext | lxml-html-clean | mwparserfromhell | nemo-toolkit | numpy | openai | peft | presidio-analyzer | presidio-anonymizer | pycld2 | resiliparse | spacy | unidic-lite | usaddress | warcio | zstandard
Optional dependencies: cudf-cu12 | cugraph-cu12 | cuml-cu12 | dask-cuda | dask-cudf-cu12 | nvidia-dali-cuda120 | nvidia-nvjpeg2k-cu12 | spacy | timm

Downloads last day: 0
Downloads last week: 8
Downloads last month: 21