PyPI Stats

auralith-data-pipeline

Author: not specified
Summary: Production-grade data collection and processing pipeline for training LLMs and multimodal AI
Latest version: 0.1.11
Required dependencies: click | datasets | datasketch | ftfy | huggingface-hub | langdetect | numpy | pillow | pyyaml | requests | rich | safetensors | scipy | sentencepiece | soundfile | tqdm | xxhash
Optional dependencies: astropy | azure-storage-blob | black | boto3 | decord | extract-msg | faiss-cpu | google-cloud-storage | h5py | librosa | mlflow | mypy | opencv-python | openpyxl | pdfplumber | pre-commit | psutil | pytest | pytest-asyncio | pytest-cov | python-docx | python-pptx | rarfile | ray | ruff | sentence-transformers | striprtf | timm | transformers | tree-sitter | tree-sitter-languages | wandb | warcio | zarr

Downloads last day: 28
Downloads last week: 81
Downloads last month: 112