PyPI Stats

dalla-data-processing


Author: None
License: CC-BY-NC-SA-4.0
Summary: Data processing pipeline with deduplication, stemming, quality checking, and readability scoring, used for the DALLA models
Latest version: 0.0.11
Required dependencies: click | datasets | pyarrow | structlog | tqdm | transformers
Optional dependencies: camel-tools | cffi | dalla-data-processing | pre-commit | pytest | pytest-cov | pyyaml | ruff | sentencepiece | textstat

Downloads last day: 11
Downloads last week: 47
Downloads last month: 85