Author:
None
License:
MIT
Summary:
A data processing and analysis pipeline that runs jobs for data transformation, quality assessment, deduplication, and formatting. Pipelines are configured and executed through YAML configuration files.
Latest version:
0.2.1
Required dependencies:
datasets | faiss-cpu | fuzzywuzzy | huggingface-hub | langchain-community | langchain-core | loguru | numpy | onnxruntime | openai | pandas | pydantic | python-levenshtein | retry | rich | ruamel-yaml | sqlalchemy | typer
Optional dependencies:
black | build | faiss-gpu | flake8 | ipykernel | langchain-community | mypy | pytest | pytest-cov | sentence-transformers | tabulate | torch | transformers | twine
Downloads last day:
17
Downloads last week:
144
Downloads last month:
203