Author:
Qub△se Team
Summary:
Data processing pipeline for LLM training datasets
Latest version:
1.0.0
Required dependencies:
beautifulsoup4 | bertopic | chardet | click | croniter | fastapi | feedparser | gunicorn | jinja2 | kafka-python | langdetect | nltk | numpy | opencv-python | pandas | pdfplumber | pillow | plotly | psutil | pydantic | pymupdf | pytesseract | python-docx | pyyaml | reportlab | requests | rich | scikit-learn | spacy | streamlit | transformers | uvicorn
Optional dependencies:
black | flake8 | mypy | pytest | scrapy | selenium
Downloads last day:
3
Downloads last week:
15
Downloads last month:
26