Author:
DeepHarvest Contributors
Summary:
The world's most complete, resilient, multilingual web crawler
Latest version:
1.0.4
Required dependencies:
aiohttp | beautifulsoup4 | boto3 | chardet | charset-normalizer | click | datasketch | extruct | html5lib | langdetect | lxml | mmh3 | networkx | numpy | openpyxl | pandas | pillow | playwright | prometheus-client | psycopg2-binary | pymupdf | pytesseract | python-dateutil | python-docx | python-pptx | pyyaml | redis | scikit-learn | setuptools | simhash | tqdm
Optional dependencies:
black | chromadb | faiss-cpu | flake8 | mypy | psutil | pyarrow | pytest | pytest-asyncio | pytest-cov | sphinx | sphinx-rtd-theme | torch | transformers
Downloads last day:
6
Downloads last week:
79
Downloads last month:
89