Author: Janek Bevendorff
License: Apache License 2.0
Summary: A collection of robust and fast processing tools for parsing and analyzing (not only) web archive data.
Latest version: 0.16.0
Required dependencies: apache_beam | fastwarc
Optional dependencies: beautifulsoup4 | boto3 | click | elasticsearch | joblib | langid | pytest | pytest-cov | resiliparse | selectolax | tqdm
Downloads last day: 66,167
Downloads last week: 517,622
Downloads last month: 1,287,222