Author: Manuel de Prada Corral
License: GPL-3.0-only
Summary: Perplexity filter for documents and bulk HTML and WARC boilerplate removal.
Latest version: 0.2.12
Required dependencies: cached-path | flask | html5lib | lxml | memory-tempfile | nltk | pandas | storable | typer | warcio
Downloads last day: 296
Downloads last week: 3,390
Downloads last month: 54,643