PyPI Stats

trafilatura


Author: None
License: Apache 2.0
Summary: Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML.
Latest version: 2.0.0
Required dependencies: cchardet | certifi | charset_normalizer | courlan | faust-cchardet | htmldate | justext | lxml | urllib3
Optional dependencies: brotli | flake8 | htmldate | mypy | py3langid | pycurl | pytest | pytest-cov | types-lxml | types-urllib3 | urllib3 | zstandard

Downloads last day: 21,458
Downloads last week: 108,335
Downloads last month: 1,088,540
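
The three download figures cover windows of different lengths, so they are easier to compare as per-day averages. The following is a minimal sketch of that arithmetic, not part of PyPI Stats itself. The helper name `daily_average` is hypothetical, and the window lengths (7 days for a week, 30 days for a month) are assumptions, since the listing does not state the exact window boundaries.

```python
# Download counts copied from the listing above.
downloads = {
    "last_day": 21_458,
    "last_week": 108_335,
    "last_month": 1_088_540,
}

# Assumption: a "week" spans 7 days and a "month" spans 30 days.
WINDOW_DAYS = {"last_day": 1, "last_week": 7, "last_month": 30}


def daily_average(counts, window_days):
    """Return the mean downloads per day for each reporting window."""
    return {name: counts[name] / window_days[name] for name in counts}


averages = daily_average(downloads, WINDOW_DAYS)
for name, value in averages.items():
    print(f"{name}: {value:,.0f} downloads/day")
```

Under these assumptions the monthly average is more than twice the weekly one, which suggests that download volume was higher earlier in the month than in the most recent week.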