Author:
HawkClaws
License:
MIT
Summary:
A library for extracting the main content from HTML. Developed for supplying information to LLMs and for feeding data into LangChain and LlamaIndex.
Latest version:
0.0.4
Required dependencies:
beautifulsoup4 | html2text | trafilatura
Downloads last day:
56,074
Downloads last week:
193,672
Downloads last month:
1,869,598