maincontentextractor

Author: HawkClaws
License: MIT
Summary: A library to extract the main content from HTML. Developed to supply information to LLMs and to feed data into LangChain and LlamaIndex.
Latest version: 0.0.4
Required dependencies: beautifulsoup4 | html2text | trafilatura

Downloads last day: 1,710
Downloads last week: 14,478
Downloads last month: 86,307