Author:
Nikolaos Giovanopoulos
License:
MIT License
Copyright (c) 2025 Nikolaos Giovanopoulos
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associ...
Summary:
Command-line tool to split documents into chunks and automatically generate question–answer datasets, designed for preparing data to fine-tune large language models (LLMs).
Latest version:
0.1.2
Required dependencies:
beautifulsoup4 | markdown | markdownify | numpy | openai | requests | scikit-learn
Optional dependencies:
nltk | pdfplumber | sentence-transformers
Downloads last day:
5
Downloads last week:
18
Downloads last month:
23