Author:
Tiejin Chen
License:
MIT
Summary:
RLUF: Reinforcement Learning with Uncertainty Feedback via Conformal Prediction for DPO
Latest version:
0.1.0
Required dependencies:
accelerate | alpaca-farm | datasets | gensim | numpy | openai | python-dotenv | pyyaml | torch | tqdm | transformers | trl
Optional dependencies:
build | pytest | ruff | twine
Downloads last day:
2
Downloads last week:
18
Downloads last month:
36