Author:
Not specified
Summary:
Reinforcement learning for text generation on MLX (Apple Silicon): GRPO/GSPO, environments, rollout, rewards, LoRA/QLoRA
Latest version:
0.1.11
Required dependencies:
aiohttp | gymnasium | mlx | mlx-lm | numpy | psutil | pytest | wandb
Optional dependencies:
aiohttp | black | pydantic | pytest | ruff | scikit-learn | sentence-transformers
Downloads last day:
24
Downloads last week:
40
Downloads last month:
83