Author:
None
Summary:
Extreme weight and KV cache compression for LLMs on Apple Silicon (MLX implementation of Google's TurboQuant)
Latest version:
0.3.0
Required dependencies:
mlx | mlx-lm | numpy
Optional dependencies:
build | datasets | pytest | transformers | twine
Downloads last day:
31
Downloads last week:
226
Downloads last month:
997