Author:
back2matching
License:
Apache-2.0
Summary:
TurboQuant KV cache compression for LLM inference. Open-source pip-installable implementation for HuggingFace models.
Latest version:
0.2.0
Required dependencies:
numpy | scipy | torch | transformers
Optional dependencies:
fastapi | pytest | pytest-benchmark | uvicorn
Downloads last day:
98
Downloads last week:
475
Downloads last month:
3,912