Author:
None
Summary:
Nested-lattice KV-cache compression for LLM inference: Zamir-Feder D4 and E8 variants with shaping gain over scalar quantisation.
Latest version:
1.5.0
Required dependencies:
torch
Optional dependencies:
build | pytest | transformers | twine
Downloads last day:
0
Downloads last week:
20
Downloads last month:
137
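
The summary above names a nested-lattice quantiser built on D4. As a rough illustration of that idea only, and not the package's actual API, the sketch below rounds a 4-vector to the nearest D4 point (integer vectors with an even coordinate sum) and then reduces it modulo a coarse lattice q·D4, which is the basic Zamir-Feder nested construction. The function names `nearest_d4`, `nested_quantise` and `dequantise`, and the parameters `step` and `q`, are hypothetical choices made for this example.

```python
import math

def nearest_d4(x):
    """Closest D4 point to x (D4 = integer 4-vectors with even coordinate sum)."""
    f = [math.floor(v + 0.5) for v in x]  # round every coordinate to the nearest integer
    if sum(f) % 2 == 0:
        return f
    # Odd sum: flip the rounding of the coordinate with the largest rounding error.
    errs = [v - r for v, r in zip(x, f)]
    k = max(range(4), key=lambda i: abs(errs[i]))
    f[k] += 1 if errs[k] >= 0 else -1
    return f

def nested_quantise(x, step=0.25, q=4):
    """Hypothetical encoder: fine lattice step*D4, coarse lattice (q*step)*D4."""
    p = nearest_d4([v / step for v in x])         # fine lattice point, in units of step
    c = nearest_d4([v / q for v in p])            # nearest coarse point, in units of q
    return [pi - q * ci for pi, ci in zip(p, c)]  # coset representative (the stored code)

def dequantise(code, step=0.25):
    """Map a stored code back to a real-valued vector."""
    return [step * v for v in code]

x = [0.3, -0.2, 0.1, 0.05]
code = nested_quantise(x)
x_hat = dequantise(code)
```

With q = 4 there are 4^4 = 256 cosets per 4-dimensional block, so each block fits in 8 bits (2 bits per dimension). Reconstruction is only accurate while the scaled input stays inside the coarse lattice's Voronoi region; a real implementation would need a scaling or overload strategy, which this sketch leaves out.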