PyPI page
Home page
Author: Srihari Unnikrishnan
Summary: Chunk-level KV cache reuse for faster HuggingFace inference
Latest version: 0.7.0
Required dependencies: accelerate | autoawq-kernels | sentencepiece | torch | transformers
Optional dependencies: fastapi | furo | httpx | huggingface_hub | myst-parser | ninja | pydantic | pytest | pytest-asyncio | ruff | safetensors | sphinx | sphinx-copybutton | uvicorn
Downloads last day: 139
Downloads last week: 259
Downloads last month: 548