Author:
ModelCloud
License:
Apache 2.0
Summary:
Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Latest version:
2.2.0
Optional dependencies:
auto_round | bitblas | clearml | evalplus | fastapi | flashinfer-python | intel_extension_for_pytorch | isort | lm_eval | mlx_lm | optimum | parameterized | plotly | pydantic | pytest | random_word | ruff | sglang | triton | uvicorn | vllm
Downloads last day:
246
Downloads last week:
2,166
Downloads last month:
7,308