Author:
None
Summary:
Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the whisper-medium model, designed to improve performance on multilingual speech with minimal impact on the model's original English capabilities. Unlike models that output continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, which makes its output more compatible with large language models (LLMs) for immediate speech understanding.
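The difference between continuous embeddings and discrete tokens can be illustrated with a nearest-neighbour codebook lookup, the basic idea behind vector quantization. The following is a minimal sketch only, not Ichigo Whisper's actual API or training procedure: the `quantize` function, the tiny 2-D codebook, and the frame values are all hypothetical and chosen for illustration.

```python
import math


def quantize(frames, codebook):
    """Map each continuous frame vector to the index of its nearest codebook entry."""
    tokens = []
    for frame in frames:
        # Euclidean distance from this frame to every codebook vector.
        distances = [math.dist(frame, code) for code in codebook]
        # The token is the index of the closest codebook vector.
        tokens.append(distances.index(min(distances)))
    return tokens


# Hypothetical 2-D codebook with three entries (an assumption for illustration;
# a real speech codebook is learned and much larger).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]

# Hypothetical continuous embeddings, one per audio frame.
frames = [(0.1, -0.2), (0.9, 1.2), (-0.8, 0.7), (0.2, 0.1)]

tokens = quantize(frames, codebook)
print(tokens)
```

Each frame is replaced by a single integer, so a sequence of embeddings becomes a sequence of token IDs that a language model could consume in the same way as text tokens.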
Latest version:
2.3.0
Required dependencies:
black | build | datasets | evaluate | gradio | huggingface_hub | jiwer | librosa | lightning | matplotlib | openai_whisper | seaborn | soundfile | torch | torchaudio | transformers | twine | vector_quantize_pytorch | wandb | webdataset
Downloads last day:
0
Downloads last week:
51
Downloads last month:
179