llava-caption

Author: David Van de Ven
License: GPL3
Summary: Automatically caption images using various LLaVA multimodal models. The tool runs each image through a vision-language model and generates a caption for it.
Latest version: 0.8.0
Required dependencies: httpx | huggingface-hub | json-repair | llama-cpp-python | mlx | mlx-vlm | ollama | pandas | pillow | requests | torch | tqdm | transformers
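
The dependency list above is long, so it can be useful to check which of those distributions are already present in an environment before installing. The following is a minimal sketch using only the Python standard library (`importlib.metadata`); it does not call anything from llava-caption itself. The helper name `installed_versions` and the `REQUIRED` list are illustrative assumptions, with the names copied from the line above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the "Required dependencies" line.
REQUIRED = [
    "httpx", "huggingface-hub", "json-repair", "llama-cpp-python", "mlx",
    "mlx-vlm", "ollama", "pandas", "pillow", "requests", "torch", "tqdm",
    "transformers",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; not part of llava-caption.
    """
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running the script prints one line per dependency, which makes it easy to see what a fresh install would need to pull in.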

Downloads last day: 0
Downloads last week: 55
Downloads last month: 224