LLaVA Alternatives

Similar projects and alternatives to LLaVA

llama.cpp
Mentions: 786 | Stars: 58,856 | Activity: 10.0 | Language: C++
LLM inference in C/C++

surge
Mentions: 170 | Stars: 2,951 | Activity: 9.5 | Language: C
Synthesizer plug-in (previously released as Vember Audio Surge)

FastChat
Mentions: 83 | Stars: 34,879 | Activity: 9.6 | Language: Python
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

FLiPStackWeekly
Mentions: 84 | Stars: 14 | Activity: 9.9
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...

llama_index
Mentions: 76 | Stars: 32,094 | Activity: 10.0 | Language: Python
LlamaIndex is a data framework for your LLM applications

MiniGPT-4
Mentions: 37 | Stars: 25,015 | Activity: 9.1 | Language: Python
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

llamafile
Mentions: 38 | Stars: 16,040 | Activity: 9.6 | Language: C++
Distribute and run LLMs with a single file.

alpaca_lora_4bit
Mentions: 41 | Stars: 532 | Activity: 8.6 | Language: Python

LoRA
Mentions: 34 | Stars: 9,331 | Activity: 4.7 | Language: Python
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

OpenAdapt
Mentions: 28 | Stars: 619 | Activity: 9.3 | Language: Python
AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models

CogVLM
Mentions: 16 | Stars: 5,355 | Activity: 9.0 | Language: Python
A state-of-the-art-level open visual language model | Multimodal pre-trained model

chatgpt-web
Mentions: 10 | Stars: 1,729 | Activity: 9.3 | Language: Svelte
ChatGPT web interface using the OpenAI API (by Niek)

vimGPT
Mentions: 7 | Stars: 2,504 | Activity: 7.4 | Language: Python
Browse the web with GPT-4V and Vimium

mPLUG-Owl
Mentions: 2 | Stars: 1,974 | Activity: 7.6 | Language: Python
mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model

Segment-Everything-Everywhere-All-At-Once
Mentions: 6 | Stars: 4,089 | Activity: 7.9 | Language: Python
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

LaVIN
Mentions: 4 | Stars: 488 | Activity: 7.4 | Language: Python
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"

Segment-Everything-Everywhere-
Mentions: 2 | Stars: - | Activity: -

image2dsl
Mentions: 1 | Stars: 11 | Activity: 5.3 | Language: Python
This repository contains the implementation of an Image to DSL (Domain Specific Language) model. The model uses a pre-trained Vision Transformer (ViT) as an encoder to extract image features and a custom Transformer Decoder to generate DSL code from the extracted features.

pymobiledevice3
Mentions: 4 | Stars: 1,069 | Activity: 9.7 | Language: Python
Pure python3 implementation for working with iDevices (iPhone, etc...).

InternVideo
Mentions: 3 | Stars: 1,013 | Activity: 8.4 | Language: Python
Video Foundation Models & Data for Multimodal Understanding

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number indicates a better LLaVA alternative or a more similar project.

LLaVA reviews and mentions

Posts with mentions or reviews of LLaVA. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-10.

Show HN: I Remade the Fake Google Gemini Demo, Except Using GPT-4 and It's Real
4 projects | news.ycombinator.com | 10 Dec 2023

Update: For anyone else facing the commercial use question on LLaVA - it is licensed under Apache 2.0. Can be used commercially with attribution: https://github.com/haotian-liu/LLaVA/blob/main/LICENSE
Image-to-Caption Generator
3 projects | /r/computervision | 7 Dec 2023

https://github.com/haotian-liu/LLaVA (fairly established and well supported)
Llamafile lets you distribute and run LLMs with a single file
12 projects | news.ycombinator.com | 29 Nov 2023

That's not a llamafile thing, that's a llava-v1.5-7b-q4 thing - you're running the LLaVA 1.5 model at a 7 billion parameter size further quantized to 4 bits (the q4).
GPT4-Vision is running a MUCH larger model than the tiny 7B 4GB LLaVA file in this example.
LLaVA have a 13B model available which might do better, though there's no chance it will be anywhere near as good as GPT-4 Vision. https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZO...
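
The size figures in the comment above follow from simple arithmetic: parameter count times bits per weight. A minimal sketch of that estimate is below. The function name is hypothetical, and the result covers the language-model weights only; real quantized files also carry metadata, and some formats spend slightly more than the nominal bits per weight, so actual files tend to be somewhat larger.

```python
def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in gigabytes (10^9 bytes).

    Hypothetical helper for illustration; ignores file metadata and any
    per-block overhead that a real quantization format adds.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Compare 16-bit and 4-bit storage for the two model sizes mentioned above.
for label, params in [("7B", 7e9), ("13B", 13e9)]:
    for bits in (16, 4):
        size = weight_size_gb(params, bits)
        print(f"{label} at {bits}-bit: ~{size:.1f} GB")
```

Under these assumptions a 7B model at 4 bits comes to about 3.5 GB of weights, which is in line with the roughly 4 GB file described in the comment once overhead is included.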
FLaNK Stack Weekly for 27 November 2023
28 projects | dev.to | 27 Nov 2023
Using GPT-4 Vision with Vimium to browse the web
9 projects | news.ycombinator.com | 8 Nov 2023

There are open source models such as https://github.com/THUDM/CogVLM and https://github.com/haotian-liu/LLaVA.
Is supervised learning dead for computer vision?
9 projects | news.ycombinator.com | 28 Oct 2023

Hey Everyone,
I’ve been diving deep into the world of computer vision recently, and I’ve gotta say, things are getting pretty exciting! I stumbled upon this vision-language model called LLaVA (https://github.com/haotian-liu/LLaVA), and it’s been nothing short of impressive.
In the past, if you wanted to teach a model to recognize the color of your car in an image, you’d have to go through the tedious process of training it from scratch. But now, with models like LLaVA, all you need to do is prompt it with a question like “What’s the color of the car?” and bam – you get your answer, zero-shot style.
It’s kind of like what we’ve seen in the NLP world. People aren’t training language models from the ground up anymore; they’re taking pre-trained models and fine-tuning them for their specific needs. And it looks like we’re headed in the same direction with computer vision.
Imagine being able to extract insights from images with just a simple text prompt. Need to step it up a notch? A bit of fine-tuning can do wonders, and from my experiments, it can even outperform models trained from scratch. It’s like getting the best of both worlds!
But here’s the real kicker: these foundational models, thanks to their extensive training on massive datasets, have an incredible grasp of image representations. This means you can fine-tune them with just a handful of examples, saving you the trouble of collecting thousands of images. Indeed, they can even learn with a single example (https://www.fast.ai/posts/2023-09-04-learning-jumps)
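
As a rough illustration of the zero-shot prompting the post describes, the sketch below only builds the text prompt that would accompany an image. The `USER: <image>\n... ASSISTANT:` template is an assumption based on the convention commonly used with LLaVA-1.5 checkpoints, not something stated in the post; the function name is hypothetical, and loading the model and running inference are deliberately left out.

```python
# Assumed placeholder that marks where the image features are inserted.
IMAGE_TOKEN = "<image>"


def build_zero_shot_prompt(question: str) -> str:
    """Wrap a plain-language question in an assumed LLaVA-1.5 style template."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return f"USER: {IMAGE_TOKEN}\n{question} ASSISTANT:"


print(build_zero_shot_prompt("What's the color of the car?"))
```

The point of the sketch is that the task is specified entirely by the question string: asking about a different attribute means changing the text, not retraining a model.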
Adept Open Sources 8B Multimodal Model
6 projects | news.ycombinator.com | 18 Oct 2023

Fuyu is not open source. At best, it is source-available. It's also not the only one.
A few other multimodal models that you can run locally include IDEFICS[0][1], LLaVA[2], and CogVLM[3]. I believe all of these have better licenses than Fuyu.
[0]: https://huggingface.co/blog/idefics
[1]: https://huggingface.co/HuggingFaceM4/idefics-80b-instruct
[2]: https://github.com/haotian-liu/LLaVA
[3]: https://github.com/THUDM/CogVLM
AI — weekly megathread!
2 projects | /r/artificial | 15 Oct 2023

Researchers released LLaVA-1.5. LLaVA (Large Language and Vision Assistant) is an open-source large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. LLaVA-1.5 achieved SoTA on 11 benchmarks with only simple modifications to the original LLaVA, and it completed training in ~1 day on a single 8-A100 node.
LLaVA: Visual Instruction Tuning: Large Language-and-Vision Assistant
1 project | news.ycombinator.com | 11 Oct 2023
LLaVA gguf/ggml version
1 project | /r/LocalLLaMA | 19 Sep 2023

Hi all, I’m wondering if there is a version of LLaVA (https://github.com/haotian-liu/LLaVA) that works with GGUF and GGML models? I know there is one for MiniGPT-4, but it just doesn’t seem as reliable as LLaVA. By the looks of it, though, you need at least 24 GB of VRAM to run LLaVA locally, and the 4-bit version still requires 12 GB of VRAM.

Stats

Basic LLaVA repo stats

Mentions

Stars: 17,102
Activity: 9.3
Last Commit: 2 days ago

haotian-liu/LLaVA is an open-source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of LLaVA is Python.
