Top 9 Python llava Projects
- LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
- SUPIR: SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
- multimodal-maestro: Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
- awesome-foundation-and-multimodal-models: 👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
Project mention: Show HN: I Remade the Fake Google Gemini Demo, Except Using GPT-4 and It's Real | news.ycombinator.com | 2023-12-10
Update: For anyone else facing the commercial use question on LLaVA - it is licensed under Apache 2.0. Can be used commercially with attribution: https://github.com/haotian-liu/LLaVA/blob/main/LICENSE
I believe the current open-source SOTA is SUPIR (example: https://replicate.com/p/okgiybdbnlcpu23suvqq6lufze). It needs a lot of VRAM, but you can also run it through Replicate. Here's the repo: https://github.com/Fanghua-Yu/SUPIR
Project mention: Show HN: Multimodal Maestro – Prompt tools for use with LMMs | news.ycombinator.com | 2023-11-29
Project mention: Show HN: I scraped 200M Shopify products to build a search engine | news.ycombinator.com | 2024-02-22
I found some things on GitHub you could use. I'm not a dev myself and I'm not sure how scalable these are, but have a look; maybe there's something useful: https://github.com/jhc13/taggui
The category filtering is what I wanted to get at; I think the search would improve a lot.
Project mention: Embed arbitrary modalities (images, audio, documents, etc.) into LLMs | news.ycombinator.com | 2023-12-18
If you have a decent GPU (16 GB+ VRAM) and are using Linux, then this tool I wrote some days ago might do the trick (at least for Google's reCAPTCHA). For now, you have to call main.py every time you see a captcha on a site, and you need the GUI, since I am only using vision via screenshots, no HTML or similar. (Sorry that it's not yet that well optimized. I am currently very busy with lots of other things, but next week I should have time to improve this further. It should still work for basic scraping.) https://github.com/notune/captcha-solver/
Project mention: Exploring Image Classification with Multimodal LLMs | news.ycombinator.com | 2024-01-17
Concept Modeling Techniques: the built-in concept modeling technique in this walkthrough uses GPT-4V and some light prompting to identify each cluster's core concept. This is but one way to approach an open-ended problem. Try using image captioning and topic modeling, or create your own technique!
Python llava related posts
- Compressing Images with Neural Networks
- Show HN: I Remade the Fake Google Gemini Demo, Except Using GPT-4 and It's Real
- Llamafile lets you distribute and run LLMs with a single file
- LLaVA: Visual Instruction Tuning: Large Language-and-Vision Assistant
- LLaVA gguf/ggml version
- Ai trained on photos
- Looking for a pre trained food recognition model
Index
What are some of the best open-source llava projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | LLaVA | 16,101 |
2 | SUPIR | 3,359 |
3 | multimodal-maestro | 942 |
4 | awesome-foundation-and-multimodal-models | 509 |
5 | taggui | 305 |
6 | multi_token | 136 |
7 | captcha-solver | 61 |
8 | LLM-Image-Classification | 13 |
9 | fiftyone-image-captioning-plugin | 5 |