- LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA), built towards GPT-4V-level capabilities and beyond.
- LaVIN: [NeurIPS 2023] Official implementation of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models".
- MiniGPT-4: Open-sourced code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/).
- image2dsl: An Image-to-DSL (Domain-Specific Language) model. It uses a pre-trained Vision Transformer (ViT) as an encoder to extract image features and a custom Transformer decoder to generate DSL code from those features.
Please read the rules before posting. If you want a model for visual instruction, use LLaVA, LaVIN, or MiniGPT-4.
Sorry to be blunt here, but I think your idea is doable. Basically, you can build an image-to-category model and then feed its output to a fine-tuned Llama. Have a look at one of my experiments in image-to-text: https://github.com/mzbac/image2dsl
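For context, the encoder-decoder setup that image2dsl describes (a ViT encoder feeding image features to a Transformer decoder that emits DSL tokens) can be sketched in a few lines of PyTorch. This is a hypothetical minimal sketch, not code from the repo: a single patch-embedding convolution stands in for the pre-trained ViT, and the vocabulary and model sizes are made up.

```python
import torch
import torch.nn as nn

class ImageToDSL(nn.Module):
    # Hypothetical sketch of the pipeline described above; dimensions,
    # vocab size, and the patch embedding are illustrative only and
    # are not taken from the image2dsl repository.
    def __init__(self, vocab_size=1000, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        # Stand-in for a pre-trained ViT: a 16x16 patch embedding.
        # In practice you would load frozen ViT weights here.
        self.patch_embed = nn.Conv2d(3, d_model, kernel_size=16, stride=16)
        self.token_emb = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        # (B, 3, H, W) -> (B, num_patches, d_model) image features
        memory = self.patch_embed(images).flatten(2).transpose(1, 2)
        tgt = self.token_emb(tokens)
        # Causal mask so each DSL token attends only to earlier tokens.
        seq_len = tokens.size(1)
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        out = self.decoder(tgt, memory, tgt_mask=causal)
        return self.lm_head(out)  # (B, seq_len, vocab_size) logits

model = ImageToDSL()
images = torch.randn(2, 3, 224, 224)
tokens = torch.randint(0, 1000, (2, 12))
logits = model(images, tokens)
print(logits.shape)  # torch.Size([2, 12, 1000])
```

Swapping the conv stub for real frozen ViT patch embeddings, plus autoregressive decoding at inference time, gets you the shape of what that repo does.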