Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation. (by BradyFU)

Awesome-Multimodal-Large-Language-Models Alternatives

Similar projects and alternatives to Awesome-Multimodal-Large-Language-Models

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number indicates a better Awesome-Multimodal-Large-Language-Models alternative or a more similar project.

Awesome-Multimodal-Large-Language-Models reviews and mentions

Posts with mentions or reviews of Awesome-Multimodal-Large-Language-Models. We have used some of these posts to build our list of alternatives and similar projects. The most recent one was on 2023-12-06.
  • Don't we need a leaderboard for visual models?
    1 project | /r/LocalLLaMA | 6 Dec 2023
    There is this one: https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Evaluation, as well as a leaderboard from OpenCompass (probably outdated): https://mmbench.opencompass.org.cn/leaderboard
  • Recommended open LLMs with image input modality?
    3 projects | /r/LocalLLaMA | 8 Jul 2023
    https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Evaluation is pretty comprehensive. tl;dr: BLIP is probably the best, though I've heard it does need a lot of VRAM. In my experience it's the most responsive to prompt engineering.