Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Why do you think that https://github.com/sshh12/multi_token is a good alternative to multimodal-maestro?