Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Why do you think https://github.com/tspannhw/FLaNK-Halifax is a good alternative to Video-LLaVA?