Suggest an alternative to Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Why do you think that https://github.com/tspannhw/FLaNK-Halifax is a good alternative to Video-LLaVA?