Language Models Can See: Plugging Visual Controls in Text Generation
Why do you think that https://github.com/DavidHuji/CapDec is a good alternative to MAGIC?