Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint
Why do you think that https://github.com/DarshanDeshpande/jax-models is a good alternative to GradCache?