PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Why do you think that https://github.com/learning-at-home/hivemind is a good alternative to mixture-of-experts?