mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538 (by davidmrau)
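For orientation, here is a minimal usage sketch of the layer from the linked repo. The constructor arguments and the (output, aux_loss) return convention follow the repo's example script as remembered; exact argument names may differ and should be checked against moe.py:

```python
# Hedged usage sketch of davidmrau/mixture-of-experts. The constructor
# arguments and the (output, aux_loss) return value are assumptions based on
# the repo's example script and may differ in the current version.
import torch
from moe import MoE  # moe.py from the linked repository

model = MoE(input_size=1000, output_size=20, num_experts=10,
            hidden_size=64, noisy_gating=True, k=4)

x = torch.rand(32, 1000)    # dummy batch of 32 inputs
y_hat, aux_loss = model(x)  # aux_loss: load-balancing penalty to add to the task loss
print(y_hat.shape)          # expected: torch.Size([32, 20])
```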

Mixture-of-experts Alternatives

Similar projects and alternatives to mixture-of-experts

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number generally means a better mixture-of-experts alternative or higher similarity.

mixture-of-experts reviews and mentions

Posts with mentions or reviews of mixture-of-experts. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-20.
  • [Rumor] Potential GPT-4 architecture description
    2 projects | /r/LocalLLaMA | 20 Jun 2023
  • Local and Global loss
    1 project | /r/pytorch | 4 Mar 2021
    I have a requirement for a training pipeline similar to Mixture of Experts (https://github.com/davidmrau/mixture-of-experts/blob/master/moe.py), but I want to train the experts on a local loss for 1 epoch before predicting outputs from them (which would then be concatenated for the global loss of the MoE). Can anyone suggest the best way to set up this training pipeline? (See the sketch below for one possible setup.)
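
A rough sketch of the two-stage pipeline described in the post above, assuming toy MLP experts, a per-expert local loss, and a simple dense softmax gate for the global MoE loss. The module names (`Expert`), shapes, and single-batch "epoch" are illustrative and not taken from the linked repo:

```python
# Hypothetical two-stage training sketch: experts are first trained on a local
# loss, then their outputs are combined by a gate for a global MoE loss.
# Names, shapes, and hyperparameters are illustrative, not from the linked repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Expert(nn.Module):
    def __init__(self, in_dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):
        return self.net(x)

in_dim, out_dim, num_experts = 16, 4, 3
experts = nn.ModuleList([Expert(in_dim, out_dim) for _ in range(num_experts)])
gate = nn.Linear(in_dim, num_experts)  # simple dense softmax gate

x = torch.randn(256, in_dim)           # dummy data (one batch = one "epoch" here)
y = torch.randint(0, out_dim, (256,))  # dummy labels

# Stage 1: train each expert on its own local loss for one epoch.
# In practice this would iterate over a DataLoader instead of a single batch.
for expert in experts:
    opt = torch.optim.Adam(expert.parameters(), lr=1e-3)
    opt.zero_grad()
    local_loss = F.cross_entropy(expert(x), y)
    local_loss.backward()
    opt.step()

# Stage 2: combine the expert outputs through the gate and optimize a global loss.
opt = torch.optim.Adam(list(gate.parameters()) + list(experts.parameters()), lr=1e-3)
opt.zero_grad()
weights = F.softmax(gate(x), dim=-1)                       # [batch, num_experts]
outputs = torch.stack([e(x) for e in experts], dim=1)      # [batch, num_experts, out_dim]
global_out = (weights.unsqueeze(-1) * outputs).sum(dim=1)  # gate-weighted combination
global_loss = F.cross_entropy(global_out, y)
global_loss.backward()
opt.step()
```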

Stats

Basic mixture-of-experts repo stats
  • Mentions: 2
  • Stars: 818
  • Activity: 5.3
  • Last commit: 6 days ago
