makeMoE
From-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :) (by AviSoori1x)
mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops (by dvmazur)
| | makeMoE | mixtral-offloading |
|---|---|---|
| Mentions | 3 | 3 |
| Stars | 518 | 2,255 |
| Growth | - | - |
| Activity | 9.0 | 8.6 |
| Last commit | about 2 months ago | about 1 month ago |
| Language | Jupyter Notebook | Python |
| License | MIT License | MIT License |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
makeMoE
Posts with mentions or reviews of makeMoE. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-27.
- DBRX: A New Open LLM
  "This repo I created and the linked blog will help in understanding this: https://github.com/AviSoori1x/makeMoE"
- FLaNK AI Weekly 25 March 2024
- Implementation of mixture of experts language model in a single file of PyTorch
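
The single-file implementation linked above centers on sparse top-k routing: a small gating network scores each token against every expert, only the k highest-scoring experts process that token, and their outputs are blended with the renormalized gate weights. Below is a minimal, self-contained sketch of that pattern in PyTorch; the class name, sizes, and expert MLPs are illustrative assumptions, not code taken from makeMoE.

```python
# Minimal sketch of sparse top-k expert routing in PyTorch.
# Class name, sizes, and expert MLPs are illustrative, not code from makeMoE.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # gating network: one score per expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                              # x: (batch, seq, d_model)
        logits = self.router(x)                        # (batch, seq, n_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        gates = F.softmax(top_vals, dim=-1)            # renormalize over the chosen experts only
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            # gate weight is 0 for tokens that did not route to expert i
            weight = (gates * (top_idx == i)).sum(dim=-1, keepdim=True)
            out = out + weight * expert(x)
        return out

x = torch.randn(2, 8, 64)
print(SparseMoE()(x).shape)  # torch.Size([2, 8, 64])
```

For readability the sketch runs every expert on every token and masks out the non-routed contributions; an implementation that actually saves compute would gather only the routed tokens and dispatch just those to each expert.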
mixtral-offloading
Posts with mentions or reviews of mixtral-offloading. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-27.
- DBRX: A New Open LLM
  "Waiting for Mixed Quantization with HQQ and MoE Offloading [1]. With that I was able to run Mixtral-8x7B on my 10 GB VRAM RTX 3080... This should work for DBRX and should shave off a ton of the VRAM requirement."
1. https://github.com/dvmazur/mixtral-offloading?tab=readme-ov-...
- Mixtral in Colab
- Run Mixtral-8x7B models in Colab or consumer desktops
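
The comment quoted above captures the core trick behind mixtral-offloading: the full set of expert weights does not fit in consumer VRAM, so they are kept in CPU RAM and copied to the GPU only when the router selects them. The sketch below illustrates that load-on-demand pattern in plain PyTorch; the class and method names are hypothetical and this is not the project's actual API.

```python
# Load-on-demand expert offloading, sketched in plain PyTorch.
# Class and method names are illustrative; this is not mixtral-offloading's API.
import torch
import torch.nn as nn

class OffloadedExperts(nn.Module):
    def __init__(self, experts, device=None):
        super().__init__()
        self.experts = nn.ModuleList(experts).to("cpu")  # master copies stay in CPU RAM
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")

    def run_expert(self, idx, x):
        expert = self.experts[idx].to(self.device)       # copy the selected expert to the GPU
        out = expert(x.to(self.device))
        self.experts[idx].to("cpu")                      # evict it again to free VRAM
        return out

experts = [nn.Linear(64, 64) for _ in range(8)]
moe = OffloadedExperts(experts)
print(moe.run_expert(3, torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```

Shuttling weights back and forth like this is slow on its own, which is why the project combines offloading with quantization (as the comment notes) and with caching of recently used experts on the GPU to cut both memory use and transfer time.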
What are some alternatives?
When comparing makeMoE and mixtral-offloading you can also consider the following projects:
mergekit - Tools for merging pretrained large language models.
lightning-mlflow-hf - Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow
spring-ai - An Application Framework for AI Engineering
dbrx - Code examples and resources for DBRX, a large language model developed by Databricks
FeatUp - Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" (ICLR 2024)
examples - This repository contains examples of use cases that utilize the Decodable streaming solution