llama-mistral vs. megablocks-public

| | llama-mistral | megablocks-public |
|---|---|---|
| Mentions | 5 | 5 |
| Stars | 374 | 857 |
| Growth | - | 0.1% |
| Activity | 8.4 | 9.0 |
| Latest commit | 6 months ago | 6 months ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama-mistral
- Inference code for Mistral and Mixtral hacked up
French AI startup Mistral secures €2B valuation
No. Without the inference code, the best we can do is guess at its implementation, so any benchmark figures we get could be quite wrong. It does seem better than Llama2-70B in my tests, which rely on the work done by Dmytro Dzhulgakov[0] and DiscoResearch[1].
But the point of releasing via BitTorrent is the effervescence it sparks in hobbyist research, including early attempts at MoE quantization, which are already ongoing[2]. These efforts are benefiting from the community.
[0]: https://github.com/dzhulgakov/llama-mistral
[1]: https://huggingface.co/DiscoResearch/mixtral-7b-8expert
[2]: https://github.com/TimDettmers/bitsandbytes/tree/sparse_moe
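Since Mistral shipped weights without inference code, community implementations like [0] had to reconstruct the mixture-of-experts routing. A minimal sketch of the top-2 gating commonly used in Mixtral-style MoE layers (names, shapes, and the per-token formulation here are illustrative, not taken from the release):

```python
import numpy as np

def top2_moe(x, gate_w, experts):
    """Route one token through the 2 highest-scoring experts.

    x: (d,) token hidden state
    gate_w: (n_experts, d) router weight matrix
    experts: list of n_experts callables, each (d,) -> (d,)
    """
    logits = gate_w @ x                      # router score per expert
    top2 = np.argsort(logits)[-2:]           # indices of the two best experts
    weights = np.exp(logits[top2])
    weights /= weights.sum()                 # softmax over the selected pair
    # Only the chosen experts run; their outputs are mixed by gate weight.
    return sum(w * experts[i](x) for w, i in zip(weights, top2))
```

The point of the top-k trick is that compute scales with k (here 2) rather than with the total number of experts, which is why an 8x7B model can run at roughly the cost of a ~13B dense model per token.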
- Code to run Mistral - mixtral-8x7b-32kseqlen
New Mistral models just dropped (magnet links)
Someone made this. https://github.com/dzhulgakov/llama-mistral
Mistral 8x7B 32k model [magnet]
If anyone can help with running this, it would be appreciated. Resources so far:
- https://github.com/dzhulgakov/llama-mistral
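Much of the difficulty in running an 8x7B model locally is memory, which is why the MoE quantization work linked earlier (the bitsandbytes branch) matters. As background, a minimal sketch of symmetric absmax int8 quantization, the simplest scheme of this family (illustrative only, not the actual bitsandbytes implementation):

```python
import numpy as np

def quantize_absmax_int8(w):
    """Symmetric per-tensor int8 quantization.

    Scales by the largest magnitude so the full [-127, 127] range is used.
    Returns the int8 tensor and the float scale needed to reconstruct.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original float weights."""
    return q.astype(np.float32) * scale
```

Storing expert weights as int8 plus one scale cuts memory roughly 4x versus float32 (2x versus float16), at the cost of a bounded rounding error per weight; real schemes refine this with per-row or per-block scales.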
megablocks-public
Mistral releases 8x7 MoE model via torrent
Stark contrast with Google's "all demo no model" approach from earlier this week! Seems to be trained off Stanford's Megablocks: https://github.com/mistralai/megablocks-public
- Megablocks-Public
New Mistral models just dropped (magnet links)
Repo: https://github.com/mistralai/megablocks-public
Mistral 8x7B 32k model [magnet]
https://github.com/mistralai/megablocks-public
Oddly absent: an over-rehearsed professional release video talking about a revolution in AI.
If people are wondering why there is so much AI activity right around now, it's because the biggest deep learning conference (NeurIPS) is next week.
https://twitter.com/karpathy/status/1733181701361451130
What are some alternatives?
llama.cpp - LLM inference in C/C++
bliss - 🧘 BLISS – a Benchmark for Language Induction from Small Sets
slint - Slint is a declarative GUI toolkit to build native user interfaces for Rust, C++, or JavaScript apps.