| | llama2.rs | petals |
|---|---|---|
| Mentions | 3 | 98 |
| Stars | 981 | 8,730 |
| Growth | - | 2.0% |
| Activity | 8.9 | 8.3 |
| Last commit | 6 months ago | 22 days ago |
| Language | Rust | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
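As a rough illustration of the two derived metrics described above, here is a minimal sketch. The site does not publish its formulas, so both functions are assumptions: `month_over_month_growth` uses the ordinary percentage-change definition, and `top_percent_from_activity` assumes a linear scale anchored only on the one stated data point (9.0 corresponds to the top 10%). The star counts in the usage lines are hypothetical, not taken from either project.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars over one month (assumed definition)."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def top_percent_from_activity(activity: float) -> float:
    """Map an activity score to a 'top N%' figure.

    Assumption: a linear 0-10 scale where 9.0 means top 10%.
    The real scoring (with recency-weighted commits) is not documented here.
    """
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity is expected to lie in [0, 10]")
    return (10.0 - activity) * 10.0


if __name__ == "__main__":
    # Hypothetical star counts, for illustration only.
    print(f"growth: {month_over_month_growth(1000, 1020):.1f}%")
    print(f"activity 9.0 -> top {top_percent_from_activity(9.0):.0f}%")
```

Under this assumed mapping, a higher activity score always corresponds to a smaller "top N%" band, which matches the stated example but may not match the site's actual ranking.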
llama2.rs
- Ask HN: Cheapest hardware to run Llama 2 70B
  This code runs Llama2 quantized and unquantized in a roughly minimal way: https://github.com/srush/llama2.rs (though extracting the quantized 70B weights takes a lot of RAM). I'm running the 13B quantized model on ~10-11GB of CPU memory.
- Candle: Torch Replacement in Rust
  Nowhere near as neat as candle or ggml, but just released a 4-bit rust llama2 implementation with simd. Runs pretty fast.
  https://github.com/srush/llama2.rs/
- Llama2.rs: One-file Rust implementation of Llama2
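The memory figures in the mentions above can be sanity-checked with simple arithmetic. The sketch below estimates only the space taken by the weights themselves, assuming 4-bit quantization (as the second mention describes) and nominal parameter counts of 13B and 70B. The function name `weight_memory_gb` is hypothetical and not part of llama2.rs.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Lower bound on memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores quantization scales, activations, and the KV cache,
    so real usage will be higher than this figure.
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Nominal parameter counts; actual checkpoints differ slightly.
    for name, n_params in [("13B", 13e9), ("70B", 70e9)]:
        q4 = weight_memory_gb(n_params, 4)
        fp16 = weight_memory_gb(n_params, 16)
        print(f"{name}: ~{q4:.1f} GB at 4-bit, ~{fp16:.1f} GB at 16-bit")
```

The 4-bit bound for 13B comes out below the ~10-11GB quoted in the mention, which is plausible since the estimate leaves out everything except the raw weights.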
petals
- Mistral Large
  So how long until we can do an open source Mistral Large?
  We could make a start on Petals or some other open source distributed training network cluster possibly?
  [0] https://petals.dev/
- Distributed Inference and Fine-Tuning of Large Language Models over the Internet
  Can check out their project at https://github.com/bigscience-workshop/petals
- Make no mistake—AI is owned by Big Tech
- Would you donate computation and storage to help build an open source LLM?
- Run 70B LLM Inference on a Single 4GB GPU with This New Technique
  There is already an implementation along the same line using the torrent architecture.
  https://petals.dev/
- Run LLMs in bittorrent style
  Check it out at Petals.dev. Chatbot
- Is distributed computing dying, or just fading into the background?
- Ask HN: Are there any projects currently exploring distributed AI training?
  https://github.com/bigscience-workshop/petals
- Mistral 7B, The complete Guide of the Best 7B model
  https://github.com/bigscience-workshop/petals
  Inference only: https://lite.koboldai.net/
- Run LLMs at home, BitTorrent‑style
What are some alternatives?
burn - Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
candle - Minimalist ML framework for Rust
llama - Inference code for Llama models
euclid - Geometry primitives (basic linear algebra) for Rust
alpaca-lora - Instruct-tune LLaMA on consumer hardware
exllama - A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
GLM-130B - GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
llama.cpp - LLM inference in C/C++
Auto-GPT - An experimental open-source attempt to make GPT-4 fully autonomous. [Moved to: https://github.com/Significant-Gravitas/Auto-GPT]
syntaxdot - Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.
Open-Assistant - OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.