open_llama VS llama.cpp

Compare open_llama vs llama.cpp and see what are their differences.

open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset (by openlm-research)

llama.cpp

LLM inference in C/C++ (by ggerganov)
Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
open_llama llama.cpp
52 769
7,193 56,891
1.3% -
5.3 10.0
10 months ago 1 day ago
C++
Apache License 2.0 MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

open_llama

Posts with mentions or reviews of open_llama. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-19.
  • How Open is Generative AI? Part 2
    8 projects | dev.to | 19 Dec 2023
    The RedPajama dataset was adapted by the OpenLLaMA project at UC Berkeley, creating an open-source LLaMA equivalent without Meta’s restrictions. The model's later version also included data from Falcon and StarCoder. This highlights the importance of open-source models and datasets, enabling free repurposing and innovation.
  • GPT-4 API general availability
    15 projects | news.ycombinator.com | 6 Jul 2023
    OpenLLaMA is though. https://github.com/openlm-research/open_llama

    All of these are surmountable problems.

    We can beat OpenAI.

    We can drain their moat.

  • Recommend me a computer for local a.i for 500 $
    2 projects | /r/ArtificialInteligence | 1 Jul 2023
    #1: 🌞 Open-source Reproduction of Meta AI’s LLaMA OpenLLaMA-13B released. (trained for 1T tokens) | 0 comments #2: πŸŽ‰ #1 on HuggingFace.co's Leaderboard Model Falcon 40B is now Free (Apache 2.0 License) | 0 comments #3: 😍 Have you seen this repo? "running LLMs on consumer-grade hardware. compatible models: llama.cpp, alpaca.cpp, gpt4all.cpp, rwkv.cpp, whisper.cpp, vicuna, koala, gpt4all-j, cerebras and many others!" | 0 comments
  • Who is openllama from?
    1 project | /r/LocalLLaMA | 30 Jun 2023
    Trained OpenLLaMA models are from the OpenLM Research team in collaboration with Stability AI: https://github.com/openlm-research/open_llama
  • Personal GPT: A tiny AI Chatbot that runs fully offline on your iPhone
    14 projects | /r/ChatGPT | 30 Jun 2023
    I can't use Llama or any model from the Llama family, due to license restrictions. Although now there's also the OpenLlama family of models, which have the same architecture but were trained on an open dataset (RedPajama, the same dataset the base model in my app was trained on). I'd love to pursue the direction of extended context lengths for on-device LLMs. Likely in a month or so, when I've implemented all the product feature that I currently have on my backlog.
  • XGen-7B, a new 7B foundational model trained on up to 8K length for 1.5T tokens
    3 projects | news.ycombinator.com | 28 Jun 2023
    https://github.com/openlm-research/open_llama#update-0615202...).

    XGen-7B is probably the superior 7B model, it's trained on more tokens and a longer default sequence length (although both presumably can adopt SuperHOT (Position Interpolation) to extend context), but larger models still probably perform better on an absolute basis.

  • MosaicML Agrees to Join Databricks to Power Generative AI for All
    3 projects | /r/LocalLLaMA | 26 Jun 2023
    Compare it to openllama. It github doesn't have a single script on how to do anything.
  • Databricks Strikes $1.3B Deal for Generative AI Startup MosaicML
    4 projects | news.ycombinator.com | 26 Jun 2023
    OpenLLaMA models up to 13B parameters have now been trained on 1T tokens:

    https://github.com/openlm-research/open_llama

  • Containerized AI before Apocalypse πŸ³πŸ€–
    4 projects | dev.to | 25 Jun 2023
    The deployed LLM binary, orca mini, has 3 billion parameters. Orca mini is based on the OpenLLaMA project.
  • AI β€” weekly megathread!
    2 projects | /r/artificial | 23 Jun 2023
    OpenLM Research released its 1T token version of OpenLLaMA 13B - the permissively licensed open source reproduction of Meta AI's LLaMA large language model. [Details].

llama.cpp

Posts with mentions or reviews of llama.cpp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-21.
  • Phi-3 Weights Released
    1 project | news.ycombinator.com | 23 Apr 2024
    well https://github.com/ggerganov/llama.cpp/issues/6849
  • Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
    3 projects | news.ycombinator.com | 21 Apr 2024
  • Llama.cpp Working on Support for Llama3
    1 project | news.ycombinator.com | 18 Apr 2024
  • Embeddings are a good starting point for the AI curious app developer
    7 projects | news.ycombinator.com | 17 Apr 2024
    Have just done this recently for local chat with pdf feature in https://recurse.chat. (It's a macOS app that has built-in llama.cpp server and local vector database)

    Running an embedding server locally is pretty straightforward:

    - Get llama.cpp release binary: https://github.com/ggerganov/llama.cpp/releases

  • Mixtral 8x22B
    4 projects | news.ycombinator.com | 17 Apr 2024
  • Llama.cpp: Improve CPU prompt eval speed
    1 project | news.ycombinator.com | 17 Apr 2024
  • Ollama 0.1.32: WizardLM 2, Mixtral 8x22B, macOS CPU/GPU model split
    9 projects | news.ycombinator.com | 17 Apr 2024
    Ah, thanks for this! I can't edit my parent comment that you replied to any longer unfortunately.

    As I said, I only compared the contributors graphs [0] and checked for overlaps. But those apparently only go back about year and only list at most 100 contributors ranked by number of commits.

    [0]: https://github.com/ollama/ollama/graphs/contributors and https://github.com/ggerganov/llama.cpp/graphs/contributors

  • KodiBot - Local Chatbot App for Desktop
    2 projects | dev.to | 11 Apr 2024
    KodiBot is a desktop app that enables users to run their own AI chat assistants locally and offline on Windows, Mac, and Linux operating systems. KodiBot is a standalone app and does not require an internet connection or additional dependencies to run local chat assistants. It supports both Llama.cpp compatible models and OpenAI API.
  • Mixture-of-Depths: Dynamically allocating compute in transformers
    3 projects | news.ycombinator.com | 8 Apr 2024
    There are already some implementations out there which attempt to accomplish this!

    Here's an example: https://github.com/silphendio/sliced_llama

    A gist pertaining to said example: https://gist.github.com/silphendio/535cd9c1821aa1290aa10d587...

    Here's a discussion about integrating this capability with ExLlama: https://github.com/turboderp/exllamav2/pull/275

    And same as above but for llama.cpp: https://github.com/ggerganov/llama.cpp/issues/4718#issuecomm...

  • The lifecycle of a code AI completion
    6 projects | news.ycombinator.com | 7 Apr 2024
    For those who might not be aware of this, there is also an open source project on GitHub called "Twinny" which is an offline Visual Studio Code plugin equivalent to Copilot: https://github.com/rjmacarthy/twinny

    It can be used with a number of local model services. Currently for my setup on a NVIDIA 4090, I'm running both the base and instruct model for deepseek-coder 6.7b using 5_K_M Quantization GGUF files (for performance) through llama.cpp "server" where the base model is for completions and the instruct model for chat interactions.

    llama.cpp: https://github.com/ggerganov/llama.cpp/

    deepseek-coder 6.7b base GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GGU...

    deepseek-coder 6.7b instruct GGUF files: https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct...

What are some alternatives?

When comparing open_llama and llama.cpp you can also consider the following projects:

FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.

RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

gpt4all - gpt4all: run open-source LLMs anywhere

text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

gorilla - Gorilla: An API store for LLMs

GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ

ggml - Tensor library for machine learning

gpt-json - Structured and typehinted GPT responses in Python

alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM