llama VS llama-dl

Compare llama vs llama-dl and see what their differences are.

llama

Inference code for LLaMA models (by gmorenz)

llama-dl

High-speed download of LLaMA, Facebook's 65B parameter GPT model [UnavailableForLegalReasons - Repository access blocked] (by shawwn)
              llama               llama-dl
Mentions      3                   17
Stars         35                  3,386
Growth        -                   -
Activity      1.6                 8.8
Last commit   about 1 year ago    about 1 year ago
Language: Shell
License: GNU General Public License v3.0 only (both projects)
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

llama

Posts with mentions or reviews of llama. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-13.
  • Alpaca- An Instruct Tuned Llama 7B. Responses on par with txt-DaVinci-3. Demo up
    9 projects | news.ycombinator.com | 13 Mar 2023
    > All the magic of "7B LLaMA running on a potato" seems to involve lowering precision down to f16 and then further quantizing to int4.

    LLaMA weights are f16s to start out with, no lowering necessary to get there.

    You can stream weights from RAM to the GPU pretty efficiently. If you have >= 32GB RAM and >= 2GB VRAM my code here should work for you: https://github.com/gmorenz/llama/tree/gpu_offload

    There's probably a cleaner version of it somewhere else. Really you should only need >= 16GB RAM, but the (Meta-provided) code to load the initial weights makes two copies of the weights in RAM simultaneously, which is completely unnecessary.
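The RAM figures in the comment above can be checked with a rough calculation. This is a minimal sketch, assuming the 7B model (the comment does not name the model size) and a nominal 7 billion parameters; the function name `weights_gib` is hypothetical and used only for illustration.

```python
# Back-of-the-envelope RAM estimate for holding f16 weights in memory.
BYTES_PER_F16 = 2  # an f16 value occupies 2 bytes


def weights_gib(n_params: float, copies: int = 1) -> float:
    """Return the GiB needed to hold `copies` full copies of f16 weights."""
    return n_params * BYTES_PER_F16 * copies / 2**30


# Assumption: nominal parameter count for the 7B model.
N_PARAMS_7B = 7e9

one_copy = weights_gib(N_PARAMS_7B)
two_copies = weights_gib(N_PARAMS_7B, copies=2)

print(f"one copy:   {one_copy:.1f} GiB")
print(f"two copies: {two_copies:.1f} GiB")
```

Under these assumptions a single copy fits within 16GB, while two simultaneous copies do not, which is consistent with the 16GB versus 32GB figures quoted above.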

  • LLaMA-7B in Pure C++ with full Apple Silicon support
    19 projects | news.ycombinator.com | 10 Mar 2023
    My code for this is very much not high quality, but I have a CPU + GPU + SSD combination: https://github.com/gmorenz/llama/tree/ssd

    Usage instructions in the commit message: https://github.com/facebookresearch/llama/commit/5be06e56056...

    At least with my hardware this runs at "[size of model]/[speed of SSD reads]" seconds per token, which (up to some possible further memory reduction so you can run larger batches at once on the same GPU) is as good as it gets when you need to read the whole model from disk each token.

    At 125GB and a 2GB/s read (largest model, what I get from my SSD) that's about 60 seconds per token (1 day per 1440 words), which isn't exactly practical. That is really the issue here: if you need to stream the model from an SSD because you don't have enough RAM, it is just a fundamentally slow process.

    You could probably optimize quite a bit for batch throughput if you're ok with the latency though.
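The estimate in the comment above can be reproduced in a few lines. This is a sketch only, assuming a 125GB model and a read speed of 2GB/s, which is the only reading of the quoted figures consistent with roughly 60 seconds per token; the function name is hypothetical.

```python
def seconds_per_token(model_bytes: float, read_bytes_per_s: float) -> float:
    """Time to read the whole model once, i.e. the floor on per-token latency."""
    return model_bytes / read_bytes_per_s


GB = 1e9
MODEL_SIZE = 125 * GB  # largest model, per the comment
SSD_READ = 2 * GB      # assumed sustained read speed in bytes per second

spt = seconds_per_token(MODEL_SIZE, SSD_READ)
tokens_per_day = 86_400 / spt  # 86,400 seconds in a day

print(f"{spt:.1f} s/token, {tokens_per_day:.0f} tokens/day")
```

The exact result is 62.5 seconds per token; the comment rounds this to 60 seconds, which gives the quoted 1440 tokens per day.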

  • Llama-CPU: Fork of Facebooks LLaMa model to run on CPU
    8 projects | news.ycombinator.com | 7 Mar 2023
    I don't know about this fork specifically, but in general yes absolutely.

    Even without enough RAM, you can stream model weights from disk and run at [size of model/disk read speed] seconds per token.

    I'm doing that on a small GPU with this code, but it should be easy to get this working with the CPU as compute instead (and at least with my disk/CPU, I'm not even sure it would run any slower; I think disk read would probably still be the bottleneck).

    https://github.com/gmorenz/llama/tree/ssd
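The idea of streaming weights from disk for every token can be illustrated with a toy example. This is not the code from the linked branch; it is a standard-library sketch in which the file layout, `LAYER_SIZE`, `N_LAYERS`, and all function names are hypothetical, and the "forward pass" is just an elementwise multiply.

```python
import array
import os
import tempfile

LAYER_SIZE = 4  # floats per toy layer (hypothetical)
N_LAYERS = 3    # number of toy layers (hypothetical)


def write_toy_model(path):
    """Write N_LAYERS layers of float32 weights back to back in one file."""
    with open(path, "wb") as f:
        for i in range(N_LAYERS):
            array.array("f", [float(i + 1)] * LAYER_SIZE).tofile(f)


def stream_layers(path):
    """Yield one layer at a time so only a single layer is held in memory."""
    with open(path, "rb") as f:
        for _ in range(N_LAYERS):
            layer = array.array("f")
            layer.fromfile(f, LAYER_SIZE)
            yield layer


def generate_token(path, x):
    """Toy forward pass: multiply the activation by each layer in turn."""
    bytes_read = 0
    for layer in stream_layers(path):
        x = [xi * wi for xi, wi in zip(x, layer)]
        bytes_read += len(layer) * layer.itemsize
    return x, bytes_read


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "toy_model.bin")
    write_toy_model(model_path)
    model_bytes = os.path.getsize(model_path)
    out, bytes_read = generate_token(model_path, [1.0] * LAYER_SIZE)

print(out, bytes_read, model_bytes)
```

Each call to `generate_token` reads the entire file once, so the bytes read per token equal the model size. That is why per-token latency is bounded below by model size divided by disk read speed.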

llama-dl

Posts with mentions or reviews of llama-dl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-22.
  • Gitlab confirms it's removed Suyu, a fork of Nintendo Switch emulator Yuzu
    3 projects | news.ycombinator.com | 22 Mar 2024
    There seems to be some confusion here. Let me step in as someone who has gone through this.

    My repo https://github.com/shawwn/llama-dl was taken down last March by Facebook. They asserted copyright over LLaMA, which is obviously bogus since it was trained on data they do not own the copyright to. I was bummed about this, but after I mentioned on HN that I was willing to fight Meta, an anonymous person named L contacted me and sent $20k of Monero to cover legal fees. I was also contacted by an amazing lawyer who wanted to represent me in this. I was absurdly fortunate on both counts.

    He drafted a counternotice, we sent it, and then my repo was restored within a week or so.

    GitHub had no choice in the matter. Legally this is a required process. Ditto for GitLab. Both are US companies.

    When YouTube-dl was taken down some time ago by a DMCA, Nat went to bat and got it restored, and GitHub made some sort of pledge to cover legal fees associated with bogus takedown requests.

    Here’s the shitty part for this particular situation. A case can be made that the emulator is for the purpose of circumventing copyright protection mechanisms. This, sadly, is a solid legal basis for issuing a lawful takedown, as much as we all absolutely despise that idea. It’s pretty clear cut; Nintendo doesn’t want Switch games to be run on non-Nintendo platforms, and the emulator seeks to enable Switch games to be run on any platform. Therefore, the intent of the emulator is to circumvent Nintendo’s protection mechanisms.

    So where does this leave us? Well, the team can file a counternotice. GitLab will restore the repo. But that opens up the team to a lawsuit by Nintendo. And as much as I want to stand up to bullies, there’s a difference between standing up to a guy shoving a kid in a locker vs standing up to a Silverback gorilla charging at you. Nintendo’s legal history implies the latter.

    Welcome to Nintendo pain. The Smash community has been dealing with Nintendo’s BS for decades now. They shut down tournaments that use emulators for Smash Melee. And no one can do anything, because it’s their legal right to do so.

  • [Chat Gpt] Meta's LLaMA LLM has leaked – Run uncensored AI on your home PC!
    2 projects | /r/aufdeutsch | 24 Apr 2023
  • Run LLaMA and Alpaca on your computer
    3 projects | news.ycombinator.com | 5 Apr 2023
    Your philosophical argument is interesting, but what the OP was saying was that one of the linked repos is inaccessible due to DMCA: https://github.com/shawwn/llama-dl

    So while what you say may be true, the DMCA seems to have worth for these orgs because they can get code removed by the host, who is uninterested in litigating, and the repo owner is likely even less capable of litigating the DMCA.

    Unfortunately as a tool of fear and legal gridlock DMCA has shown itself to be very useful to those with ill intent.

  • Meta DMCAs llama-dl Repository
    1 project | news.ycombinator.com | 23 Mar 2023
  • Load LLaMA Models Instantly
    5 projects | news.ycombinator.com | 17 Mar 2023
  • Is there some sort of open-source equivalent of this?
    1 project | /r/ChatGPT | 13 Mar 2023
    Here are some useful links: https://github.com/shawwn/llama-dl and https://rentry.org/llama-tard-v2#tips-and-tricks
  • FLiP Stack Weekly for 13 March 2023
    25 projects | dev.to | 13 Mar 2023
  • Using LLaMA with M1 Mac and Python 3.11
    6 projects | news.ycombinator.com | 12 Mar 2023
    Sure. You can get models with magnet link from here https://github.com/shawwn/llama-dl/

    To get running, just follow these steps https://github.com/ggerganov/llama.cpp/#usage

  • New JailBreak prompt + How to stop flagging/blocking!
    1 project | /r/u_Rumikosan | 12 Mar 2023
    https://rentry.org/llama-tard-v2#tips-and-tricks https://github.com/shawwn/llama-dl
  • LLaMA, Meta's ChatGPT, leaks online and can already be downloaded
    2 projects | /r/brasil | 11 Mar 2023

What are some alternatives?

When comparing llama and llama-dl you can also consider the following projects:

llama.cpp - LLM inference in C/C++

ChatGLM-6B - ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型

llama - Inference code for Llama models

llama-mps - Experimental fork of Facebooks LLaMa model which runs it with GPU acceleration on Apple Silicon M1/M2

text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

stanford_alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data.

transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

tinygrad - You like pytorch? You like micrograd? You love tinygrad! ❤️ [Moved to: https://github.com/tinygrad/tinygrad]

dalai - The simplest way to run LLaMA on your local machine