Llama 2 – Meta AI

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llama

    Inference code for Llama models

  • https://github.com/facebookresearch/llama

    Once you get a download link by email, make sure to copy it without spaces; one option is to open it in a new tab and download from there. If you are using fish or another fancy shell, switch to bash or sh before running download.sh from the repo.

    I am not sure exactly how much space is needed, but it is likely north of 500 GB given that there are two 70B models (a prompt gives you the option to download just the smaller ones).
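
    Since the emailed link tends to pick up stray whitespace when copied, one way to sanitize it before handing it to download.sh (the URL below is a made-up placeholder, not a real presigned link):

    ```
    # Hypothetical pasted link with an accidental space and a trailing blank
    url_raw='https://example.com/down load?token=abc '
    # Strip all whitespace so the presigned URL stays intact
    url=$(printf '%s' "$url_raw" | tr -d '[:space:]')
    printf '%s\n' "$url"   # prints https://example.com/download?token=abc
    ```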

  • llama.cpp

    LLM inference in C/C++

  • You can run inference for LLaMA 7B with 8 GB of RAM and a CPU: https://github.com/ggerganov/llama.cpp

    The major limitation for email classification, though, would be the 2048-token context limit.
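
    As a rough illustration of that 2048-token ceiling, here is a crude shell pre-check that assumes ~4 tokens per 3 words (a common rule of thumb, not llama.cpp's actual tokenizer):

    ```
    # Succeeds if the text likely fits a 2048-token context.
    # The 4/3 words-to-tokens ratio is an assumption; real counts
    # require running the model's tokenizer.
    fits_context() {
      words=$(printf '%s' "$1" | wc -w)
      tokens=$(( words * 4 / 3 ))
      [ "$tokens" -le 2048 ]
    }
    ```

    For example, `fits_context "$(cat email.txt)" || echo "too long"` would flag emails that need truncation before classification.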

  • llama2-chatbot

    LLaMA v2 Chatbot

  • cog-llama-template

    LLaMA Cog template

  • Or even deploy your own LLaMA v2 fine-tune with Cog (https://github.com/a16z-infra/cog-llama-template)

    Please let us know what you use this for or if you have feedback! And thanks to all contributors to this model: Meta, Replicate, and the open-source community!

  • OpenPipe

    Turn expensive prompts into cheap fine-tuned models

  • It depends -- do you mean as a general end-user of a chat platform or do you mean to include a model as part of an app or service?

    As an end user, what I've found works in practice is to use one of the models until it gives me an answer I'm unhappy with. At that point I'll try another model and see whether the response is better. Do this for long enough and you'll get a sense of the various models' strengths and weaknesses (although the tl;dr is that, if you're willing to pay, GPT-4 is better than anything else across most use cases right now).

    For evaluating models for app integrations, I can plug an open source combined playground + eval harness I'm currently developing: https://github.com/openpipe/openpipe

    We're working on integrating Llama 2 so users can test it against other models for their own workloads head to head. (We're also working on a hosted SaaS version so people don't have to download/install Postgres and Node!)

  • cog

    Containers for machine learning

  • https://github.com/replicate/cog

    Our thinking was just that a bunch of folks will want to fine-tune right away and then deploy the fine-tunes, so we're trying to make that easy... or even just deploy the models as-is on their own infra without dealing with CUDA insanity!

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • If you want to try running Llama 2 locally, you can use https://github.com/jmorganca/ollama

    To run Llama 2 with it:

    ```
    ollama run llama2
    ```

  • llama.cpp

  • I think using this project (https://github.com/ggerganov/llama.cpp) on a CPU machine with AVX instructions would give you better bang for your buck than a GPU, though it depends on whether your use case can tolerate the latency.
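
    One way to see which AVX variants a machine exposes (Linux shown; the helper is a hedged sketch that just filters a cpuinfo-style flags line):

    ```
    # Print the AVX-family flags present in a space-separated flags string
    avx_flags() {
      printf '%s\n' "$1" | tr ' ' '\n' | grep '^avx' | sort -u
    }
    # On Linux you would feed it the real line, e.g.:
    #   avx_flags "$(grep -m1 '^flags' /proc/cpuinfo)"
    ```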

  • llama

    Inference code for LLaMA models on CPU and Mac M1/M2 GPU (by krychu)

  • Version that runs on the CPU: https://github.com/krychu/llama

    I get about one word per ~1.5 seconds on a MacBook Pro M1.
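
    For scale, one word per ~1.5 s works out to a bit under one token per second, using a rough ~4 tokens per 3 words conversion (an assumption; the real rate depends on the tokenizer and quantization):

    ```
    # 1 word / 1.5 s in words/s, then converted to approximate tokens/s
    words_per_sec=$(awk 'BEGIN { printf "%.2f", 1 / 1.5 }')
    tokens_per_sec=$(awk 'BEGIN { printf "%.2f", (1 / 1.5) * 4 / 3 }')
    echo "$words_per_sec words/s, ~$tokens_per_sec tokens/s"
    # prints: 0.67 words/s, ~0.89 tokens/s
    ```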

  • marsha

    Marsha is a functional, higher-level, English-based programming language that gets compiled into tested Python software by an LLM

  • So this comment inspired me to write a Roman-numeral-to-integer function in our LLM-based programming language, Marsha: https://github.com/alantech/marsha/blob/main/examples/genera...

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
