LeCun: Qualcomm working with Meta to run Llama-2 on mobile devices

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • armnn

    Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn

  • Like ARM? https://github.com/ARM-software/armnn

    Optimization for this workload has arguably been in progress for decades. AVX instructions can be found in laptops that are now a decade old, and most big inferencing projects are built around SIMD or GPU shaders. Unless your computer ships with onboard Nvidia hardware, there's usually not much difference in inferencing performance.

  • serge

    A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

  • You might be pleased to hear that nothing really stops you from doing this today. If you ran Serge[0] on a Mac with Tailscale, you could hack together a decently accelerated Llama chatbot.

    [0] https://github.com/serge-chat/serge

  • llama

    Inference code for Llama models

  • > Let's say retrieve instructions on how to efficiently overthrow a government?

    Your license to use Llama can be revoked if Meta investigates and deems your action to be against the code of conduct[1].

    1. https://github.com/facebookresearch/llama/blob/main/CODE_OF_...

  • llama.cpp

    LLM inference in C/C++

  • According to a few papers and https://github.com/ggerganov/llama.cpp/pull/1684, a 7B-parameter model at about 3GB has the same performance as the baseline 7B model (about 14GB).
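
The sizes quoted in the last comment follow from simple arithmetic. Here is a minimal sketch, assuming a nominal 7 billion parameters, 16-bit weights for the baseline, and 1 GB = 10^9 bytes. None of these assumptions are stated in the comment, and the function names are illustrative only.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring file metadata."""
    return n_params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(n_params: float, size_gb: float) -> float:
    """Average number of bits per weight implied by a given file size."""
    return size_gb * 1e9 * 8 / n_params


N_PARAMS = 7e9  # assumption: nominal "7B" parameter count

# Baseline: 16-bit weights (assumed), which matches the quoted 14GB figure.
baseline_gb = model_size_gb(N_PARAMS, 16)

# The quoted 3GB file: how many bits per weight does that leave on average?
small_bits = implied_bits_per_weight(N_PARAMS, 3.0)

print(f"baseline size: {baseline_gb:.1f} GB")
print(f"3 GB file: {small_bits:.2f} bits per weight")
```

Under these assumptions the baseline comes to 14 GB, and a 3 GB file works out to roughly 3.4 bits per weight on average, which suggests quantization to somewhere between 3 and 4 bits per weight.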

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
