Like ARM? https://github.com/ARM-software/armnn
Optimization for this workload has arguably been in progress for decades. Modern AVX instructions can be found in laptops that are a decade old now, and most big inferencing projects are built around SIMD or GPU shaders. Unless your computer ships with onboard Nvidia hardware, there's usually not much difference in inferencing performance.
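To make the SIMD point concrete, here's a minimal sketch (not from any project mentioned above) of the core operation of transformer inference, a matrix-vector product. A naive Python loop is slow; NumPy's `@` dispatches to BLAS kernels that use the same AVX/NEON-style vector instructions that projects like llama.cpp implement by hand with intrinsics:

```python
import numpy as np

def matvec_naive(W, x):
    """Scalar, one-multiply-at-a-time mat-vec: what SIMD kernels avoid."""
    out = np.zeros(W.shape[0], dtype=np.float32)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            out[i] += W[i, j] * x[j]
    return out

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64)).astype(np.float32)
x = rng.standard_normal(64).astype(np.float32)

# Same math, but the vectorized path processes many elements per instruction.
naive = matvec_naive(W, x)
simd = W @ x
```

The two paths compute the same result; the difference is throughput, which is why CPU inference on any AVX-capable machine is workable at all.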
You might be pleased to hear that nothing really stops you from doing this today. If you ran Serge[0] on a Mac with Tailscale, you could hack together a decently-accelerated Llama chatbot.
[0] https://github.com/serge-chat/serge
> Let's say retrieve instructions on how to efficiently overthrow a government?
Your license to use Llama can be revoked if Meta investigates and deems your action to be against the code of conduct[1].
1. https://github.com/facebookresearch/llama/blob/main/CODE_OF_...
According to a few papers and https://github.com/ggerganov/llama.cpp/pull/1684, a quantized 7B model at roughly 3 GB performs about the same as the baseline 7B model (roughly 14 GB).