llama.cpp now officially supports GPU acceleration.

llama.cpp

766 55,117 9.9 C++

LLM inference in C/C++
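As a rough illustration of what GPU acceleration looks like in practice, the sketch below builds a command line that asks llama.cpp to offload some layers to the GPU. The `-m`, `-p` and `-ngl` (`--n-gpu-layers`) options are llama.cpp CLI flags; the helper name `build_llama_cpp_command`, the `./main` binary path, the model path and the layer count are assumptions made for this example, and GPU offload only takes effect if llama.cpp was built with a GPU backend.

```python
import shlex


def build_llama_cpp_command(model_path, prompt, n_gpu_layers=0, binary="./main"):
    """Build an argv list for a llama.cpp CLI run (hypothetical helper).

    binary and model_path are placeholders; adjust them to your own build.
    """
    if n_gpu_layers < 0:
        raise ValueError("n_gpu_layers must be non-negative")
    argv = [binary, "-m", model_path, "-p", prompt]
    if n_gpu_layers > 0:
        # -ngl / --n-gpu-layers: how many model layers to offload to the GPU.
        argv += ["-ngl", str(n_gpu_layers)]
    return argv


# Example values only: the model path and layer count are assumptions.
cmd = build_llama_cpp_command("models/7B/model.bin", "Hello", n_gpu_layers=32)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run`. How many layers fit depends on available VRAM, so the layer count usually needs tuning per machine.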

I don't know; after getting the context from GPT-4, I was able to understand the source much more easily. Is ChatGPT's understanding wrong? It seems to summarize the same points that the GitHub repository covers.

tinygrad

58 17,800 9.7 Python

Discontinued. You like pytorch? You like micrograd? You love tinygrad! ❤️ [Moved to: https://github.com/tinygrad/tinygrad] (by geohot)

There are currently at least 3 ways to run LLaMA on M1 with GPU acceleration:

- mlc-llm (pre-built, only 1 model has been ported)
- tinygrad (very memory efficient, not that easy to integrate into other projects)
- llama-mps (original llama codebase + llama adapter support)

llama-mps

4 83 3.8 Python

Experimental fork of Facebook's LLaMA model that runs with GPU acceleration on Apple Silicon M1/M2.

text-generation-webui

876 35,583 9.9 Python

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

Just got merged into ooba - https://github.com/oobabooga/text-generation-webui/commit/071f0776ad6e7d8dab08e0d98d089c808807ab45

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA. Post date: 13 May 2023.
