C++ llama2 Projects
-
cortex
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan (by janhq)
-
distributed-llama
Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
Jan incorporates a lightweight, built-in inference server called Nitro. Nitro supports both the llama.cpp and NVIDIA TensorRT-LLM engines, so most open LLMs distributed in the GGUF format can run out of the box. Jan's Model Hub is designed for easy installation of pre-configured models, but it also lets you install virtually any model from Hugging Face, or even your own.
Index
| # | Project | Stars |
|---|---------|-------|
| 1 | cortex | 1,613 |
| 2 | distributed-llama | 756 |