Introducing Basaran: self-hosted open-source alternative to the OpenAI text completion API

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA.

  • llama.cpp

    LLM inference in C/C++

  • After https://github.com/ggerganov/llama.cpp/pull/1459 was merged, I found CLBlast to be around the same speed as cuBLAS on my 3080.

  • basaran

    Discontinued. Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
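
    A minimal sketch of how a client could call a running Basaran instance through its OpenAI-compatible completion endpoint; the host, port, and model name below are assumptions, not values from the post:

```python
# Hedged sketch: query a local Basaran server via its OpenAI-compatible
# /v1/completions endpoint. Adjust BASE_URL and the model name to match
# your own deployment.
import requests

BASE_URL = "http://127.0.0.1:8000"  # assumed local Basaran address

response = requests.post(
    f"{BASE_URL}/v1/completions",
    json={
        "model": "your-hf-model",  # placeholder for the Transformers model being served
        "prompt": "Once upon a time",
        "max_tokens": 64,
        "stream": False,           # set True to receive tokens as server-sent events
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```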

  • Dependencies

    A rewrite of the legacy "depends.exe" tool in C# for Windows developers to troubleshoot DLL load dependency issues.

  • I did that, basically. The problem is that there is a clblast.dll (on Windows) that llama.dll depends on, and llama-cpp-python always failed to resolve that dependency. I copied the DLL to the right folder, loading it manually via CDLL worked fine, and https://github.com/lucasg/Dependencies also confirmed the DLL was findable. When loading DLLs, Windows checks the same folder for dependency DLLs (and a few other places).
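
    A hedged sketch of the workaround described above, with illustrative paths rather than the poster's actual layout:

```python
# Make sure clblast.dll can be resolved before llama-cpp-python loads llama.dll.
# The directory below is a hypothetical example -- point it at wherever your
# DLLs actually live.
import ctypes
import os

DLL_DIR = r"C:\path\to\llama-cpp-dlls"  # hypothetical folder containing clblast.dll

# Option 1 (Python 3.8+ on Windows): add the folder to the DLL search path.
if hasattr(os, "add_dll_directory"):
    os.add_dll_directory(DLL_DIR)

# Option 2: pre-load the dependency manually so llama.dll finds it already mapped.
ctypes.CDLL(os.path.join(DLL_DIR, "clblast.dll"))

import llama_cpp  # should now resolve llama.dll and its CLBlast dependency
```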

  • GPTQ-for-LLaMa

    4 bits quantization of LLaMA using GPTQ

  • Thanks for the explanation. I think some repos, like text-generation-webui, used GPTQ-for-LLaMa (I don't know if it's this repo or another one); in any case, most repos I saw rely on external packages (like GPTQ-for-LLaMa).

  • gpt-llama.cpp

    A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.

  • sounds like you’re asking for exactly this? https://github.com/keldenl/gpt-llama.cpp
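
    A minimal sketch of what "drop-in replacement" means in practice: point a standard OpenAI client at the local server instead of api.openai.com. The port, API key handling, and model path here are assumptions:

```python
# Hedged sketch: reuse the official OpenAI Python client against a local
# gpt-llama.cpp server. base_url, api_key, and the model path are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local gpt-llama.cpp address
    api_key="not-needed-locally",         # placeholder; a local server typically ignores it
)

completion = client.chat.completions.create(
    model="models/7B/ggml-model.bin",  # illustrative path to a local llama.cpp model
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(completion.choices[0].message.content)
```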

  • AutoGPTQ

    An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

  • Instead of integrating GPTQ-for-LLaMa, use AutoGPTQ.
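
    A hedged sketch of 4-bit quantization with AutoGPTQ's documented high-level API; the model name, calibration text, and output directory are placeholders:

```python
# Quantize a causal LM to 4 bits with AutoGPTQ and save the result.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

model_id = "huggyllama/llama-7b"  # assumed base model
out_dir = "llama-7b-4bit-gptq"    # assumed output directory

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# A tiny calibration set just to illustrate the call; real runs use a few
# hundred representative samples.
examples = [tokenizer("AutoGPTQ quantizes large language models to 4 bits.")]

model.quantize(examples)
model.save_quantized(out_dir)
tokenizer.save_pretrained(out_dir)
```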

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • it does everything https://github.com/oobabooga/text-generation-webui
