lmql vs llama.cpp

Project          lmql                  llama.cpp
Mentions         30                    773
Stars            3,342                 57,463
Growth           2.9%                  -
Activity         9.5                   10.0
Latest Commit    6 days ago            about 11 hours ago
Language         Python                C++
License          Apache License 2.0    MIT License

The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

lmql

Posts with mentions or reviews of lmql. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-06.

Show HN: Fructose, LLM calls as strongly typed functions
10 projects | news.ycombinator.com | 6 Mar 2024
Prompting LLMs to constrain output
2 projects | /r/LocalLLaMA | 8 Dec 2023

have been experimenting with guidance and lmql. a bit too early to give any well formed opinions but really do like the idea of constraining llm output.
[D] Prompt Engineering Seems Like Guesswork - How To Evaluate LLM Application Properly?
1 project | /r/MachineLearning | 5 Dec 2023

the only time i've ever felt like it was anything other than guesswork was using LMQL. not coincidentally, LMQL works with LLMs as autocomplete engines rather than q&a ones.
Guidance for selecting a function-calling library?
3 projects | /r/LocalLLaMA | 15 Nov 2023

lmql
Show HN: Magentic – Use LLMs as simple Python functions
17 projects | news.ycombinator.com | 26 Sep 2023

This is also similar in spirit to LMQL
https://github.com/eth-sri/lmql
Show HN: LLMs can generate valid JSON 100% of the time
25 projects | news.ycombinator.com | 14 Aug 2023
LangChain Agent Simulation – Multi-Player Dungeons and Dragons
7 projects | news.ycombinator.com | 14 Aug 2023
The Problem with LangChain
14 projects | news.ycombinator.com | 14 Jul 2023

LLM calls are just function calls, so most functional composition is already afforded by any general-purpose language out there. If you need fancy stuff, use something like Python's functools.
Working on https://github.com/eth-sri/lmql (shameless plug, sorry), we have always found that compositional abstractions on top of LMQL are mostly there already, once you internalize prompts being functions.
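To make the "prompts are functions" point concrete, here is a minimal sketch using only the Python standard library. It does not use LMQL's API. `fake_llm`, `ask`, and `pipeline` are hypothetical names introduced for illustration, and `fake_llm` is a stand-in that echoes its prompt so the example runs without a model.

```python
from functools import partial, reduce
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only echoes the prompt."""
    return f"<completion of: {prompt}>"


def ask(template: str, text: str, llm: Callable[[str], str] = fake_llm) -> str:
    """Fill a prompt template and call the model: a prompt as a plain function."""
    return llm(template.format(text=text))


def pipeline(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Compose single-argument functions left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Each prompt becomes an ordinary one-argument function via functools.partial.
summarize = partial(ask, "Summarize: {text}")
translate = partial(ask, "Translate to French: {text}")

# Chaining prompts is then just function composition.
summarize_then_translate = pipeline(summarize, translate)

if __name__ == "__main__":
    print(summarize_then_translate("LLM calls are just function calls."))
```

Swapping `fake_llm` for a real client would only require passing a different `llm` callable; the composition code stays the same, which is the point the comment is making.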
Is there a UI that can limit LLM tokens to a preset list?
3 projects | /r/LocalLLaMA | 10 Jul 2023
Local LLMs: After Novelty Wanes
5 projects | /r/LocalLLaMA | 15 Jun 2023

LMQL is another.

llama.cpp

Posts with mentions or reviews of llama.cpp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-21.

Better and Faster Large Language Models via Multi-Token Prediction
1 project | news.ycombinator.com | 1 May 2024

For anyone interested in exploring this, llama.cpp has an example implementation here:
https://github.com/ggerganov/llama.cpp/tree/master/examples/...
Llama.cpp Bfloat16 Support
1 project | news.ycombinator.com | 30 Apr 2024
Fine-tune your first large language model (LLM) with LoRA, llama.cpp, and KitOps in 5 easy steps
1 project | dev.to | 30 Apr 2024

Getting started with LLMs can be intimidating. In this tutorial we will show you how to fine-tune a large language model using LoRA, facilitated by tools like llama.cpp and KitOps.
GGML Flash Attention support merged into llama.cpp
1 project | news.ycombinator.com | 30 Apr 2024
Phi-3 Weights Released
1 project | news.ycombinator.com | 23 Apr 2024

well https://github.com/ggerganov/llama.cpp/issues/6849
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
3 projects | news.ycombinator.com | 21 Apr 2024
Llama.cpp Working on Support for Llama3
1 project | news.ycombinator.com | 18 Apr 2024
Embeddings are a good starting point for the AI curious app developer
7 projects | news.ycombinator.com | 17 Apr 2024

Have just done this recently for local chat with pdf feature in https://recurse.chat. (It's a macOS app that has built-in llama.cpp server and local vector database)
Running an embedding server locally is pretty straightforward:
- Get llama.cpp release binary: https://github.com/ggerganov/llama.cpp/releases
Mixtral 8x22B
4 projects | news.ycombinator.com | 17 Apr 2024
Llama.cpp: Improve CPU prompt eval speed
1 project | news.ycombinator.com | 17 Apr 2024

What are some alternatives?

When comparing lmql and llama.cpp you can also consider the following projects:

guidance - A guidance language for controlling large language models.

ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.

guidance - A guidance language for controlling large language models. [Moved to: https://github.com/guidance-ai/guidance]

gpt4all - gpt4all: run open-source LLMs anywhere

simpleaichat - Python package for easily interfacing with chat apps, with robust features and minimal code complexity.

text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

NeMo-Guardrails - NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ

guardrails - Adding guardrails to large language models.

ggml - Tensor library for machine learning

basaran - Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM