Show HN: LLMs can generate valid JSON 100% of the time

InfluxDB - Power Real-Time Data Analytics at Scale

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

www.influxdata.com

featured

SaaSHub - Software Alternatives and Reviews

SaaSHub helps you find the best software and product alternatives

www.saashub.com

featured

llama.cpp

772 56,891 10.0 C++

LLM inference in C/C++

I may get heavily downvoted for my criticism here, but here we go again: yet another "innovation" that's fueled by the stupid money poured into AI startups in the past 2 years. Imagine thinking that adding regex on top of an LLM is worth $8.5M[1]. At least Llama's grammar-based sampling[2] is a bit more interesting but still essentially putting lipstick on a pig.
How is telling the language model "no, not like that, give me another token" at every step of token inference getting so many people ecstatic? The paper is basically undergrad-level excitement about something not even remotely interesting. Congratulations, you reinvented Markov chains (oh, sorry, "state machines") on top of LLMs.
I mean of course you can guarantee grammar and schema well-formedness as, duh, you have what essentially amounts to a post-processing step. Maybe I'm the idiot here, is anyone actually using any of these tools in production?
[1] https://www.benzinga.com/pressreleases/23/06/n32834246/norma...
[2] https://github.com/ggerganov/llama.cpp/pull/1773/files

outlines

31 5,649 9.7 Python

Structured Text Generation

We can extend our approach to grammar-based sampling, as explained in the paper linked above. Relevant PR: https://github.com/normal-computing/outlines/pull/178
Our method is much more efficient. llama.cpp loops over the entire vocabulary (~50k tokens) at each step to generate the mask. We generate an index at initialization, and building the masks at each step only requires a dictionary lookup (trade speed for memory). Sampling is just as fast as standard sampling.

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
guidance

23 17,357 9.8 Jupyter Notebook

A guidance language for controlling large language models.

OpenAI has this capability built in with functions[0], I believe! Building my own project[1] I have implemented functions in combination with guidance[2] and haven’t had a hiccup yet! I have a JSON parser function there, just in case, but it seems to be working reliably!
Here’s a bit more of a description of using the functions API for JSON returns: https://yonom.substack.com/p/native-json-output-from-gpt-4
[0] https://openai.com/blog/function-calling-and-other-api-updat...
[1] https://resgen.app
[2] https://github.com/guidance-ai/guidance

jsonformer

25 3,793 5.4 Jupyter Notebook

A Bulletproof Way to Generate Structured JSON from Language Models

I'm not sure how this is different than:
https://github.com/1rgs/jsonformer
or
https://github.com/newhouseb/clownfish
or
https://github.com/mkuchnik/relm
or
https://github.com/ggerganov/llama.cpp/pull/1773
or
https://github.com/Shopify/torch-grammar
Overall there are a ton of these logit based guidance systems, the reason they don't get tons of traction is the SOTA models are behind REST APIs that don't enable this fine-grained approach.
Those models perform so much better that people generally settle for just re-requesting until they get the correct format (and with GPT-4 that ends up being a fairly rare occurrence in my experience)

torch-grammar

3 63 5.2 Python

I'm not sure how this is different than:
https://github.com/1rgs/jsonformer
or
https://github.com/newhouseb/clownfish
or
https://github.com/mkuchnik/relm
or
https://github.com/ggerganov/llama.cpp/pull/1773
or
https://github.com/Shopify/torch-grammar
Overall there are a ton of these logit based guidance systems, the reason they don't get tons of traction is the SOTA models are behind REST APIs that don't enable this fine-grained approach.
Those models perform so much better that people generally settle for just re-requesting until they get the correct format (and with GPT-4 that ends up being a fairly rare occurrence in my experience)

json-schema-spec

29 3,219 7.9 JavaScript

The JSON Schema specification

Outlines is a Python library that focuses on text generation with large language models. Brandon and I are not LLM experts and started the project a few months ago because we wanted to understand better how the generation process works. Our original background is probabilistic, relational and symbolic programming.
Recently we came up with a fast way to generate text that matches a regex (https://blog.normalcomputing.ai/posts/2023-07-27-regex-guide...). The basic idea is simple: regular expressions have an equivalent Deterministic-Finite Automaton (DFA) representation. We can transform this DFA into a generative model: in each state we get a list of symbols which correspond to completions that partially match the regular expression. We mask the other symbols in the logits returned by a large language model, sample a new symbol and move to the next state. The subtelty is that language models work with tokens, not symbols, so we derive a new FSM whose alphabet is the model's vocabulary. We can do this in only one pass over the vocabulary.
Generating the token masks thus only requires a dictionary lookup at each state. Our method blows other libraries like Microsoft's guidance out of the water.
From there it was only a small leap to be able to generate text that follows a JSON schema (https://json-schema.org/), or is parseable into a Pydantic model (https://docs.pydantic.dev/latest/usage/models/). The method works with union types, optional types, nested schemas, arrays, everything. It is guaranteed that the output is parseable.
I think it's cool, and I've spent a lot of time watching even tiny models output valid JSON over the weekend. Hope you will to.
I look forward to feedback, bug reports, feature requests and discussions!

lmql

30 3,320 9.5 Python

A language for constraint-guided and efficient LLM programming.
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
clownfish

11 302 4.3 Python

Constrained Decoding for LLMs against JSON Schema

I'm not sure how this is different than:
https://github.com/1rgs/jsonformer
or
https://github.com/newhouseb/clownfish
or
https://github.com/mkuchnik/relm
or
https://github.com/ggerganov/llama.cpp/pull/1773
or
https://github.com/Shopify/torch-grammar
Overall there are a ton of these logit based guidance systems, the reason they don't get tons of traction is the SOTA models are behind REST APIs that don't enable this fine-grained approach.
Those models perform so much better that people generally settle for just re-requesting until they get the correct format (and with GPT-4 that ends up being a fairly rare occurrence in my experience)

relm

3 86 5.1 Python

ReLM is a Regular Expression engine for Language Models (by mkuchnik)

I'm not sure how this is different than:
https://github.com/1rgs/jsonformer
or
https://github.com/newhouseb/clownfish
or
https://github.com/mkuchnik/relm
or
https://github.com/ggerganov/llama.cpp/pull/1773
or
https://github.com/Shopify/torch-grammar
Overall there are a ton of these logit based guidance systems, the reason they don't get tons of traction is the SOTA models are behind REST APIs that don't enable this fine-grained approach.
Those models perform so much better that people generally settle for just re-requesting until they get the correct format (and with GPT-4 that ends up being a fairly rare occurrence in my experience)

Constrained-Text-Generation-Studio

25 195 4.1 Python

Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)
Constrained-Text-Genera

11 - -
TypeChat

12 7,840 9.1 TypeScript

TypeChat is a library that makes it easy to build natural language interfaces using types.

That re-prompting error on is what this new Microsoft library does, too: https://github.com/microsoft/TypeChat
Here's their prompt for that: https://github.com/microsoft/TypeChat/blob/c45460f4030938da3...
I think the approach using grammars (seen here, but also in things like https://github.com/ggerganov/llama.cpp/pull/1773 ) is a much more elegant solution.

llm-mlc

3 172 5.1 Python

LLM plugin for running models using MLC

I'm quite impressed with Llama 2 13B - the more time I spend with it the more I think it might be genuinely useful for more than just playing around with local LLMs.
I'm using the MLC version (since that works with a GPU on my M2 Mac) via my https://github.com/simonw/llm-mlc plugin.

flashtext

8 5,535 0.0 Python

Extract Keywords from sentence or Replace keywords in sentences.

I have some other comment on this thread where I point out why I don’t think it’s superficial. Would love to get your feedback on that if you feel like spending more time on this thread.
But it’s not obscure? FlashText was a somewhat popular paper at the time (2017) with a popular repo (https://github.com/vi3k6i5/flashtext). Their paper was pretty derivative of Aho-Corasick, which they cited. If you think they genuinely fucked up, leave an issue on their repo (I’m, maybe to your surprise lol, not the author).
Anyway, I’m not a fan of the whatabboutery here. I don’t think OG’s paper is up to snuff on its lit review - do you?

ad-llama

6 47 8.9 TypeScript

Structured inference with Llama 2 in your browser

Generating an FSM over the vocabulary is a really interesting approach to guided sampling! I'm hacking on a structured inference library (https://github.com/gsuuon/ad-llama) - I also tried to add a vocab preprocessing step to generate a valid tokens mask (just with regex or static strings initially) but discovered that doing so would cause unlikely / unnatural tokens to be masked rather than the token which represents the natural encoding given the existing sampled tokens.
Given the stateful nature of tokenizers, I decided that trying to preprocess the individual token ids was a losing battle. Even in the simple case of whitespace - tokenizer merges can really screw up generating a static mask, e.g. we expect a space next, but a token decodes to 'foo', but is actually a '_foo' and would've decoded with a whitespace if it were following a valid pair. When I go to construct the static vocab mask, it would then end up matching against 'foo' instead of ' foo'.
How did you work around this for the FSM approach? Does it somehow include information about merges / whitespace / tokenizer statefulness?

SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Prompt Engineering Guide

1 project | news.ycombinator.com | 30 Mar 2024
FLaNK Stack Weekly 12 February 2024

52 projects | dev.to | 12 Feb 2024
Resources to deepen LLMs understanding for software engineers

1 project | news.ycombinator.com | 16 Jan 2024
Step-by-Step Guide to building an Anomaly Detector using a LLM

1 project | dev.to | 10 Jan 2024
The Essential Guide to Prompt Engineering for Creators and Innovators

2 projects | dev.to | 2 Jan 2024

Show HN: LLMs can generate valid JSON 100% of the time

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
search-in-text json-schema chatgpt Deep Learning keyword-extraction
Post date: 14 Aug 2023

llama.cpp

outlines

InfluxDB

guidance

jsonformer

torch-grammar

json-schema-spec

lmql

SaaSHub

clownfish

relm

Constrained-Text-Generation-Studio

Constrained-Text-Genera

TypeChat

llm-mlc

flashtext

ad-llama

SaaSHub

Related posts

Prompt Engineering Guide

FLaNK Stack Weekly 12 February 2024

Resources to deepen LLMs understanding for software engineers

Step-by-Step Guide to building an Anomaly Detector using a LLM

The Essential Guide to Prompt Engineering for Creators and Innovators

Show HN: LLMs can generate valid JSON 100% of the time

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com search-in-text json-schema chatgpt Deep Learning keyword-extraction Post date: 14 Aug 2023

Related posts

Prompt Engineering Guide

FLaNK Stack Weekly 12 February 2024

Resources to deepen LLMs understanding for software engineers

Step-by-Step Guide to building an Anomaly Detector using a LLM

The Essential Guide to Prompt Engineering for Creators and Innovators

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
search-in-text json-schema chatgpt Deep Learning keyword-extraction
Post date: 14 Aug 2023