-
This is awesome and simplifies a lot of my workflows when using their APIs directly. I also want to give a shout-out to the Outlines team, https://github.com/outlines-dev/outlines — they've been doing structured outputs for the last 12 months, and their open-source library works across all open-weight and closed/API-based models. Most likely Outlines heavily inspired the OpenAI team, who may even have used some of its codebase.
-
llama.cpp's GBNF grammar is generic, and indeed works with any model.
I can't speak for other approaches, but llama.cpp's implementation has a nice property: it emits grammar-valid output token by token and never needs to backtrack. The trade-off is that for ambiguous grammars (where we can't always be sure where we are in the grammar until generation finishes), it keeps every valid parse-option stack in memory at the same time. That is good for the no-backtracking case, but it adds a sometimes significant cost: memory usage can be rather "explosive", especially with a particularly large or poorly formed grammar. Crafting an openly hostile grammar that crashes the inference server is not difficult.
People have done a lot of work to address the more egregious cases, but the memory load can still be significant.
One example of memory optimization: https://github.com/ggerganov/llama.cpp/pull/6616
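The "keep every valid parse stack" behaviour can be sketched in a few lines. Everything below is made up for illustration (the toy grammar, `expand`, `accept` are not llama.cpp's names) and works on characters rather than tokens, but it shows why the set of live stacks grows for an ambiguous grammar:

```python
# Toy sketch of stack-based grammar-constrained decoding (NOT
# llama.cpp's actual code).  A parse stack is a tuple of symbols still
# to be matched, with the top of the stack last.

# An ambiguous grammar: S -> "a" S S | "a" S | "a"
GRAMMAR = {"S": (("a", "S", "S"), ("a", "S"), ("a",))}

def expand(stacks):
    # Rewrite nonterminal stack tops until every stack ends in a
    # terminal (or is empty).  This fan-out is where ambiguity costs.
    out, todo = set(), list(stacks)
    while todo:
        st = todo.pop()
        if st and st[-1] in GRAMMAR:
            for rule in GRAMMAR[st[-1]]:
                todo.append(st[:-1] + tuple(reversed(rule)))
        else:
            out.add(st)
    return out

def accept(stacks, ch):
    # Advance every stack whose next terminal matches ch; drop the rest.
    return {st[:-1] for st in expand(stacks) if st and st[-1] == ch}

stacks = {("S",)}
for ch in "aaaa":
    stacks = accept(stacks, ch)
    print(len(stacks), "live parse stacks")  # the live set keeps growing
```

Here the growth is tame because the toy grammar is tiny and deduplication collapses equivalent stacks, but with larger, nastier grammars the set of distinguishable stacks can blow up much faster — which is the memory problem described above.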
I'm not entirely sure what other options there are for approaches to take, but I'd be curious to learn how other libraries (Outlines, jsonformer) handle syntax validation.
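For what it's worth, my understanding is that Outlines takes a different route: it compiles a regular expression (or JSON schema) down to a finite-state machine ahead of time, so each decoding step becomes a table lookup over precomputed transitions rather than a set of live parse stacks. A toy character-level sketch of that idea (hand-built DFA, made-up names — not the Outlines API):

```python
# Hand-built DFA for the toy pattern a+b: state 0 is the start,
# state 2 is accepting.  Compiling happens once, up front.
DFA = {0: {"a": 1}, 1: {"a": 1, "b": 2}, 2: {}}
ACCEPTING = {2}

def allowed(state):
    # The per-step mask: which characters keep generation inside the DFA.
    # Memory here is fixed by the automaton size, not by the input seen.
    return set(DFA[state])

state = 0
for ch in "aab":               # a valid string under a+b
    assert ch in allowed(state)
    state = DFA[state][ch]
print(state in ACCEPTING)      # True
```

The trade-off relative to the stack approach is expressiveness: a DFA covers regular languages, while a pushdown (stack) approach handles full context-free grammars.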
-
Thanks for the shout-out! We benchmarked our approach against other function-calling techniques and have beaten every other approach each time (sometimes by as much as 8%!), just by getting better at parsing the data and by representing schemas with fewer tokens, using type definitions instead of JSON Schema.
You can take a look at our BFCL results on that site or on GitHub: https://github.com/BoundaryML/baml
We'll be publishing our comparison against OpenAI structured outputs in the next two days, along with a deeper dive into our results, but long term we aim to include this kind of constrained generation as a capability in the BAML DSL as well!
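To make the "fewer tokens via type definitions" point concrete, here is my own toy comparison (not BAML's actual format) of the same two-field schema written as JSON Schema versus a compact class-style type definition:

```python
# The same Person schema in two encodings.  Shorter text generally means
# fewer tokens in the prompt, which is the savings being claimed above.
json_schema = (
    '{"type":"object","properties":{"name":{"type":"string"},'
    '"age":{"type":"integer"}},"required":["name","age"]}'
)
type_def = "class Person { name: string; age: int }"
print(len(json_schema), len(type_def))  # the class-style form is far shorter
```

Exact token counts depend on the tokenizer, but the character-count gap alone shows why the type-definition form is cheaper to put in a prompt.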
-
https://github.com/TrafficGuard/nous/blob/main/src/swe/codeE...
This gets the diff and asks questions like:
- Are there any redundant changes in the diff?
-
I was impressed by Microsoft's AICI, where the idea is that a WASM program can choose the next tokens. Relatedly, their Guidance[1] framework can use CFGs and programs during local inference, and can even speed it up with context-aware token filling. I hope this implies API-based LLMs may be moving in a similar direction.
[1] https://github.com/guidance-ai/guidance
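The "context-aware token filling" speed-up mentioned above can be sketched like this: whenever the constraint forces the next chunk of output (e.g. the fixed parts of a JSON template), append it without paying for a model forward pass, and only sample where the model actually has a choice. Toy code with assumed names — not Guidance's real API:

```python
def decode(template_parts, sample_fn):
    """template_parts alternates literal strings (forced) and None
    (free-form holes the model must fill).  Forced parts are appended
    directly; only the holes cost an inference call."""
    out, model_calls = [], 0
    for part in template_parts:
        if part is not None:
            out.append(part)          # forced by the template: free
        else:
            out.append(sample_fn())   # model only runs for the holes
            model_calls += 1
    return "".join(out), model_calls

# One hole in a three-part JSON template -> exactly one "model" call.
text, calls = decode(['{"name": "', None, '"}'], lambda: "Ada")
print(text, calls)  # {"name": "Ada"} 1
```

In a real system the "forced" tokens still have to be fed through the model to update the KV cache, but that prefill is much cheaper than sampling them one by one.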