| | chain-of-thought-hub | lmql |
|---|---|---|
| Mentions | 10 | 30 |
| Stars | 2,371 | 3,320 |
| Growth (stars, month over month) | - | 2.9% |
| Activity | 6.9 | 9.5 |
| Last commit | 10 days ago | about 1 month ago |
| Language | Jupyter Notebook | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
chain-of-thought-hub
- Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance
- All Model Leaderboards (that I know)
Chain-of-Thought Hub (https://github.com/FranxYao/chain-of-thought-hub) - these are mostly gathered, although Yao Fu, the author, is working on specific CoT runs.
- It looks likely that the MMLU score on Hugging Face's LLM leaderboard is wrong after all.
- (2/2) May 2023
Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance (https://github.com/FranxYao/chain-of-thought-hub)
- Ask HN: Is it just me or GPT-4's quality has significantly deteriorated lately?
https://github.com/FranxYao/chain-of-thought-hub
- [N] Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance
- Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance
lmql
- Show HN: Fructose, LLM calls as strongly typed functions
- Prompting LLMs to constrain output
I have been experimenting with guidance and LMQL. It's a bit too early to give any well-formed opinions, but I really do like the idea of constraining LLM output.
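For a flavor of what that constraining looks like, here is a minimal sketch using LMQL's Python decorator API; the query name, prompt, and label set are illustrative assumptions, not taken from the post above:

```python
import lmql

# Minimal sketch, not from the quoted post: the `where` clause constrains
# the [LABEL] hole so decoding can only yield one of the listed strings,
# rather than free-form text.
@lmql.query
def sentiment(review):
    '''lmql
    "Review: {review}\n"
    "Sentiment: [LABEL]" where LABEL in ["positive", "neutral", "negative"]
    return LABEL
    '''

# e.g. sentiment("The food was great!") -> "positive"
# (runs against the default/configured model backend)
```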
- [D] Prompt Engineering Seems Like Guesswork - How To Evaluate LLM Application Properly?
The only time I've ever felt like it was anything other than guesswork was using LMQL. Not coincidentally, LMQL works with LLMs as autocomplete engines rather than Q&A ones.
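That autocomplete framing is visible in how LMQL programs are written: the prompt is a template the model completes hole by hole, while the surrounding text and control flow stay in the program. A sketch under that assumption, adapted from the style of LMQL's documentation examples:

```python
import lmql

# Sketch of the "autocomplete" style: the model only fills the [THING]
# holes; the list markers and the loop are fixed by the program.
@lmql.query
def packing_list():
    '''lmql
    "A list of things not to forget when going to the beach:\n"
    for i in range(3):
        "-[THING]" where STOPS_AT(THING, "\n")
    '''
```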
- Guidance for selecting a function-calling library?
LMQL
- Show HN: Magentic – Use LLMs as simple Python functions
This is also similar in spirit to LMQL: https://github.com/eth-sri/lmql
- Show HN: LLMs can generate valid JSON 100% of the time
- LangChain Agent Simulation – Multi-Player Dungeons and Dragons
- The Problem with LangChain
LLM calls are just function calls, so most functional composition is already afforded by any general-purpose language out there. If you need fancy stuff, use something like Python's functools.
Working on https://github.com/eth-sri/lmql (shameless plug, sorry), we have always found that the compositional abstractions on top of LMQL are mostly there already, once you internalize that prompts are functions.
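As a rough sketch of that point, plain functools-style composition already chains prompt steps; the `llm()` helper below is a hypothetical stand-in for whichever completion API you use, not part of LMQL:

```python
from functools import reduce
from typing import Callable

def llm(prompt: str) -> str:
    # Hypothetical stand-in for any completion API call
    # (OpenAI, a local model, ...); replace with your provider's client.
    raise NotImplementedError

def summarize(text: str) -> str:
    return llm(f"Summarize in one paragraph:\n\n{text}")

def to_bullets(text: str) -> str:
    return llm(f"Rewrite as bullet points:\n\n{text}")

def compose(*fns: Callable[[str], str]) -> Callable[[str], str]:
    # Left-to-right composition: compose(f, g)(x) == g(f(x)).
    return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)

digest = compose(summarize, to_bullets)  # digest(article) runs both steps
```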
- Is there a UI that can limit LLM tokens to a preset list?
- Local LLMs: After Novelty Wanes
LMQL is another.
What are some alternatives?
DB-GPT - AI-native data app development framework with AWEL (Agentic Workflow Expression Language) and agents
guidance - A guidance language for controlling large language models. [Moved to: https://github.com/guidance-ai/guidance]
llm-leaderboard - A joint community effort to create one central leaderboard for LLMs.
tree-of-thoughts - Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models, which elevates model reasoning by at least 70%
simpleaichat - Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
airoboros - Customizable implementation of the self-instruct paper.
NeMo-Guardrails - NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
llm-humaneval-benchmarks
guardrails - Adding guardrails to large language models.
GirlfriendGPT - Girlfriend GPT is a Python project to build your own AI girlfriend using ChatGPT 4.0
basaran - Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.