I couldn't agree more. You should check out LLMWare's SLIM agents (https://github.com/llmware-ai/llmware/tree/main/examples/SLI...). The project focuses on pretty much exactly this and on chaining multiple local LLMs together.
A really good topic that ties in with this is the need for deterministic sampling (I may have the terminology slightly wrong), depending on what the model is intended for. The LLMWare team did a good two-part video on this as well (https://www.youtube.com/watch?v=7oMTGhSKuNY).
I think dedicated miniature LLMs are the way forward.
Disclaimer - Not affiliated with them in any way, just think it's a really cool project.
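To make the "deterministic sampling" point above concrete, here is a minimal sketch in plain Python. It is not LLMWare's code; the function names and the toy logits are hypothetical. It assumes the usual convention that a temperature of zero means "always pick the highest-scoring token" (greedy decoding), and that a fixed random seed makes stochastic sampling reproducible.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Return a token index (hypothetical helper, not a library API).

    temperature <= 0 is treated as greedy decoding, which is fully
    deterministic. Otherwise a token is drawn from the tempered distribution.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Toy logits for a three-token vocabulary (made up for illustration).
toy_logits = [1.0, 3.0, 2.0]

# Greedy decoding always returns the index of the largest logit.
greedy_choice = sample_token(toy_logits, 0.0, random.Random())

# Seeded sampling varies from draw to draw, but the whole run repeats
# exactly when the generator is re-created with the same seed.
rng = random.Random(42)
sampled_run = [sample_token(toy_logits, 0.8, rng) for _ in range(5)]
```

For something like function calling or classification, the greedy path is usually what you want; the seeded path only gives reproducibility if everything else in the pipeline is deterministic as well.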
If I'm reading this correctly, they had to discard the Llama 2 answers and use only the answers given by GPT-3.5 to test the hypothesis.
GPT-3.5 answering questions through the OpenAI API alone is not an acceptable way to test problem-solving ability across a range of temperatures. OpenAI does some black-box wizardry on their end.
There are many complex and clever sampling techniques, of which temperature is just one (possibly dynamic) component.
One example from the llama.cpp codebase is dynamic temperature sampling:
https://github.com/ggerganov/llama.cpp/pull/4972/files
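For anyone unfamiliar with the idea, here is a rough Python sketch of one way a dynamic temperature can work: scale the temperature by how uncertain the model is, measured as the entropy of the next-token distribution. This is an illustration under my own assumptions, not a port of the linked PR; the parameter names `min_temp`, `max_temp`, and `exponent` and their defaults are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_temperature(logits, min_temp=0.5, max_temp=1.5, exponent=1.0):
    """Pick a temperature between min_temp and max_temp from the entropy.

    A confident (peaked) distribution has low entropy and gets a temperature
    near min_temp; a flat distribution has high entropy and gets a
    temperature near max_temp. All parameters are illustrative assumptions.
    """
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(logits))  # entropy of a uniform distribution
    if max_entropy == 0:
        return min_temp  # single-token vocabulary: nothing to scale
    normalized = entropy / max_entropy
    return min_temp + (max_temp - min_temp) * normalized ** exponent


# A flat distribution maps to roughly max_temp, a peaked one to roughly min_temp.
flat_temp = dynamic_temperature([0.0, 0.0, 0.0, 0.0])
peaked_temp = dynamic_temperature([100.0, 0.0, 0.0, 0.0])
```

The resulting temperature would then be applied to the logits before the usual sampling step, so the model stays conservative when it is sure and explores more when several tokens are plausible.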
Not sure what you mean by "whole model state", given that there are tens of thousands of possible tokens and the models have billions of parameters in XX,XXX-dimensional space. How many queries across how many sampling methods might you need? Err... how much time? :)