Local-LLM-Comparison-Colab-UI
 | Local-LLM-Comparison-Colab-UI | simple-proxy-for-tavern
---|---|---
Mentions | 20 | 21
Stars | 886 | 111
Growth | - | -
Activity | 9.1 | 8.0
Last commit | 4 days ago | 10 months ago
Language | Jupyter Notebook | JavaScript
License | - | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Local-LLM-Comparison-Colab-UI
- Mistral 7B OpenOrca outclasses Llama 2 13B variants
- GPT-4 API general availability
In terms of speed, we're talking about 140 t/s for 7B models and 40 t/s for 33B models on a 3090/4090 now.[1] (1 token ~= 0.75 words.) It's quite zippy. llama.cpp now performs comparably on Nvidia GPUs (though they don't have a handy chart), and you can get decent performance with 13B models on M1/M2 Macs.
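As a rough worked example of the figures above, converting tokens per second to words per second is a single multiplication. The sketch below uses the 0.75 words-per-token rule of thumb and the 140 t/s and 40 t/s figures quoted in the post; the function name `words_per_second` is made up for illustration.

```python
# Rule of thumb from the post: 1 token is roughly 0.75 words.
WORDS_PER_TOKEN = 0.75


def words_per_second(tokens_per_second, words_per_token=WORDS_PER_TOKEN):
    """Convert a generation speed in tokens/s to an approximate words/s."""
    return tokens_per_second * words_per_token


# Speeds quoted in the post for a 3090/4090.
for label, tps in [("7B", 140), ("33B", 40)]:
    print(f"{label}: {tps} t/s is roughly {words_per_second(tps):.0f} words/s")
# 7B: 140 t/s is roughly 105 words/s
# 33B: 40 t/s is roughly 30 words/s
```

The words-per-token ratio varies with the tokenizer and the text, so treat these numbers as a ballpark.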
You can take a look at a list of evals here: https://llm-tracker.info/books/evals/page/list-of-evals - for general usage, I think home-rolled evals like llm-jeopardy [2] and local-llm-comparison [3] by hobbyists are more useful than most of the benchmark rankings.
That being said, personally I mostly use GPT-4 for code assistance, so that's what I'm most interested in, and the latest code assistants are scoring quite well: https://github.com/abacaj/code-eval - a recent replit-3b fine-tune tops the human-eval results for open models (as a point of reference, GPT-3.5 gets 60.4 on pass@1 and 68.9 on pass@10 [4]). I've only just started playing around with it, since replit model tooling is not as good as llama's (doc here: https://llm-tracker.info/books/howto-guides/page/replit-mode...).
I'm interested in potentially applying reflexion or some of the other techniques that have been tried to even further increase coding abilities. (InterCode in particular has caught my eye https://intercode-benchmark.github.io/)
[1] https://github.com/turboderp/exllama#results-so-far
[2] https://github.com/aigoopy/llm-jeopardy
[3] https://github.com/Troyanovsky/Local-LLM-comparison/tree/mai...
[4] https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder
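For context on the pass@1 and pass@10 numbers quoted above: HumanEval-style scores are commonly computed with the unbiased pass@k estimator from the paper that introduced HumanEval. It estimates the chance that at least one of k samples passes the tests, given that n samples were generated per problem and c of them passed. The post does not say how the cited scores were computed, so applying this estimator to them is an assumption; the function name and the sample counts below are illustrative only.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: attempt budget.
    Computes 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 10 samples generated, 6 of them passed.
print(pass_at_k(10, 6, 1))   # chance a single attempt passes
print(pass_at_k(10, 6, 10))  # chance at least one of ten attempts passes
```

A benchmark score is then the mean of this value over all problems, usually reported as a percentage.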
- Best 7B model
The best 7B I tried is WizardLM. It's my go-to model.
- UltraLM-13B reaches top of AlpacaEval leaderboard
If you want to try it out, you can use Google Colab here with Oobabooga Text Generation UI: Link (Remember to check the instruction template and generation parameters)
- wizardLM-7B.q4_2
I'm really impressed by wizardLM-7B.q4_2 (GPT4All) running on my 8 GB M2 MacBook Air. Fast responses, fewer hallucinations than other 7B models I've tried. GPT4All's beta document collection and query function is respectable; going to test it more tomorrow. FWIW, wizardLM-7B.q4_2 was ranked very high here: https://github.com/Troyanovsky/Local-LLM-comparison.
- Help me discover new LLMs for school project
I made a series of Colab notebooks for different models: https://github.com/Troyanovsky/Local-LLM-comparison
- Nous Hermes 13b is very good.
I found it performing very well too in my testing (Repo). It's my second favorite model after WizardLM-13B.
- How to train 7B models with small documents?
- What are your favorite LLMs?
My entire list at: Local LLM Comparison Repo
- Announcing Nous-Hermes-13b (info link in thread)
I just tried HyperMantis and updated the results in the repo. It performs decently, but worse than Nous-Hermes-13B.
simple-proxy-for-tavern
- ST Proxy Down after updating?
Been using 1.9 for a while now, but heard about all the nice new extras in 1.1, so I decided to upgrade today. Silly me! Well, ST itself is running fine (I've got that up with no issues), Extras is installed, and koboldcpp is up to date... but the proxy won't connect ST to KCPP. The proxy being https://github.com/anon998/simple-proxy-for-tavern. It's worked like a dream for weeks, up until today when I went to update, and now the two can't talk. I did a fresh git pull of the new version: nothing. Rolled back to 1.9.7, and it works like a charm. Anyone else had these issues?
- Silly Tavern Proxy for OpenAI Down with latest update?
https://github.com/anon998/simple-proxy-for-tavern/issues/25 It threw this error at me for a moment when I tried OpenAI, but not when running my local LLM via Kobold.
- GPT4all and koboldcpp/etc
Still, nothing beats the SillyTavern + simple-proxy-for-tavern setup for me. But currently there's a known issue with that setup and koboldcpp regarding the sampler order used in the proxy presets (a PR with the fix is waiting to be merged; until then, manually changing the presets may be required).
- What's the closest thing we have to GPT4's code interpreter right now?
- HOW IN GOD NAME DO I CONNECT THE API FOR SIMPLE PROXY FOR TAVERN??
I’m using this https://github.com/anon998/simple-proxy-for-tavern
- What is the best text web ui currently?
SillyTavern is just a frontend so it's as fast or slow as your backend. With simple-proxy-for-tavern you can use llama.cpp directly, no Python involved, so SillyTavern will be as fast as llama.cpp itself.
- Dual 3090 and NVlink (or not) for 65B models with ooba and 4bit 65B models
- Oobabooga and llama.cpp in longer conversations answers take forever.....
I also use simple-proxy-for-tavern in between koboldcpp and the frontend, which does some magic behind the scenes to improve things further. It takes some time to understand and configure it all, but once done, it's definitely worth the effort, as nothing beats that setup for roleplaying and chat.
- koboldcpp-1.33 Ultimate Edition released!
Really? Then we definitely have different experiences (or different ways to interact) with Guanaco. It's been the most unrestricted model I've tried, and I tried them all, but I'm using SillyTavern and simple-proxy-for-tavern, which, combined with a little prompting, liberates basically any model.
- The best 13B model for roleplay?
Why reinvent the wheel? Just use SillyTavern, ideally with simple-proxy-for-tavern. That does it all, and more.
What are some alternatives?
langflow - ⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.
koboldcpp - A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
private-gpt - Interact with your documents using the power of GPT, 100% privately, no data leaks
SillyTavern - LLM Frontend for Power Users. [Moved to: https://github.com/SillyTavern/SillyTavern]
GPTQ-for-LLaMa - 4 bits quantization of LLaMa using GPTQ
SillyTavern-Extras - Extensions API for SillyTavern.
SillyTavern - LLM Frontend for Power Users.
alpaca_eval - An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
lollms-webui - Lord of Large Language Models Web User Interface
can-ai-code - Self-evaluating interview for AI coders
gpt-code-ui - An open source implementation of OpenAI's ChatGPT Code interpreter