| | promptfoo | WizardVicunaLM |
|---|---|---|
| Mentions | 20 | 12 |
| Stars | 2,830 | 708 |
| Growth | 21.2% | - |
| Activity | 9.9 | 6.8 |
| Latest commit | 7 days ago | 12 months ago |
| Language | TypeScript | - |
| License | MIT License | - |
- Stars: the number of stars a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits carry more weight than older ones. For example, an activity of 9.0 means a project is among the top 10% of the most actively developed projects we track.
promptfoo
- Google CodeGemma: Open Code Models Based on Gemma [pdf]
- AI Infrastructure Landscape
- Promptfoo – Testing and Evaluation for LLMs
- Show HN: Prompt-Engineering Tool: AI-to-AI Testing for LLM
Super interesting. We've been experimenting with [promptfoo](https://github.com/promptfoo/promptfoo) at my work, and this looks very similar.
- GitHub – promptfoo/promptfoo: Test your prompts
- I asked 60 LLMs a set of 20 questions
In case anyone's interested in running their own benchmark across many LLMs, I've built a generic harness for this at https://github.com/promptfoo/promptfoo.
I encourage people considering LLM applications to test the models on their _own data and examples_ rather than extrapolating general benchmarks.
This library supports OpenAI, Anthropic, Google, Llama and CodeLlama, any model on Replicate, and any model on Ollama, among others, out of the box. As an example, I wrote up a benchmark comparing GPT model censorship with Llama models here: https://promptfoo.dev/docs/guides/llama2-uncensored-benchmar.... Hope this helps someone.
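The harness described above is driven by a declarative config file. As a rough sketch based on promptfoo's documented `promptfooconfig.yaml` format (the provider IDs and assertion shown here are illustrative and should be checked against the current docs):

```yaml
# promptfooconfig.yaml -- run the same prompt across several providers
prompts:
  - "Answer concisely: {{question}}"

providers:
  - openai:gpt-3.5-turbo
  - anthropic:claude-2   # illustrative provider IDs; verify against the docs
  - ollama:llama2

tests:
  - vars:
      question: What is the capital of France?
    assert:
      - type: contains
        value: Paris
```

Running `promptfoo eval` against a file like this produces a side-by-side matrix of each provider's output with pass/fail assertion results.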
- Ask HN: Prompt Manager for Developers
- DeepEval – Unit Testing for LLMs
- Show HN: Knit – A Better LLM Playground
- Show HN: CLI for testing and evaluating LLM outputs
WizardVicunaLM
- WizardLM-13B-V1.0-Uncensored
HELP! I need some clarification. I'm familiar with Wizard-Vicuna-13b-Uncensored, which is EHartford's uncensoring of WizardVicunaLM.
- Ask HN: Should I cancel my GPT-4 subscription and get Copilot instead?
> I’m also open to open source models but I hear they’re not even as good as gpt3.5.
WizardVicunaLM claims ~97% performance relative to GPT3.5: https://github.com/melodysdreamj/WizardVicunaLM
It's not particularly great at generating code, but it's uncensored and writes fantastic prose. I've been using it for the last week and I'm really satisfied with where it stands.
> It’s sad that we’re stuck in this monopoly of powerful LLMs.
Won't anyone just sponsor a few months of dedicated GPU training, fine-tuning, and quantizing so they can be held legally accountable for its output?
I wouldn't hold my breath.
- Wizard-Vicuna-30B-Uncensored
Also, I just noticed that you may have forgotten to update the readme, which references 13b, not 30b; though maybe that was intentional. (If you linked directly to the GitHub ("WizardVicunaLM"), that would make it a bit easier for people like me to follow.)
- Where we're at with self-hosted AI today?
There are a lot of options. Right now I'm using WizardVicunaLM to great success: https://github.com/melodysdreamj/WizardVicunaLM
It combines the uncensored WizardLM data with the Vicuna tuning to create a surprisingly high-performance model. If the chart on their GitHub page is to be believed, their model approaches GPT-3.5 performance.
- WizardLM-30B-Uncensored
Here are the codebase and dataset for WizardVicuna:
- https://github.com/melodysdreamj/WizardVicunaLM
- https://github.com/lm-sys/FastChat
- https://huggingface.co/datasets/RyokoAI/ShareGPT52K
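The ShareGPT52K dataset referenced above stores each conversation as a JSON record with alternating human/gpt turns. A minimal sketch of flattening that layout into instruction/response pairs (the record here is made up for illustration; only the field names follow the dataset's documented schema):

```python
import json

# A single record in the ShareGPT layout: an "id" plus a list of
# turns, each tagged with who produced it ("human" or "gpt").
record = json.loads("""
{
  "id": "example-0",
  "conversations": [
    {"from": "human", "value": "What is instruction tuning?"},
    {"from": "gpt", "value": "Fine-tuning a model on instruction/response pairs."}
  ]
}
""")

def to_pairs(rec):
    """Collapse a ShareGPT conversation into (prompt, response) pairs."""
    turns = rec["conversations"]
    return [
        (turns[i]["value"], turns[i + 1]["value"])
        for i in range(0, len(turns) - 1, 2)
        if turns[i]["from"] == "human" and turns[i + 1]["from"] == "gpt"
    ]

pairs = to_pairs(record)
print(pairs[0][0])  # prints the human turn: "What is instruction tuning?"
```

Pairs extracted this way are what instruction-tuning pipelines like WizardVicunaLM's train on, after whatever filtering or reformatting the project applies.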
- LLM that combines the principles of wizardLM and vicunaLM
- [P] airoboros 7b - instruction tuned on 100k synthetic instruction/responses
I used the same questions from WizardVicunaLM:
- Is there a "rut" that we're in on the way to general AI?
- WizardLM-13B-Uncensored
- Weekly Megathread
https://github.com/melodysdreamj/WizardVicunaLM - combining the WizardLM and Vicuna principles. Made by u/Clear-Jelly2873
What are some alternatives?
shap-e - Generate 3D objects conditioned on text or images
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
prompt-engineering - Tips and tricks for working with Large Language Models like OpenAI's GPT-4.
llama.cpp - LLM inference in C/C++
WizardLM - Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder and WizardMath
koboldcpp - A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
chat-ui - Open source codebase powering the HuggingChat app
nsfw-prompt-detection-sd - NSFW Prompt Detection for Stable Diffusion
litellm - Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
ChainForge - An open-source visual programming environment for battle-testing prompts to LLMs.
airoboros - Customizable implementation of the self-instruct paper.