LLM-As-Chatbot
hh-rlhf
| | LLM-As-Chatbot | hh-rlhf |
|---|---|---|
| Mentions | 3 | 6 |
| Stars | 3,242 | 1,447 |
| Growth | - | 2.5% |
| Activity | 9.0 | 3.6 |
| Last commit | 6 months ago | 8 months ago |
| Language | Python | - |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
LLM-As-Chatbot
- OpenAI's GPT-4 Red Teamer Nathan Labenz: the GPT-4 base model recommends assassinating humans, naming specific targets
The first one is from https://github.com/deep-diver/Alpaca-LoRA-Serve
- Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA
this is useless because it doesn't handle context:
Q: Name five genres of music.
A: Jazz, country, hip-hop, blues, classical.
Q: Name a famous artist from the third genre.
A: Salvador Dalí.
Whereas this one actually supports context: https://github.com/deep-diver/Alpaca-LoRA-Serve
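The difference between the two behaviours above usually comes down to whether earlier turns are replayed to the model. Below is a minimal sketch of that idea, assuming a plain `Q:`/`A:` prompt format; the `Conversation` class and its methods are hypothetical names for illustration and are not taken from Alpaca-LoRA-Serve or ChatLLaMA.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Keeps prior turns so each new prompt carries the full history."""

    turns: list[tuple[str, str]] = field(default_factory=list)

    def build_prompt(self, question: str) -> str:
        # Replay every earlier Q/A pair, then append the new question.
        lines = []
        for q, a in self.turns:
            lines.append(f"Q: {q}")
            lines.append(f"A: {a}")
        lines.append(f"Q: {question}")
        lines.append("A:")
        return "\n".join(lines)

    def record(self, question: str, answer: str) -> None:
        # Store a finished exchange so later prompts can include it.
        self.turns.append((question, answer))


chat = Conversation()
chat.record("Name five genres of music.", "Jazz, country, hip-hop, blues, classical.")
prompt = chat.build_prompt("Name a famous artist from the third genre.")
print(prompt)
```

With the first exchange included in the prompt, "the third genre" has something to refer to; a chatbot that sends only the latest question gives the model no way to resolve it.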
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
hh-rlhf
- Meta wants its open source AI model to be as capable as OpenAI’s best model
If you ask an LLM to complete a sentence like '[Insert name] stole the fruit (true/false):'
An aligned LLM will be biased towards refusing to answer at all with something like: "I can't tell you because I don't know them."
An "uncensored" LLM will very happily return <"true"> or <"false"> with a probability attached to each. Even OpenAI's GPT-3 does with a low enough temperature.
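To illustrate the remark about temperature, here is a rough sketch of how dividing logits by a temperature changes the probability attached to each answer. The logit values are made up for the example, and `softmax_with_temperature` is a hypothetical helper, not any particular library's API.

```python
import math


def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical logits for the two candidate completions.
logits = {"true": 1.2, "false": 0.8}

warm = softmax_with_temperature(logits, temperature=1.0)
cold = softmax_with_temperature(logits, temperature=0.1)
print(warm)
print(cold)
```

At the lower temperature the slightly preferred answer takes nearly all of the probability mass, so a small bias in the logits turns into a confident-looking "true" or "false".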
Of course, LLM attention doesn't work like that. The tokens are just a bag of numbers:
- The fact the name 'John' is mentioned in the Bible a lot affects the distribution when you ask if any John stole, because John is always [7554]
- The fact that 'Olf' is part of Adolf and Adolf Hitler is mentioned in a lot of negative sentences will drag the distribution, because 'Olf' is always [4024] and Adolf is always [324, 4024]
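The point that a piece of text always maps to the same number, whatever sentence it appears in, can be sketched with a toy greedy tokenizer. The vocabulary below is hypothetical: it reuses the IDs quoted in the comment without checking them against any real tokenizer, and it assumes a lowercase `olf` piece.

```python
# Hypothetical vocabulary; the IDs are the ones quoted in the comment, unverified.
VOCAB = {"John": 7554, "Ad": 324, "olf": 4024}


def tokenize(word: str, vocab: dict[str, int]) -> list[int]:
    """Split a word into known pieces by greedy longest match and return their IDs."""
    ids = []
    pos = 0
    while pos < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), pos, -1):
            piece = word[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise KeyError(f"no vocabulary entry covers {word[pos:]!r}")
    return ids


print(tokenize("John", VOCAB))
print(tokenize("Adolf", VOCAB))
print(tokenize("olf", VOCAB))
```

The lookup never sees the surrounding sentence, so whatever associations the model has learned for an ID apply wherever that ID shows up.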
You could have asked something with no logical probability difference at all, like:
- 'The store attendant's name was [name], did the child in Long Island drop his ball (true/false):'
And unless you train the model to give you disclaimers, it still follows the instruction faithfully and returns true/false with probabilities, demonstrating a deep regression in reasoning...
That's why for models past a certain size, alignment increases performance: https://arxiv.org/abs/2204.05862.
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- OpenDILab Awesome Paper Collection: RL with Human Feedback (3)
Title: Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA
It just hasn't been prompted or fine-tuned to have the neutral, self-effacing personality of ChatGPT.
It's doing the pure "try to guess the most likely next token" task on which they were both trained (https://heartbeat.comet.ml/causal-language-modeling-with-gpt...), before the reinforcement learning from human feedback that makes them more tool-like (https://arxiv.org/abs/2204.05862), with a bit of randomness added for variety's sake (https://huggingface.co/blog/how-to-generate).
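As a rough illustration of "guess the most likely next token, with a bit of randomness", the sketch below contrasts greedy selection with sampling from the same distribution. The probabilities are invented for the example, and `greedy` and `sample` are hypothetical helper names rather than functions from any generation library.

```python
import random


def greedy(probs: dict[str, float]) -> str:
    """Always pick the single most likely token."""
    return max(probs, key=probs.get)


def sample(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution.
probs = {"the": 0.5, "a": 0.3, "one": 0.2}

rng = random.Random(0)  # seeded so repeated runs give the same draws
draws = [sample(probs, rng) for _ in range(1000)]

print(greedy(probs))
print({tok: draws.count(tok) for tok in probs})
```

Greedy selection returns the same token every time, while sampling occasionally picks the less likely ones, which is where the variety in generated text comes from.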
- [D] Is Anthropic influential in research?
They have done good work like releasing their paper and dataset for training an assistant RLHF model. https://github.com/anthropics/hh-rlhf
- [R] Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned - Anthropic - Ganguli et al 2022
Github: https://github.com/anthropics/hh-rlhf
What are some alternatives?
alpaca-lora - Instruct-tune LLaMA on consumer hardware
nebuly - The user analytics platform for LLMs
simple-llm-finetuner - Simple UI for LLM Model Finetuning
stanford_alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data.
peft - 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
awesome-RLHF - A curated list of reinforcement learning with human feedback resources (continually updated)
alpaca-7b-truss
alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM (Android/Linux/Windows/Mac)
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.