Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • nebuly

    The user analytics platform for LLMs

  • How does it differ from the original ChatLLaMA? https://github.com/nebuly-ai/nebullvm/tree/main/apps/acceler...

  • alpaca-7b-truss

  • ChatLLaMA is an experimental chatbot interface for interacting with variants of Facebook's LLaMA. Currently, we support the 7 billion parameter variant that was fine-tuned on the Alpaca dataset. This early version isn't as conversational as we'd like, but over the next week or so we're planning to add support for the 30 billion parameter variant, another variant fine-tuned on LAION's OpenAssistant dataset, and more as we explore what this model is capable of.

    If you want to deploy your own instance of the model powering the chatbot and build something similar, we've open-sourced the Truss here: https://github.com/basetenlabs/alpaca-7b-truss

    We'd love to hear any feedback you have. You can reach me on Twitter @aaronrelph or Abu (the engineer behind this) @aqaderb.

    Disclaimer: We both work at Baseten. This was a weekend project. Not trying to shill anything; just want to build and share cool stuff.

  • hh-rlhf

    Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

  • It just hasn't been prompted or fine-tuned to have the neutral, self-effacing personality of ChatGPT.

    It's doing the pure "try to guess the most likely next token" task on which they were both trained (https://heartbeat.comet.ml/causal-language-modeling-with-gpt...), before the reinforcement learning from human feedback that makes them more tool-like (https://arxiv.org/abs/2204.05862), with a bit of randomness added for variety's sake (https://huggingface.co/blog/how-to-generate).
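
A minimal sketch of the "guess the next token, with a bit of randomness" idea from the comment above. It is not code from ChatLLaMA or any project listed here: the function name `sample_next_token`, the toy scores, and the use of temperature-scaled softmax sampling are all assumptions chosen for illustration.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, rng=random):
    """Pick one token from a dict of token -> score (hypothetical toy logits).

    temperature <= 0 means greedy decoding (always the top-scoring token);
    higher temperatures flatten the distribution and add variety.
    """
    if temperature <= 0:
        return max(logits, key=logits.get)

    # Temperature-scaled softmax, shifted by the max score for numerical stability.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())

    # Draw a point in [0, total) and walk the cumulative weights to find its token.
    threshold = rng.random() * total
    cumulative = 0.0
    for tok, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return tok
    return tok  # fallback for floating-point rounding at the upper edge


if __name__ == "__main__":
    toy_logits = {"jazz": 2.0, "blues": 1.0, "rock": 0.5}  # made-up scores
    print(sample_next_token(toy_logits, temperature=0))    # greedy choice
    print(sample_next_token(toy_logits, temperature=1.0))  # random draw
```

With `temperature=0` the function always returns the highest-scoring token. Any positive temperature lets lower-scoring tokens through some of the time, which is one way a small model can produce a fluent but wrong continuation.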

  • til

    Today I Learned (by simonw)

  • stanford_alpaca

    Code and documentation to train Stanford's Alpaca models, and generate the data.

  • The original Alpaca repo has the training script. The readme has the torchrun command and arguments used for train.py. https://github.com/tatsu-lab/stanford_alpaca/blob/main/train...

  • alpaca-lora

    Instruct-tune LLaMA on consumer hardware

  • did you use the cleaned and improved alpaca dataset from https://github.com/tloen/alpaca-lora/issues/28 ?

  • chatllama

    ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable in a single GPU. 15x faster training process than ChatGPT

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • most implementations do, like https://github.com/oobabooga/text-generation-webui

    this might be a hallucinated answer, due to the very small model size of 7b. try the 13b-4bit, it's much better!

  • LLM-As-Chatbot

    LLM as a Chatbot Service

  • this is useless because it doesn't handle context:

    Q: Name five genres of music.

    A: Jazz, country, hip-hop, blues, classical.

    Q: Name a famous artist from the third genre.

    A: Salvador Dalí.

    Whereas this one actually supports context: https://github.com/deep-diver/Alpaca-LoRA-Serve
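
One common way a chatbot "handles context", as the comment above puts it, is to prepend earlier turns to each new prompt so the model can resolve references such as "the third genre". The sketch below shows that idea only. The helper `build_prompt` and the `Q:`/`A:` layout (borrowed from the transcript above) are assumptions, not the actual implementation of ChatLLaMA or Alpaca-LoRA-Serve.

```python
def build_prompt(history, question):
    """Join earlier (question, answer) turns and the new question into one prompt.

    `history` is a list of (question, answer) tuples from earlier turns.
    The prompt ends with a bare "A:" so the model continues with its answer.
    """
    lines = []
    for past_question, past_answer in history:
        lines.append(f"Q: {past_question}")
        lines.append(f"A: {past_answer}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


if __name__ == "__main__":
    history = [
        ("Name five genres of music.",
         "Jazz, country, hip-hop, blues, classical."),
    ]
    print(build_prompt(history, "Name a famous artist from the third genre."))
```

Without the history, the model sees only the second question and has nothing to tie "the third genre" to. With it, the list of genres is part of the text the model continues from.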

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
