[R] Comprehensive List of Instruction Datasets for Training LLM Models (GPT-4 & Beyond)

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • awesome-totally-open-chatgpt

    A list of totally open alternatives to ChatGPT

  • instruct-eval

    This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

  • Great resource! I’ve recently also benchmarked many of the popular instruction models here: https://github.com/declare-lab/flan-eval

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Eval mmlu result against various infer methods (HF_Causal, VLLM, AutoGPTQ, AutoGPTQ-exllama)

    1 project | /r/LocalLLaMA | 8 Sep 2023
  • [D] Red Pajamas Instruct 7B. Is it really that bad or is it some ggml/quantization artifact? Vicuna-7b has no issue writing stories and even does basic text transformation. Yet RP refuses to do anything most of the time. It does generate a story if you run it as a raw model, but gets into a loop.

    1 project | /r/MachineLearning | 27 May 2023
  • [P] The first RedPajama models are here! The 3B and 7B models are now available under Apache 2.0, including instruction-tuned and chat versions. These models aim to replicate LLaMA as closely as possible.

    1 project | /r/MachineLearning | 6 May 2023
  • Best Instruct-Trained Alternative to Alpaca/Vicuna?

    2 projects | /r/LanguageTechnology | 23 Apr 2023
  • Llama.cpp Bfloat16 Support

    1 project | news.ycombinator.com | 30 Apr 2024