Finetuning Large Language Models -- An introduction to the core ideas and approaches

This page summarizes the projects mentioned and recommended in the original post on /r/learnmachinelearning

  • notebooks

    Repo for various Jupyter notebooks. (by cmauck10)

  • Cool read! I just finished up a notebook where I show how noisy labels can drastically impact the performance of OpenAI LLMs. I first fine-tune the well-known Davinci model (the backbone of ChatGPT) on the original data and report an accuracy of 63%. I then use the open-source package cleanlab to find examples that are incorrectly labeled and drop them from the training data. This step increases the fine-tuned model's accuracy to 66% (better accuracy with less data). Finally, I correct the mislabeled examples, and the fine-tuned model's accuracy jumps to 77%!

  • examples

    Notebooks demonstrating example applications of the cleanlab library (by cleanlab)


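The comment above describes a three-step workflow: train on the original labels, drop the examples flagged as mislabeled, then correct them instead of dropping. The sketch below illustrates that idea in plain Python. It is not the notebook's code and it does not call cleanlab; it is a simplified, confident-learning-style heuristic written for illustration. The function names (`find_likely_label_errors`, `drop_flagged`, `relabel_flagged`) and the toy data are hypothetical. It assumes you already have out-of-sample predicted probabilities for each training example, and it uses the model's top prediction as a stand-in for a corrected label, whereas the original comment does not say how the labels were corrected.

```python
from collections import defaultdict


def find_likely_label_errors(labels, pred_probs):
    """Return (index, suggested_label) pairs, most suspicious first.

    Simplified heuristic (an assumption, not cleanlab's implementation):
    an example is flagged when the model prefers a different class and its
    probability for that class reaches that class's average self-confidence.
    """
    # Per-class threshold: mean probability assigned to class k
    # over the examples whose given label is k.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, probs in zip(labels, pred_probs):
        sums[y] += probs[y]
        counts[y] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    flagged = []
    for i, (y, probs) in enumerate(zip(labels, pred_probs)):
        guess = max(range(len(probs)), key=probs.__getitem__)
        if guess != y and guess in thresholds and probs[guess] >= thresholds[guess]:
            # Sort key: probability of the given label (lower = more suspicious).
            flagged.append((probs[y], i, guess))
    flagged.sort()
    return [(i, guess) for _, i, guess in flagged]


def drop_flagged(texts, labels, issues):
    """Strategy 1: remove flagged examples from the training data."""
    bad = {i for i, _ in issues}
    return [(t, y) for i, (t, y) in enumerate(zip(texts, labels)) if i not in bad]


def relabel_flagged(labels, issues):
    """Strategy 2: replace flagged labels (here with the model's top guess)."""
    fixed = list(labels)
    for i, guess in issues:
        fixed[i] = guess
    return fixed


# Toy data: the last example is labeled 0 but the model strongly predicts 1.
texts = ["a", "b", "c", "d", "e"]
labels = [0, 0, 1, 1, 0]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.1, 0.9],
]

issues = find_likely_label_errors(labels, pred_probs)
cleaned = drop_flagged(texts, labels, issues)
corrected = relabel_flagged(labels, issues)
print(issues, len(cleaned), corrected)
```

Either result (`cleaned` or `corrected`) would then be used as the training set for another fine-tuning run. In practice, flagged examples are usually worth a human review before relabeling. The cleanlab library offers its own routine for this step, `cleanlab.filter.find_label_issues`, which takes the given labels and predicted probabilities.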