Finetuning Large Language Models

InfluxDB - Power Real-Time Data Analytics at Scale

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

www.influxdata.com

featured

SaaSHub - Software Alternatives and Reviews

SaaSHub helps you find the best software and product alternatives

www.saashub.com

featured

openai-cookbook

215 55,954 9.5 MDX

Examples and guides for using the OpenAI API

I still don't have a really good answer to this question:
If you want to be able to do Q&A against an existing corpus of documentation, can fine-tuning an LLM on that documentation get good results, or is that a waste of time compared to the trick where you search for relevant content and paste that into a prompt along with your question?
I see many people get excited about fine-tuning because they want to solve this problem.
The best answer I've seen so far is in https://github.com/openai/openai-cookbook/blob/main/examples...
> though fine-tuning can feel like the more natural option—training on data is how GPT learned all of its other knowledge, after all—we generally do not recommend it as a way to teach the model knowledge. Fine-tuning is better suited to teaching specialized tasks or styles, and is less reliable for factual recall. [...] In contrast, message inputs are like short-term memory. When you insert knowledge into a message, it’s like taking an exam with open notes. With notes in hand, the model is more likely to arrive at correct answers.

tabnine-intellij

1 501 9.3 Kotlin

Jetbrains IDEs client for TabNine. Compatible with all IntelliJ-based IDEs. https://plugins.jetbrains.com/plugin/12798-tabnine

This is why Tabnine got e super excited, until I ran across the issue where they think their results are better than what the IDE gives you, which is incredibly annoying. https://github.com/codota/tabnine-intellij/issues/18 . I would be happy to pay, but it seems they are convinced their way is best.
I honestly think that if you could have all your private code indexed and accessible, this would be a game changer as it has way better context.

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
nextjs-openai-doc-search

8 1,487 5.9 TypeScript

Template for building your own custom ChatGPT style doc search powered by Next.js, OpenAI, and Supabase.

> the trick where you search for relevant content and paste that into a prompt
Supabase Clippy was the first docs site to ship this experience to production as far as I can tell: https://supabase.com/blog/chatgpt-supabase-docs
I believe they called it "context injection" and I have been following suit in my own writing on the topic.
I am prototyping experiences like Supabase Clippy and am also very interested in fine-tuning for docs Q&A. But my main question is: what exactly would the fine-tuning inputs and outputs look like for docs Q&A?
From my blog:
> AI is all about prediction. Given this temperature, this wind, this day of the year, what is the chance of rain? Temperature, wind, and date are your inputs. Chance of rain is your desired output. Now, try to apply this same type of thinking towards documentation. What are your inputs? What’s your output? The page title and code block could be your inputs. Whether or not the code builds could be your output. Or maybe the code block should be the output? This is why I keep saying that applying fine-tuning to docs is tricky. What are the inputs and outputs?
https://technicalwriting.tools/posts/ten-principles-response...
(I am an AI n00b and have not looked deeply into how fine-tuning works but it's high on my list to experiment with OpenAI's fine-tuning API. Please LMK if I am getting any fundamentals wrong.)

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Best Authentication Library in 2023 ?

3 projects | /r/nextjs | 23 Jun 2023
We made a AI powered assistant using OpenAI, ruby and redis

3 projects | /r/ChatGPTCoding | 11 May 2023
Show HN: Gromit, the OS, AI powered assistant for your website/app

3 projects | news.ycombinator.com | 11 May 2023
Knowledge retrieval architectures for LLMs (2023)

1 project | news.ycombinator.com | 27 Apr 2023
Supabase kit for building ChatGPT apps

1 project | /r/Supabase | 18 Apr 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
AI jetbrains chatgpt jetbrains-plugin Nextjs
Post date: 22 Apr 2023

openai-cookbook

tabnine-intellij

InfluxDB

nextjs-openai-doc-search

Related posts

Best Authentication Library in 2023 ?

We made a AI powered assistant using OpenAI, ruby and redis

Show HN: Gromit, the OS, AI powered assistant for your website/app

Knowledge retrieval architectures for LLMs (2023)

Supabase kit for building ChatGPT apps