[D] The best way to train an LLM on company data

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • sketch

    AI code-writing assistant that understands data content

    Take a look at sketch and the LangChain pandas/SQL plugins. I have seen excellent results with both of these approaches. Note that both approaches require you to send metadata to OpenAI.
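The "send metadata, not data" pattern these tools rely on can be sketched as follows. This is a hypothetical illustration only: the `build_schema_prompt` helper and the example schema are made up for the sketch, and this is not sketch's or LangChain's actual API.

```python
# Sketch of the metadata-only pattern used by tools like sketch and the
# LangChain pandas/SQL agents: the LLM sees only the table schema (column
# names and dtypes), then writes code/SQL that runs locally on the real data.

def build_schema_prompt(table_name, columns, question):
    """Build an LLM prompt from table metadata only (no row values)."""
    schema_lines = [f"- {name}: {dtype}" for name, dtype in columns.items()]
    return (
        f"You are given a table named '{table_name}' with columns:\n"
        + "\n".join(schema_lines)
        + f"\n\nWrite pandas code to answer: {question}"
    )

# Hypothetical schema for a sales table; only names/dtypes leave the machine.
columns = {"order_id": "int64", "region": "object", "revenue": "float64"}
prompt = build_schema_prompt("sales", columns, "total revenue per region")
```

The point of the design is that row values never reach the API provider; only the schema and the question do.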

  • llama-peft-tuner

    Tune LLaMA-7B on the Alpaca dataset using PEFT / LoRA. Based on @zphang's https://github.com/zphang/minimal-llama scripts.

  • simple-llm-finetuner

    Discontinued Simple UI for LLM Model Finetuning

    So as far as setup goes, you just need to:

    ```shell
    git clone https://github.com/lxe/simple-llama-finetuner
    cd simple-llama-finetuner
    pip install -r requirements.txt
    python app.py
    # If you're on a remote machine (Paperspace is my go-to), you may need to
    # edit the last line of app.py to set share=True in the launch args.
    ```

  • sidekick

    Discontinued Universal APIs for unstructured data. Sync documents from SaaS tools to a SQL or vector database, where they can be easily queried by AI applications [Moved to: https://github.com/psychic-api/psychic] (by ai-sidekick)

    A project I’m working on helps with ETL for retrieval augmented generation: https://github.com/ai-sidekick/sidekick
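The ETL-for-RAG flow described above (extract documents, chunk them, load them into a queryable store) can be sketched with a toy in-memory store. The fixed-size chunking and keyword-overlap scoring here are deliberate stand-ins for real embeddings and a vector database; none of this is sidekick's actual API.

```python
# Toy ETL pipeline for retrieval-augmented generation:
# extract -> chunk -> index at sync time, then query at answer time.
# Keyword-overlap scoring stands in for embedding similarity.

def chunk(text, size=40):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

class ToyStore:
    """In-memory stand-in for a SQL or vector database."""
    def __init__(self):
        self.chunks = []

    def load(self, docs):
        for doc in docs:
            self.chunks.extend(chunk(doc))

    def query(self, question, k=2):
        q = set(question.lower().split())
        scored = sorted(
            self.chunks,
            key=lambda c: len(q & set(c.lower().split())),
            reverse=True,
        )
        return scored[:k]

store = ToyStore()
store.load(["Refunds are processed within 5 business days after approval.",
            "Our support team is available on weekdays from 9am to 5pm."])
context = store.query("how long do refunds take?")
```

In a real pipeline the chunks would be embedded and upserted into a vector store, and `query` would be a nearest-neighbor search, but the extract/chunk/load/query shape is the same.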

  • azure-search-openai-demo

    A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.

    What some people have done is to use Azure Cognitive Search as a pre-cursor to the LLM.
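That retrieve-then-generate pattern, where search runs first and the top hits are fed to the LLM as grounding context, looks roughly like this. The `search` function below is a keyword-overlap stand-in for an actual Azure Cognitive Search call, and the prompt shape is purely illustrative:

```python
# Retrieval as a precursor to the LLM: run a search query first, then pack
# the top results into the prompt as grounding context for the model.
# `search` stands in for a call to Azure Cognitive Search (or any retriever).

def search(index, query, top=3):
    """Stand-in retriever: rank indexed passages by term overlap."""
    terms = set(query.lower().split())
    return sorted(index,
                  key=lambda p: len(terms & set(p.lower().split())),
                  reverse=True)[:top]

def build_grounded_prompt(passages, question):
    """Assemble numbered sources plus the question into one prompt."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Answer using ONLY the sources below.\n\nSources:\n{context}\n\n"
            f"Question: {question}")

index = ["The parental leave policy grants 16 weeks of paid leave.",
         "Expense reports must be filed within 30 days."]
hits = search(index, "how many weeks of parental leave?")
prompt = build_grounded_prompt(hits, "How many weeks of parental leave?")
```

Because the model only sees retrieved passages, the company data never has to enter the model's weights at all, which is why this is often preferred over fine-tuning for question answering.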

  • lora

    Using Low-rank adaptation to quickly fine-tune diffusion models. (by cloneofsimo)

    It's really not helpful to make strong assertions like this without referring to specific, verifiable sources. Fine-tuning very typically is done in a way where certain layers/parameters of the model are frozen. This is done to avoid the sort of loss we are discussing. The LoRA paper itself states that LoRA "freezes the pre-trained model weights".
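The freezing the comment refers to can be shown with a minimal NumPy sketch of the LoRA update: the effective weight is W + (alpha/r)·B·A, the pre-trained W stays fixed, and only the small low-rank factors A and B change. The shapes and scaling follow the LoRA paper, but the "training step" here is faked for illustration:

```python
import numpy as np

# Minimal LoRA sketch: effective weight is W + (alpha / r) * B @ A.
# W (the pre-trained weight) is frozen; only A and B are trainable.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                # trainable, zero init: delta starts at 0

def forward(x):
    return (W + (alpha / r) * B @ A) @ x

W_before = W.copy()
# Fake "training step": updates touch only A and B, never W.
B = B + 0.1 * rng.normal(size=B.shape)
A = A + 0.1 * rng.normal(size=A.shape)
```

Since only A and B (2·r·d parameters instead of d², here 32 instead of 64) are updated, the frozen base model cannot suffer the catastrophic forgetting being discussed.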

  • xTuring

    Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6

    I'm currently working on an open-source project for building and controlling LLMs: https://github.com/stochasticai/xturing

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts

  • Survey: Training language models to follow instructions with human feedback

    2 projects | dev.to | 20 Aug 2024
  • TypeGPT: Define GPT Output Schemas with Python Classes

    1 project | news.ycombinator.com | 20 Jul 2024
  • Every Way to Get Structured Output from LLMs

    8 projects | news.ycombinator.com | 18 Jun 2024
  • A Comprehensive Guide to the llm-chain Rust crate

    1 project | dev.to | 6 Jun 2024
  • RAG, fine-tuning, API calling and gptscript for Llama 3 running locally

    2 projects | news.ycombinator.com | 24 May 2024
