MosaicML Agrees to Join Databricks to Power Generative AI for All

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • llm-foundry

    LLM training code for Databricks foundation models

  • Yes? Their GitHub repo is under Apache, their base model is under Apache, the training data is not theirs, and they provide scripts showing how to convert it for the pretraining step. They have scripts for pretraining and finetuning as well. Basically for everything.

  • open_llama

    OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

  • Compare it to OpenLLaMA. Its GitHub repo doesn't have a single script showing how to do anything.

  • RedPajama-Data

    The RedPajama-Data repository contains code for preparing large datasets for training large language models.

  • Compare it to RedPajama, which has scripts only for preprocessing.

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
