Databricks Releases 15K Record Training Corpus for Instruction Tuning LLMs

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • dolly

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

  • I'm going through the dataset with your datasette tool and it looks like it might be a good idea to clean things up a bit. There are many duplicates[1], creepypastas[2] and other strange things in there.

    [1] https://lite.datasette.io/?json=https%3A%2F%2Fraw.githubuser...

    [2] https://lite.datasette.io/?json=https://github.com/databrick...

  • DeepSpeedExamples

    Example models using DeepSpeed

  • can you compare your dolly offering with https://github.com/microsoft/DeepSpeedExamples/blob/master/a...

  • ggml

    Tensor library for machine learning

  • it's probably simple for Dolly v1 (?) since it was a fine-tuned version of GPT-J

    https://github.com/ggerganov/ggml/tree/master/examples/gpt-j

    AFAIK there is no .cpp version of Pythia-12B yet

  • LLaMA_MPS

    Run LLaMA inference on Apple Silicon GPUs (discontinued).

  • I saw this: https://github.com/jankais3r/LLaMA_MPS

    it runs slightly slower on the GPU than under llama.cpp but uses much less power doing so

    I would guess the slowness is due to the immaturity of the PyTorch MPS backend. The asitop graphs show it doing a bunch of CPU work alongside the GPU, so it might be inefficiently falling back to the CPU for some ops and swapping layers back and forth (I have no idea, just guessing).
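
The duplicate records mentioned in the dataset comment above could be checked with a few lines of Python. This is only a minimal sketch: the field names `instruction` and `response`, the JSON Lines layout, and the helper names `normalize`, `find_duplicates`, `dedupe`, and `load_jsonl` are assumptions made for illustration, not something stated in the thread.

```python
import json
from collections import Counter

# Assumed field names; adjust to whatever the dataset actually uses.
DEFAULT_KEYS = ("instruction", "response")

def normalize(text):
    # Collapse whitespace and lowercase so trivially different copies compare equal.
    return " ".join(text.split()).lower()

def record_key(record, keys=DEFAULT_KEYS):
    # Build a hashable comparison key from the chosen fields.
    return tuple(normalize(record.get(k, "")) for k in keys)

def find_duplicates(records, keys=DEFAULT_KEYS):
    # Map each comparison key that occurs more than once to its count.
    counts = Counter(record_key(r, keys) for r in records)
    return {key: n for key, n in counts.items() if n > 1}

def dedupe(records, keys=DEFAULT_KEYS):
    # Keep the first occurrence of each record, preserving order.
    seen = set()
    unique = []
    for r in records:
        key = record_key(r, keys)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def load_jsonl(path):
    # Assumes one JSON object per line (hypothetical file layout).
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Exact matching after normalization only catches near-verbatim copies. Paraphrased duplicates, or the "strange things" the commenter mentions, would need manual review or a fuzzier comparison.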

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
