Finetuning on multiple GPUs

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • bitsandbytes

    Accessible large language models via k-bit quantization for PyTorch.

  • If it also supported QLoRA that would be best, but as far as I know that isn't implemented in bitsandbytes yet?

  • text-generation-webui-testing

    A fork of textgen that still supports V1 GPTQ, 4-bit LoRA, and GPTQ models other than llama.

  • I've never tried that particular one, but everything else I threw at https://github.com/Ph0rk0z/text-generation-webui-testing/ trained successfully.

  • GPTQ-Merged

    Trying to make sense of it

  • You'd probably need to add universal support to the native functions, because it uses llama only. If you edit the load_llama functions in autograd_4bit.py to use generic loading, like this: https://github.com/Ph0rk0z/GPTQ-Merged/blob/dual-model/src/alpaca_lora_4bit/autograd_4bit.py, it has a good chance of working. You might also need to pass trust_remote_code.
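The generic-loading idea in that comment can be sketched as follows. Note that `generic_load_kwargs` is a hypothetical helper for illustration, not a function from the repo:

```python
from typing import Any

def generic_load_kwargs(trust_remote_code: bool = True) -> dict[str, Any]:
    """Keyword arguments for a model-agnostic AutoModelForCausalLM.from_pretrained
    call, replacing a llama-specific loader. (Illustrative; not the repo's API.)"""
    return {
        "device_map": "auto",
        # Needed for models whose architecture code ships with the checkpoint:
        "trust_remote_code": trust_remote_code,
    }

# Intended use (sketch; requires transformers):
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(model_path, **generic_load_kwargs())
```

Swapping a `LlamaForCausalLM`-specific code path for `AutoModelForCausalLM` is what lets non-llama architectures load, and `trust_remote_code=True` covers models whose modeling code lives in the checkpoint repo.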

  • axolotl

    Go ahead and axolotl questions

  • Finetuning on multiple GPUs works pretty much out of the box for every finetune project I've tried. Here's the best finetune codebase I've found that supports QLoRA: https://github.com/OpenAccess-AI-Collective/axolotl
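As an illustration, a QLoRA run in axolotl is driven by a YAML config along these lines. The model, dataset, and hyperparameter values below are placeholders, not a recommended recipe; the repo's examples/ directory has real configs:

```yaml
base_model: NousResearch/Llama-2-7b-hf   # placeholder base model
load_in_4bit: true        # bitsandbytes NF4 quantization of the frozen base
adapter: qlora            # train LoRA adapters on the 4-bit base

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

datasets:
  - path: teknium/GPT4-LLM-Cleaned     # placeholder dataset
    type: alpaca

micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
```

Training is launched with something like `accelerate launch -m axolotl.cli.train config.yml`, and going through accelerate is also how the same config spans multiple GPUs.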

NOTE: The number of mentions on this list reflects mentions in common posts plus user-suggested alternatives, so a higher number means a more popular project.
