I want to fine-tune OpenLLaMA 3B and build something similar to this project, but on top of a Llama model (https://github.com/stephwag/doki-rnn). My GPU isn't very powerful, though: a GTX 1660 with 6 GB of VRAM. I can easily run 13B models in GGML format, but I can't create a LoRA even for a 3B model.

As a first test, I tried to train a small LoRA on 10 letters in the Oobabooga WebUI. I tried loading the model in GPTQ and GGML formats, but only got errors:

- GGML format: "'LlamaCppModel' object has no attribute 'decode'"
- GPTQ-for-LLaMa format with monkey_patch: "NotImplementedError"
- AutoGPTQ format with monkey_patch: "Target module QuantLinear() is not supported"

As I understand it, to create a LoRA in Oobabooga you need to load the model in Transformers format, but I can't load it that way because of an Out Of Memory error. If I load it in 4-bit or 8-bit, I get the error "size mismatch for base_model".