-
For training: this is an inherently technical process, but studying existing workflows helps pull the pieces together. Here is a good resource I got started with: https://github.com/philschmid/deep-learning-pytorch-huggingface. There are also a ton of resources on GitHub for scripts, walkthroughs, and more (e.g., https://github.com/Lightning-AI/lit-llama/blob/main/howto/train_redpajama.md). A minimal sketch of that kind of workflow follows below.
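To give a concrete feel for those workflows, here is a minimal fine-tuning sketch assuming the Hugging Face transformers and datasets libraries; the model and dataset names are illustrative picks, not taken from the repos above.

```python
# Minimal causal-LM fine-tuning sketch; model/dataset choices are illustrative.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "distilgpt2"  # small model chosen so this runs on modest hardware
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Any text dataset works; a 1% slice of wikitext-2 keeps the demo fast.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.filter(lambda x: len(x["text"]) > 0)  # drop empty lines

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=4,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The linked repos cover the real-world details this skips (mixed precision, PEFT/LoRA, multi-GPU), but the overall shape of the loop is the same.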
-
Judoscale
Save 47% on cloud hosting with autoscaling that just works. Judoscale integrates with Django, FastAPI, Celery, and RQ to make autoscaling easy and reliable. Save big, and say goodbye to request timeouts and backed-up task queues.
-
lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
-
You should also try dabbling in AI art. Full-motion video is becoming increasingly prevalent (albeit a bit rough, as it's still maturing). Stable Diffusion with the Automatic1111 web UI is free. Get to downloading, and try LoRAs with a Stable Diffusion XL checkpoint from Civitai. The future is now, old man.
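If you'd rather script it than click through the Automatic1111 UI, the same idea looks roughly like this with Hugging Face diffusers (a sketch under that assumption; the LoRA path is a hypothetical stand-in for a file downloaded from Civitai):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Base SDXL checkpoint; swap in any SDXL checkpoint you've downloaded.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Layer a LoRA on top of the base checkpoint (path is hypothetical).
pipe.load_lora_weights("./loras/my_style_lora.safetensors")

image = pipe("a retro sci-fi city at dusk, detailed, cinematic").images[0]
image.save("out.png")
```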
-
Like most things in IT and programming, this is, IMO, also best learned by doing rather than by reading a bunch of technical papers. I recommend starting with ChatGPT and testing out a variety of prompts, asking it to fill in your knowledge. There is great documentation from both OpenAI and Microsoft Azure on using these models, with examples. For open-source models (and transformers in general), HuggingFace is a great resource, and you can get started with some smaller models by downloading one and trying it right away. Karpathy's nanoGPT video on YouTube is also very useful for getting started.
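To show how low the barrier really is, here is a minimal "download and try right away" sketch using the transformers pipeline API; distilgpt2 is just one small example model among many:

```python
from transformers import pipeline

# Downloads the model on first run, then generates locally.
generator = pipeline("text-generation", model="distilgpt2")
print(generator("Transformers learn by", max_new_tokens=30)[0]["generated_text"])
```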
-
TBH, for your coding assistant use case I would not start out by training my own model. Check out https://github.com/paul-gauthier/aider - it's fantastic, and it beats most, if not all, commercial coding assistants when it comes to working on an existing code base. It works best with OpenAI models; OSS models are possible but difficult to get working well.
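aider is primarily a CLI tool, but it also has a Python scripting interface; the sketch below is written from memory of that interface, so treat the exact names (Coder, Model, the model string) as assumptions to verify against the aider docs:

```python
# Hypothetical sketch of aider's scripting interface; requires OPENAI_API_KEY
# in the environment. Verify names against the project's docs before relying
# on them.
from aider.coders import Coder
from aider.models import Model

fnames = ["app.py"]  # files to expose to the assistant (placeholder path)

model = Model("gpt-4o")  # works best with OpenAI models, per the note above
coder = Coder.create(main_model=model, fnames=fnames)

# Run a single instruction against the files, then return.
coder.run("add type hints to the public functions")
```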