| | kamal | LLaMA-Factory |
|---|---|---|
| Mentions | 79 | 6 |
| Stars | 12,390 | 46,721 |
| Growth | 2.1% | 11.0% |
| Activity | 9.6 | 9.9 |
| Last commit | 4 days ago | 6 days ago |
| Language | Ruby | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kamal
- .NET on Heroku Now Generally Available
-
Ask HN: What's the ideal stack for a solo dev in 2025
As it's just you, I'd stick with Ruby on Rails 8[1] since you already know it, and I think it could realistically achieve what you're proposing.
There are lots of libraries for calling out to external AI services, e.g. something like FastMCP[2]. From the sound of it, that's all you need.
I'd use Hotwire[3] for the frontend, and Hotwire Native if you want to roll out an app version quickly. I'd back it with SolidCache, SolidQueue, etc.
I'd use Kamal[4] to run it on cheap hosting, such as something from Hetzner.
1. https://rubyonrails.org/
2. https://github.com/yjacquin/fast-mcp
3. https://hotwired.dev/
4. https://kamal-deploy.org/
-
5 Awesome Railway Alternatives
Kamal, free and open-source from Basecamp, deploys containerized web apps to your servers with Docker. Zero-downtime, works with cheap hosts like DigitalOcean or Hetzner. Deploys in 20 seconds.
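Kamal is driven by a single config file at `config/deploy.yml`; a minimal sketch with hypothetical service name, image, and server IP (none of these values come from the post):

```yaml
# config/deploy.yml -- hypothetical values throughout
service: myapp                # name used for containers on the server
image: myuser/myapp           # image pushed to your container registry
servers:
  - 203.0.113.10              # e.g. a cheap Hetzner or DigitalOcean box
registry:
  username: myuser
  password:
    - KAMAL_REGISTRY_PASSWORD # read from the environment, not committed
proxy:
  host: myapp.example.com     # kamal-proxy routes traffic and health-checks here
```

With something like this in place, `kamal deploy` builds, pushes, and swaps in the new container behind the proxy, which is where the zero-downtime behavior comes from.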
-
30,656 Pages of Books About the .NET Ecosystem: C#, Blazor, ASP.NET, & T-SQL
This book claims to cover continuous delivery, cloud-native applications, and Docker. I've really enjoyed using Kamal, so I'd like to get it working with ASP.NET. Perhaps this book will help me reach that goal.
-
Deploy from local to production (self-hosted)
What’s the advantage of this over Kamal? (https://kamal-deploy.org/)
-
Building a simple URL Shortener with Rails 8: A Step-by-Step Guide
Kamal Deployment Tool
-
Deploy a web app on VPS with Docker
You can read more about Kamal at the official docs. There are many concepts I didn't introduce in this post, for example managing accessories (db/redis/search), managing logging, ...
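For a sense of the workflow described above, the day-to-day commands look roughly like this (a sketch assuming Kamal is installed locally and `config/deploy.yml` already points at your server):

```shell
kamal setup               # first run: installs Docker on the server and deploys
kamal deploy              # build, push, and swap in the new container, zero downtime
kamal app logs            # tail logs from the running app container
kamal rollback <version>  # roll back to a previously deployed image version
```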
- Kamal: Deploy Web Apps Anywhere
- Kamal – Deploy web apps anywhere
-
Render raises $80M in Series C financing
I’ve been pretty vocal about using http://kamal-deploy.org/ more and more, and Render less.
Kamal really is fantastic, PaaS niceties w/o the PaaS tax.
LLaMA-Factory
-
Fine-tune Google's Gemma 3
Take a look at the hardware requirements at https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...
A 'LoRA' is a memory-efficient type of fine tuning that only tunes a small fraction of the LLM's parameters. And 'quantisation' reduces an LLM to, say, 4 bits per parameter. So it's feasible to fine-tune a 7B parameter model at home.
Anything bigger than 7B parameters and you'll want to look at renting GPUs on a platform like Runpod. There are used 4090s selling on eBay right now for $2100, while Runpod will rent you a 4090 for $0.34/hr - you do the math.
It's certainly possible to scale model training to span multiple nodes, but generally scaling through bigger GPUs and more GPUs per machine is easier.
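A quick sanity check on that "you do the math", using only the figures quoted above:

```python
# Break-even between buying a used 4090 (~$2100 on eBay) and
# renting one on Runpod (~$0.34/hr), per the comment above.
buy_price = 2100.00  # USD, used 4090
rent_rate = 0.34     # USD per hour rented

breakeven_hours = buy_price / rent_rate
breakeven_days = breakeven_hours / 24

print(f"{breakeven_hours:.0f} hours (~{breakeven_days:.0f} days of 24/7 renting)")
# → 6176 hours (~257 days of 24/7 renting)
```

Unless you expect to train around the clock for most of a year, renting wins on cost for occasional fine-tuning runs.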
-
ORPO, DPO, and PPO: Optimizing Models for Human Preferences
Implementation: ORPO has been integrated into popular fine-tuning libraries like TRL, Axolotl, and LLaMA-Factory.
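For intuition, ORPO adds an odds-ratio penalty on top of the ordinary SFT loss for the chosen response. A toy sketch of that penalty (variable names are mine, not the API of TRL, Axolotl, or LLaMA-Factory):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def log_odds(logp: float) -> float:
    """Log-odds of a sequence with total log-probability logp (logp < 0)."""
    p = math.exp(logp)
    return math.log(p / (1.0 - p))

def orpo_loss(nll_chosen: float, logp_chosen: float,
              logp_rejected: float, lam: float = 0.1) -> float:
    """SFT loss on the chosen response plus the odds-ratio penalty:
    L = L_SFT + lam * -log(sigmoid(log_odds(chosen) - log_odds(rejected)))."""
    penalty = -math.log(sigmoid(log_odds(logp_chosen) - log_odds(logp_rejected)))
    return nll_chosen + lam * penalty
```

Making the chosen response relatively more likely than the rejected one shrinks the penalty, which is how ORPO folds preference optimization into a single fine-tuning stage.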
- Llama-Factory: A WebUI for Efficient Fine-Tuning of 100 LLMs
- FLaNK-AIM Weekly 06 May 2024
-
Show HN: GPU Prices on eBay
Depends what model you want to train, and how well you want your computer to keep working while you're doing it.
If you're interested in large language models there's a table of vram requirements for fine-tuning at [1] which says you could do the most basic type of fine-tuning on a 7B parameter model with 8GB VRAM.
You'll find that training takes quite a long time, and as a lot of the GPU power is going on training, your computer's responsiveness will suffer - even basic things like scrolling in your web browser or changing tabs uses the GPU, after all.
Spend a bit more and you'll probably have a better time.
[1] https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...
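The kind of back-of-envelope arithmetic behind numbers like that 8GB figure (an assumption-laden sketch, not the table's actual methodology):

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

params_7b = 7e9
print(weight_gb(params_7b, 16))  # fp16 weights: 14.0 GB
print(weight_gb(params_7b, 4))   # 4-bit quantised weights: 3.5 GB
```

At 4 bits the 7B weights take roughly 3.5 GB, leaving room in an 8 GB card for small LoRA adapters, their optimizer states, and activations - which is why the most basic quantised fine-tune of a 7B model lands around that 8GB tier, while full-precision full-parameter training needs far more.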
- FLaNK Weekly 31 December 2023
What are some alternatives?
OpenVoice - Instant voice cloning by MIT and MyShell. Audio foundation model.
kaytu - Kaytu's AI platform boosts cloud efficiency by analyzing historical usage and delivering intelligent recommendations—such as optimizing instance sizes—that maintain reliability. Pay for what you need, without compromising your apps.
coolify - An open-source & self-hostable Heroku / Netlify / Vercel alternative.
efficient-kan - An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
whisper-plus - WhisperPlus: Faster, Smarter, and More Capable 🚀
promptbench - A unified evaluation framework for large language models