Sounds like you might be interested in the Tigris preview:
- https://www.tigrisdata.com/
- https://benhoyt.com/writings/flyio-and-tigris/ (discussed here: https://news.ycombinator.com/item?id=39360870)
- https://fly.io/docs/reference/tigris/
As far as I know, Fly uses Firecracker for their VMs. I've been following Firecracker for a while now (even using it in a project), and it doesn't support GPUs out of the box (and the maintainers have no plans to support them [1]).
I'm curious how Fly figured out their own GPU support with Firecracker. In the past they had some very detailed technical posts on how they achieved certain things, so I'm hoping we'll see one on their GPU support in the future!
[1]: https://github.com/firecracker-microvm/firecracker/issues/11...
How difficult would it be to set up Folding@home on these? https://foldingathome.org
Because I have secret magical powers that you probably don't, it's basically free for me. Here's the breakdown though:
The application server uses Deno and Fresh (https://fresh.deno.dev) and requires a shared-1x CPU at 512 MB of ram. That's $3.19 per month as-is. It also uses 2GB of disk volume, which would cost $0.30 per month.
As far as post generation goes: when I first set it up it used GPT-3.5 Turbo to generate prose. That cost me rounding error per month (maybe like $0.05?). At some point I upgraded it to GPT-4 Turbo for free-because-I-got-OpenAI-credits-on-the-drama-day reasons. The prose level increase wasn't significant.
With the GPU it has now, a cold load of the model plus a prose generation run takes about 1.5 minutes. If I didn't have reasons to keep that machine pinned to a GPU (involving other ridiculous ventures), it would probably only need about 5 minutes of GPU time per day (I rounded up to make the math easier) with a 40 GB volume (I now use Nous Hermes Mixtral at Q5_K_M precision, so about 32 GB of weights). That works out to something like $6 per month for the volume, plus 2.5 hours of GPU time, or about $6.25 per month on an L40S.
In total it's probably something like $15.75 per month. That's a fair bit on paper, but I have certain arrangements that make it significantly cheaper for me. I could re-architect Arsène to not have to be online 24/7, but it's frankly not worth it when the big cost is the GPU time and weights volume. I don't know of a way to make that better without sacrificing model quality more than I already have.
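For the curious, here's the arithmetic behind that total as a quick sketch. The per-GB volume rate and per-hour L40S rate are back-derived from the figures I quoted above (2 GB = $0.30, 2.5 h = $6.25), not taken from an official price sheet, so treat them as approximations:

```python
# Rough monthly cost breakdown. Unit prices are inferred from the
# figures quoted in this comment, not from official Fly.io pricing.

VOLUME_PER_GB = 0.30 / 2    # ~$0.15 per GB-month (from 2 GB = $0.30)
GPU_PER_HOUR = 6.25 / 2.5   # ~$2.50 per hour on an L40S (from 2.5 h = $6.25)

app_server = 3.19                          # shared-1x CPU, 512 MB RAM
app_volume = 2 * VOLUME_PER_GB             # 2 GB app disk
weights_volume = 40 * VOLUME_PER_GB        # 40 GB for model weights
gpu_hours = 5 * 30 / 60                    # 5 min/day over a month = 2.5 h
gpu_time = gpu_hours * GPU_PER_HOUR

total = app_server + app_volume + weights_volume + gpu_time
print(f"${total:.2f} per month")  # → $15.74 per month
```

(The $0.05 of GPT API spend is rounding error on top of that.)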
For a shitpost, though, I think it's totally worth paying that much. It's kinda hilarious, and I feel like it makes for a decent display of how bad things could get if we go full "AI replaces writers" like some people seem to want, for some reason I can't even begin to understand.
I still think it's funny that I have to explicitly tell people not to take financial advice from it, because if I didn't, they would.