Workers AI: serverless GPU-powered inference on Cloudflare’s global network

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • discourse-ai

  • Embedding cost and model choice make this a very compelling option. I'm working on leveraging embeddings in https://github.com/discourse/discourse-ai, where they power related topics, semantic search, and tag and category recommendations, among other things.

    A cheap offering like this can make it a lot more reasonable for self-hosters.
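    The related-topics and semantic-search features described above boil down to nearest-neighbor search over embedding vectors. A minimal sketch of that idea, using cosine similarity (the topic names and toy 3-dimensional vectors below are hypothetical, not Discourse's actual data; a real embedding model returns vectors of hundreds of dimensions):

    ```python
    import math

    def cosine_similarity(a, b):
        # Cosine of the angle between two vectors: dot product over norms.
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    def related_topics(query_embedding, topic_embeddings, top_k=2):
        # Rank stored topics by similarity to the query and keep the top_k.
        scored = [(topic, cosine_similarity(query_embedding, emb))
                  for topic, emb in topic_embeddings.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return [topic for topic, _ in scored[:top_k]]

    # Hypothetical topic embeddings (real ones come from an embedding model).
    topics = {
        "gpu-inference": [0.9, 0.1, 0.0],
        "baking-bread":  [0.0, 0.2, 0.9],
        "serverless":    [0.8, 0.3, 0.1],
    }
    print(related_topics([1.0, 0.0, 0.0], topics))
    ```

    In production the expensive step is computing the embeddings themselves, which is where a cheap hosted inference offering helps; the similarity search is typically delegated to a vector index rather than the brute-force scan shown here.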

  • whisper-turbo

    Cross-Platform, GPU Accelerated Whisper 🏎️

  • Whisper large is only 1.5B parameters, so why not run it client-side with something like https://github.com/FL33TW00D/whisper-turbo?

    (Disclaimer: I am the author)

  • get-beam

    Run GPU inference and training jobs on serverless infrastructure that scales with you.

  • Serverless only works if the cold boot is fast. For context, my company runs a serverless cloud GPU product called https://beam.cloud, which we've optimized for fast cold starts. We see Whisper cold start in production in under 10s across model sizes. Many of our users are running semi-real-time STT, and this seems to be working well for them.

NOTE: The mention count for each project reflects mentions in common posts plus user-suggested alternatives, so a higher count indicates a more popular project.

