Workers AI: serverless GPU-powered inference on Cloudflare’s global network

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • discourse-ai

    The embedding cost and model choice make this a very compelling option. I'm working on leveraging embeddings in https://github.com/discourse/discourse-ai, where they power related topics, semantic search, and tag and category recommendations, among other things.

    A cheap offering like this can make it a lot more reasonable for self-hosters.
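
    A minimal sketch of the kind of call involved, assuming Cloudflare's documented Workers AI binding and the @cf/baai/bge-base-en-v1.5 embedding model (verify names and response shapes against the current docs); the cosine helper is our own addition, included to show how embeddings feed a related-topics ranking:

      // Embed two strings with Workers AI and compare them.
      export interface Env {
        AI: any; // Workers AI binding configured in wrangler.toml
      }

      // Cosine similarity between two vectors: the usual way to rank
      // "related topics" or semantic-search hits.
      function cosine(a: number[], b: number[]): number {
        let dot = 0, na = 0, nb = 0;
        for (let i = 0; i < a.length; i++) {
          dot += a[i] * b[i];
          na += a[i] * a[i];
          nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
      }

      export default {
        async fetch(_request: Request, env: Env): Promise<Response> {
          const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
            text: [
              "Setting up related topics in Discourse",
              "How do I configure semantic search?",
            ],
          });
          // data holds one embedding vector per input string.
          return Response.json({ similarity: cosine(data[0], data[1]) });
        },
      };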

  • whisper-turbo

    Cross-Platform, GPU Accelerated Whisper 🏎️

    Whisper large is only 1.5B parameters, so why not run it client-side with something like https://github.com/FL33TW00D/whisper-turbo?

    (Disclaimer: I am the author)
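
    For contrast with that client-side route, server-side transcription on Workers AI is only a few lines in a Worker. A sketch assuming the @cf/openai/whisper model and the audio-bytes input shape shown in Cloudflare's examples (check the current docs for the exact format):

      export interface Env {
        AI: any; // Workers AI binding
      }

      export default {
        async fetch(request: Request, env: Env): Promise<Response> {
          // Expect raw audio bytes (e.g. a WAV upload) in the request body.
          const blob = await request.arrayBuffer();
          const result = await env.AI.run("@cf/openai/whisper", {
            audio: [...new Uint8Array(blob)],
          });
          // result.text carries the transcription.
          return Response.json({ text: result.text });
        },
      };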

  • get-beam

    This is the home of the Beam CLI binaries, as well as a collection of example apps built with Beam

    Serverless only works if the cold boot is fast. For context, my company runs a serverless cloud GPU product called https://beam.cloud, which we've optimized for fast cold starts. In production we see Whisper cold-start in under 10 seconds across model sizes. Many of our users run semi-real-time speech-to-text, and it seems to be working well for them.
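
    Cold-start claims like this are easy to sanity-check: time the first request after an idle period against a warm follow-up. A rough probe, with a placeholder URL standing in for whatever inference endpoint you are testing:

      // ENDPOINT is a placeholder, not a real service.
      const ENDPOINT = "https://example.invalid/transcribe";

      async function timeRequest(label: string): Promise<void> {
        const start = Date.now();
        const res = await fetch(ENDPOINT, { method: "POST", body: "ping" });
        await res.arrayBuffer(); // drain the body to measure the full round trip
        console.log(`${label}: ${Date.now() - start} ms (status ${res.status})`);
      }

      // The first call after idle pays the cold boot; the second is warm.
      await timeRequest("cold");
      await timeRequest("warm");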
