Whisper.api: An open source, self-hosted speech-to-text with fast transcription

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • whisper.api

    This project provides an API with user-level access support for transcribing speech to text using a fine-tuned and processed Whisper ASR model.
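
    To give a feel for what calling such a service might look like, here is a minimal Python sketch. The base URL, endpoint path, header, and form field below are illustrative assumptions rather than whisper.api's documented interface; check the project's README for the actual routes.

    ```python
    # Minimal sketch of calling a self-hosted transcription API over HTTP.
    # NOTE: the URL, endpoint path, header name, and form field are assumptions
    # made for illustration, not whisper.api's documented interface.
    import requests

    API_URL = "http://localhost:8000/api/v1/transcribe"  # hypothetical endpoint
    API_TOKEN = "your-user-token"                        # hypothetical per-user token

    with open("meeting.wav", "rb") as audio_file:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            files={"file": ("meeting.wav", audio_file, "audio/wav")},
        )

    response.raise_for_status()
    print(response.json())  # expected to contain the transcribed text
    ```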

  • wscribe-editor

    Web-based editor for subtitles and transcripts

  • Nice! This will be very useful for me. I think I can run this locally and spin up a basic Telegram bot around it for personal use.

    One issue I've faced with all the Whisper-based transcript generators is that there seems to be no good way to edit/correct the generated text while keeping word-level timestamps. I created a small web-based tool[0] for that.

    If anyone happens to be looking to edit transcripts generated with Whisper, you'll probably find it useful.

    [0] https://github.com/geekodour/wscribe-editor
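
    For context on the word-level timestamp issue, here is a minimal sketch of producing such timestamps and dumping them to JSON for downstream editing, assuming faster-whisper; the model size, device, and output schema are illustrative choices, not something wscribe-editor requires.

    ```python
    # Sketch: generate word-level timestamps and dump them as JSON for an editor.
    # Model size, device, and the output schema are illustrative assumptions.
    import json

    from faster_whisper import WhisperModel

    model = WhisperModel("small", device="cpu", compute_type="int8")
    segments, _info = model.transcribe("talk.wav", word_timestamps=True)

    words = [
        {"word": w.word, "start": round(w.start, 2), "end": round(w.end, 2)}
        for segment in segments
        for w in (segment.words or [])
    ]

    with open("talk.words.json", "w", encoding="utf-8") as f:
        json.dump(words, f, indent=2, ensure_ascii=False)
    ```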

  • faster-whisper

    Faster Whisper transcription with CTranslate2

  • One caveat here is that whisper.cpp does not offer any CUDA support at all; acceleration is only available for Apple Silicon.

    If you have Nvidia hardware, the CTranslate2-based faster-whisper is very, very fast: https://github.com/guillaumekln/faster-whisper
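
    As a rough sketch of what the CTranslate2-backed path looks like on an Nvidia GPU (the model size, compute type, and beam size here are example settings, not requirements):

    ```python
    # Sketch: GPU transcription with faster-whisper (CTranslate2 backend).
    # "large-v2", float16, and beam_size=5 are example settings.
    from faster_whisper import WhisperModel

    model = WhisperModel("large-v2", device="cuda", compute_type="float16")
    segments, info = model.transcribe("audio.mp3", beam_size=5)

    print(f"Detected language {info.language} (p={info.language_probability:.2f})")
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
    ```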

  • willow-inference-server

    Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

  • ctranslate2 is incredible; I don’t know why it doesn’t get more attention.

    We use it for our Willow Inference Server, which has an API that can be used directly (like the OP's project) and supports all Whisper models, TTS, etc.:

    https://github.com/toverainc/willow-inference-server

    The benchmarks are pretty incredible (largely thanks to ctranslate2).
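
    To illustrate why CTranslate2 earns this kind of praise, here is a condensed sketch of running a converted Whisper model through CTranslate2 directly, loosely following its Transformers guide; "whisper-tiny-ct2" is an assumed local path to a converted model, and the feature extraction relies on Hugging Face's WhisperProcessor.

    ```python
    # Sketch: using a CTranslate2-converted Whisper model directly.
    # "whisper-tiny-ct2" is an assumed path to a model converted with
    # ct2-transformers-converter; adjust paths and device to your setup.
    import ctranslate2
    import librosa
    import transformers

    # Load and resample the audio to 16 kHz, then compute log-mel features.
    audio, _ = librosa.load("audio.wav", sr=16000, mono=True)
    processor = transformers.WhisperProcessor.from_pretrained("openai/whisper-tiny")
    inputs = processor(audio, return_tensors="np", sampling_rate=16000)
    features = ctranslate2.StorageView.from_array(inputs.input_features)

    model = ctranslate2.models.Whisper("whisper-tiny-ct2")  # device="cuda" also works

    # Detect the language, then build the decoding prompt.
    language, _prob = model.detect_language(features)[0][0]
    prompt = processor.tokenizer.convert_tokens_to_ids(
        ["<|startoftranscript|>", language, "<|transcribe|>", "<|notimestamps|>"]
    )

    # Transcribe the 30-second window and decode the token ids.
    results = model.generate(features, [prompt])
    print(processor.decode(results[0].sequences_ids[0]))
    ```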

  • whisper-realtime

    Whisper runs in realtime on a laptop GPU (8GB)

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller

    14 projects | news.ycombinator.com | 31 Oct 2023
  • [D] What is the most efficient version of OpenAI Whisper?

    7 projects | /r/MachineLearning | 12 Jul 2023
  • VLLM: 24x faster LLM serving than HuggingFace Transformers

    3 projects | news.ycombinator.com | 20 Jun 2023
  • Show HN: Willow Inference Server: Optimized ASR/TTS/LLM for Willow/WebRTC/REST

    3 projects | news.ycombinator.com | 23 May 2023
  • Easy video transcription and subtitling with Whisper, FFmpeg, and Python

    1 project | news.ycombinator.com | 6 Apr 2024