Otter.ai has saved reporters hours transcribing interviews. Caveat emptor

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

  • Silero[0] seems to have decent performance (although you will have to do some minimal coding). I believe there are better ones if you're willing to tinker a bit more.

    [0]: https://github.com/snakers4/silero-models

  • vscode-ltex

    LTeX: Grammar/spell checker for VS Code using LanguageTool with support for LaTeX, Markdown, and others

  • mp4grep

    mp4grep is a CLI for transcribing and searching audio/video files

  • The output is intended more for captioning, so it's lots of short phrases with timestamps and no punctuation, but it'll give you a quick taste of what Vosk can do.

    [1] https://github.com/o-oconnell/mp4grep

  • TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  • The Mozilla DeepSpeech spin-off Coqui has an STT that is locally installable:

    https://coqui.ai/
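
As a rough illustration of the caption-style output described in the mp4grep comment above (short timestamped phrases, no punctuation), here is a minimal Python sketch that groups such phrases into paragraphs by pause length. The `(start, end, phrase)` tuple layout, the `merge_segments` name, and the `max_gap` threshold are all assumptions made for this example; they are not the actual output format or API of mp4grep or Vosk.

```python
from typing import List, Optional, Tuple

# Hypothetical caption-style segment: (start_seconds, end_seconds, phrase).
# This layout is an assumption for illustration, not a real tool's output format.
Segment = Tuple[float, float, str]


def merge_segments(segments: List[Segment], max_gap: float = 1.0) -> List[str]:
    """Group phrases into paragraphs, starting a new paragraph whenever the
    silence between two consecutive phrases exceeds max_gap seconds."""
    paragraphs: List[List[str]] = []
    prev_end: Optional[float] = None
    for start, end, phrase in sorted(segments):
        text = phrase.strip()
        if not text:
            continue  # skip empty phrases
        if prev_end is None or start - prev_end > max_gap:
            paragraphs.append([])  # long pause: begin a new paragraph
        paragraphs[-1].append(text)
        prev_end = end
    return [" ".join(words) for words in paragraphs]


example = [
    (0.0, 1.2, "thanks for joining me"),
    (1.3, 2.8, "let's start with your background"),
    (5.0, 6.1, "sure so i studied"),
    (6.2, 7.4, "journalism in chicago"),
]

for paragraph in merge_segments(example):
    print(paragraph)
```

Splitting on pauses is only a heuristic: it makes raw caption output easier to skim, but it does not restore punctuation or identify speakers.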

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts