silero-vad vs subsai

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector (by snakers4)

subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️ (by abdeladim-s)

CLI Webui Whisper whisper-ai Subtitles subtitles-generator

abdeladim-s.github.io

Project          silero-vad     subsai
Mentions         10             3
Stars            2,866          1,068
Growth           -              -
Activity         6.9            8.4
Latest Commit    10 days ago    26 days ago
Language         Python         Python
License          MIT License    GNU General Public License v3.0 only

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
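
The site does not publish the formula behind the activity number, so the following is only a minimal sketch of one way such a score could work. It assumes an exponential decay with a 30-day half-life for the commit weights and a percentile rank scaled to 0-10; the function names, the half-life, and the scaling are all hypothetical.

```python
from bisect import bisect_left


def raw_activity(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights; a commit loses half its weight every half_life_days.

    The half-life is an assumed parameter, not a value taken from the site.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_scores(raw_by_project):
    """Map raw values to a 0-10 scale by percentile rank among tracked projects.

    A score of 9.0 means 90% of the tracked projects have a lower raw value,
    i.e. the project sits in the top 10%.
    """
    ordered = sorted(raw_by_project.values())
    total = len(ordered)
    return {
        name: round(10.0 * bisect_left(ordered, raw) / total, 1)
        for name, raw in raw_by_project.items()
    }
```

Under these assumptions, a commit made today counts as 1.0 and a commit made 30 days ago counts as 0.5, which is what gives recent commits the higher weight.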

silero-vad

Posts with mentions or reviews of silero-vad. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-06.

New models and developer products announced at OpenAI DevDay
8 projects | news.ycombinator.com | 6 Nov 2023

>How do you detect speech starting and stopping?
https://github.com/snakers4/silero-vad

[Discussion] Video Translation Task
2 projects | /r/MachineLearning | 13 Jul 2023

You could look into https://github.com/guillaumekln/faster-whisper, especially the VAD (Voice Activity Detector) section, which uses https://github.com/snakers4/silero-vad

Using Whisper to transcribe the entire Forensic Files series
5 projects | /r/DataHoarder | 4 Jun 2023

I also had the same synchronization issue, so I wrote a WebUI/CLI that uses Silero-VAD to first split the audio whenever there is a silent portion (or every 30 seconds), and I haven't experienced it since.
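
The splitting rule described in this post can be sketched as below. This is not the poster's actual code: it assumes the VAD has already produced a sorted list of non-overlapping speech spans in seconds, and the function name plan_chunks and the max_len parameter are hypothetical.

```python
def plan_chunks(speech, max_len=30.0):
    """Turn VAD speech spans into chunks that are at most max_len seconds long.

    speech: sorted, non-overlapping (start, end) spans in seconds, in the form
    a voice activity detector would report them (assumed input format).
    Silence between spans is dropped, so every silent portion is a split point.
    """
    chunks = []
    for start, end in speech:
        # A span longer than max_len is cut into max_len-sized pieces.
        while end - start > max_len:
            chunks.append((start, start + max_len))
            start += max_len
        if end > start:
            chunks.append((start, end))
    return chunks


if __name__ == "__main__":
    # Two speech spans separated by 1.5 s of silence; the second exceeds 30 s.
    print(plan_chunks([(0.0, 12.5), (14.0, 80.0)]))
```

Each chunk keeps its absolute start time, which is what lets the transcribed text be placed back on the original timeline afterwards.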

Whisper - A new free AI model from OpenAI that can transcribe Japanese (and many other languages) at up to "human level" accuracy
5 projects | /r/LearnJapanese | 22 Sep 2022

By the way, I've updated the WebUI to also support using Silero VAD to break up the audio into distinct sections, run Whisper on each section, and then combine the results into a single transcript/SRT file.
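
The combining step can be sketched as follows. It assumes each section has already been transcribed into (relative start, relative end, text) lines and that the section's offset on the original timeline is known; the names srt_time and merge_to_srt and the input layout are hypothetical, not the WebUI's actual code.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def merge_to_srt(sections):
    """Combine per-section transcripts into one SRT document.

    sections: list of (offset_seconds, lines), where lines is a list of
    (rel_start, rel_end, text) timed relative to the start of that section.
    """
    cues = []
    for offset, lines in sections:
        for rel_start, rel_end, text in lines:
            # Shift section-relative times back onto the original timeline.
            cues.append((offset + rel_start, offset + rel_end, text))
    cues.sort()
    blocks = [
        f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n"
        for index, (start, end, text) in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    print(merge_to_srt([
        (0.0, [(0.5, 2.0, "Hello")]),
        (14.0, [(0.0, 1.5, "World")]),
    ]))
```

Adding the section offset to every cue is the part that keeps the subtitles aligned with the source audio once the silent portions have been cut out.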

[P] A more detailed post about Silero VAD on The Gradient
1 project | /r/MachineLearning | 19 Feb 2022

The VAD is always available on GitHub

Silero VAD: pre-trained enterprise-grade voice activity detector
1 project | news.ycombinator.com | 30 Dec 2021

[P] Silero VAD: One voice detector to rule them all
2 projects | /r/MachineLearning | 18 Dec 2021

I also pinned some interesting comments regarding mobile and IoT usage here: https://github.com/snakers4/silero-vad/issues/37

One voice detector to rule them all
1 project | news.ycombinator.com | 7 Dec 2021

subsai

Posts with mentions or reviews of subsai. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-13.

Porting CP/M to the Brother SuperPowerNote Z80 laptop thing [video]
1 project | news.ycombinator.com | 13 Dec 2023

Adding Whisper subtitles was really easy and they're dramatically better than the automatic Google ones (I did it via https://github.com/abdeladim-s/subsai, which was really easy to use). So there is now a reasonably good transcript available in the video comments.

Using Whisper to transcribe the entire Forensic Files series
5 projects | /r/DataHoarder | 4 Jun 2023

Take a look at https://github.com/abdeladim-s/subsai

Any good FOSS tool for transcribing audio/video locally?
2 projects | /r/opensource | 22 Apr 2023

I found subsai very intuitive. It gives you the option to use Whisper or other implementations of Whisper under the hood.

What are some alternatives?

When comparing silero-vad and subsai you can also consider the following projects:

whisper - Robust Speech Recognition via Large-Scale Weak Supervision

yt-whisper - Using OpenAI's Whisper to automatically generate YouTube subtitles

cheetah - On-device streaming speech-to-text engine powered by deep learning

subgen - Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr

kaldi-active-grammar - Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

ecoute - Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox. It also generates a suggested response using OpenAI's GPT-3.5 for the user to say based on the live transcription of the conversation.

GassistPi - Google Assistant for Single Board Computers

whisper-auto-transcribe - Auto transcribe tool based on whisper

mr-robot - A multi-utility discord bot. Playback hilarious voice tracks on-demand, wiki for anything, turn on/off IoT enabled devices, and more!

auto-subtitle - Automatically generate and overlay subtitles for any video.

hollow-knight-voice-commands - A fun little python tool to play Hollow Knight with only voice commands

mycroft-precise - A lightweight, simple-to-use, RNN wake word listener
