yt-whisper vs gentle

Project          yt-whisper      gentle
Mentions         3               12
Stars            1,316           1,385
Growth           -               0.9%
Activity         0.0             2.3
Latest Commit    4 months ago    10 days ago
Language         Python          Python
License          MIT License     MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

yt-whisper

Posts with mentions or reviews of yt-whisper. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-30.

me in real life
6 projects | /r/Piracy | 30 Dec 2022

#28 transcribe local files
YouTubeTranscript.com
8 projects | news.ycombinator.com | 18 Dec 2022

Or even better, yt-whisper, which uses OpenAI's Whisper speech-to-text. I guess it'd be better to check whether the video already has captions before Whispering, so maybe both your command and this one could be used together.
https://github.com/m1guelpf/yt-whisper
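
The "check for captions first, then fall back to Whisper" idea from the comment above could be sketched roughly as follows. This is only an illustration: `fetch_captions` and `run_whisper` are hypothetical callables supplied by the caller, not functions from yt-whisper or Whisper.

```python
from typing import Callable, Optional, Tuple


def transcribe_with_fallback(
    video_url: str,
    fetch_captions: Callable[[str], Optional[str]],  # hypothetical caption lookup
    run_whisper: Callable[[str], str],               # hypothetical Whisper wrapper
) -> Tuple[str, str]:
    """Return (source, text), preferring existing captions over Whisper."""
    captions = fetch_captions(video_url)
    # Use the captions only if the lookup returned non-empty text.
    if captions and captions.strip():
        return "captions", captions
    # Otherwise pay the cost of running speech-to-text.
    return "whisper", run_whisper(video_url)
```

Passing the two steps in as callables keeps the slow transcription path from running at all when captions already exist.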
[P] Transcribe any podcast episode in just 1 minute with optimized OpenAI/whisper
4 projects | /r/MachineLearning | 6 Nov 2022

With minimal changes to https://github.com/m1guelpf/yt-whisper I got a setup to transcribe subs from YouTube videos or local files, but it might take an hour or so running the large model on my CPU.

gentle

Posts with mentions or reviews of gentle. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-18.

I'm looking for a way to automate an animation based on an audio file, so that it "flipbooks" a character's mouth just by flipping between a handful of frames based on the audio file's volume.
1 project | /r/learnpython | 16 Feb 2023

Gentle actually works on Linux, there's just no pre-built binary: https://github.com/lowerquality/gentle
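
The volume-based "flipbook" from the question needs no alignment tool at all. Here is a minimal sketch, assuming the audio has already been decoded into a sequence of float samples. The function name, the RMS measure, and the scaling to the loudest frame are illustrative choices, not something taken from the thread.

```python
import math
from typing import List, Sequence


def mouth_frames(samples: Sequence[float], sample_rate: int, fps: int, n_frames: int) -> List[int]:
    """Pick one mouth image index per animation frame from the RMS volume."""
    window = max(1, sample_rate // fps)  # audio samples per animation frame
    rms = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        rms.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    peak = max(rms, default=0.0)
    if peak == 0.0:
        return [0] * len(rms)  # silence: mouth stays closed
    # Scale each volume to 0..n_frames-1; index 0 is the closed mouth.
    return [min(n_frames - 1, int(v / peak * n_frames)) for v in rms]
```

Each returned index selects one of the `n_frames` mouth drawings for that animation frame. Louder audio gives a higher index.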
Looking for a tool that can synthesize my own voice in text to speech.
1 project | /r/VocalSynthesis | 25 Jan 2023

I just found Gentle yesterday and it looks like it might be a free tool to do what you’re looking for: https://lowerquality.com/gentle/
YouTubeTranscript.com
8 projects | news.ycombinator.com | 18 Dec 2022

Thank you!
Yes, exactly. We do forced alignment when you edit your transcript. The new words don't have any timestamps, so we need to align them. For short sections we use interpolation. If we need to align whole sections, we use Gentle[^1].
[^1]: https://github.com/lowerquality/gentle
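
The comment does not say how the interpolation for short sections works. One plausible sketch is to spread the new words across the gap between the nearest timestamped neighbours, giving longer words more time. Weighting by character count is an assumption made for illustration, and `interpolate_timestamps` is a hypothetical name.

```python
from typing import List, Tuple


def interpolate_timestamps(words: List[str], gap_start: float, gap_end: float) -> List[Tuple[str, float, float]]:
    """Spread new words over [gap_start, gap_end], weighted by character count."""
    total_chars = sum(len(w) for w in words)
    if total_chars == 0 or gap_end <= gap_start:
        return []  # nothing to place, or no room to place it
    span = gap_end - gap_start
    timed = []
    cursor = gap_start
    for w in words:
        duration = span * len(w) / total_chars  # assumed: time proportional to length
        timed.append((w, round(cursor, 3), round(cursor + duration, 3)))
        cursor += duration
    return timed
```

This is cheap enough for a few edited words. For longer edits the estimates drift from the real speech, which is presumably where a forced aligner such as Gentle takes over.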
[D] Voice recording to phonemes with timestamps? (Colab notebook, or...?)
1 project | /r/MachineLearning | 18 Nov 2022

Gentle also has a web interface, English only. It uses DNN acoustic models.
Could I use modern voice-to-text tools to generate LIP files?
1 project | /r/skyrimmods | 7 Jul 2022

I found this old wiki page about the LIP file format used in Fallout 2: https://falloutmods.fandom.com/wiki/LIP_File_Format. FO2 != SSE, but it seems like a LIP file is primarily: what phonemes to use, and when. That could be pretty easily generated by a modern tool like Gentle (built on Kaldi). The trick would then be to transform Gentle's output to whatever Skyrim expects in a LIP.
The HTML Element
1 project | news.ycombinator.com | 4 Dec 2021

This is neat, and immediately made me think of the annotations that show up when you hit the play button on https://lowerquality.com/gentle/ , but it turns out those are made with absolutely-positioned divs and a lot of offline-precalculated px math.
Automatic lip-sync test, I love how easy it is to hack stuff like this together using the Python API.
1 project | /r/blender | 8 Jun 2021

More info: I used gentle to convert the audio (and a text document with what was said) into a list of phonemes and the times at which they were said. I then used a Python script inside Blender to convert this list into an animation by smoothly gliding to whichever shape key has the current phoneme in its name.
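
The conversion step described above can be approximated outside Blender as a pure function that maps each timed phoneme to a frame number and a set of shape key weights. This is a sketch under assumptions: the input is a list of `(phoneme, seconds)` pairs, the shape key names are made up, and the actual Blender calls that would insert the keyframes are left out.

```python
from typing import Dict, List, Tuple


def phonemes_to_keyframes(
    phonemes: List[Tuple[str, float]],
    shape_keys: List[str],
    fps: int = 24,
) -> List[Tuple[int, Dict[str, float]]]:
    """Turn (phoneme, time_in_seconds) pairs into per-frame shape key weights."""
    keyframes = []
    for phoneme, seconds in phonemes:
        frame = round(seconds * fps)
        # A shape key is active when the phoneme appears in its name.
        weights = {k: 1.0 if phoneme.lower() in k.lower() else 0.0 for k in shape_keys}
        keyframes.append((frame, weights))
    return keyframes
```

If these weights are keyed on the listed frames, the animation software's own interpolation between keyframes provides the smooth glide. Note that the substring match can activate several shape keys at once if their names overlap.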
Can anyone please translate this circular writing? Thanks!
1 project | /r/learn_arabic | 4 May 2021
Any software that can annotate (grapheme/phonogram) in a word with the matching phoneme?
1 project | /r/LanguageTechnology | 3 May 2021

Gentle by lowerquality didn't help? (https://github.com/lowerquality/gentle) It returns time aligned phoneme sequences for each word, like 'ice' -> 'ai': t0, 's': t1. I suppose it doesn't tell you which exact letters are paired, but it matches individual words with phonemes using a set vocabulary, the CMU one. (http://www.speech.cs.cmu.edu/tools/lextool.html)
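
To get from an aligned word to the 'ai': t0, 's': t1 form mentioned above, the per-phone durations can be accumulated from the word's start time. The sketch below assumes a word entry shaped like `{"start": ..., "phones": [{"phone": ..., "duration": ...}]}` with position suffixes such as `_B` on phone labels. Treat those field names as an assumption and check them against the aligner's actual JSON output.

```python
from typing import Any, Dict, List, Tuple


def phone_timeline(word: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (phone, start_time) pairs for one aligned word."""
    timeline = []
    t = word["start"]  # assumed field: word start time in seconds
    for p in word.get("phones", []):
        # Assumed: phone labels carry a position suffix such as "_B"; strip it.
        timeline.append((p["phone"].split("_")[0], round(t, 3)))
        t += p["duration"]
    return timeline
```

As the comment notes, this gives phoneme timing per word but no mapping from individual letters to phonemes.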
Has anyone used aeneas or Festival TTS for word-level forced alignment? Struggling to get accurate results. Does Festival need to be installed?
1 project | /r/speechrecognition | 12 Apr 2021

We’ve had good results with gentle forced alignment. https://github.com/lowerquality/gentle

What are some alternatives?

When comparing yt-whisper and gentle you can also consider the following projects:

subsai - 🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️

audio_alignment - Align various Sanskrit texts and audio

ChatGPT-YouTube-summarizer - This Chrome extension lets you summarize YouTube videos using the ChatGPT.

web-align-audio-text - Ramayana audio/text alignment website

tldwol - Web API that summarizes multimedia from various sources using modern AI tools.

zeroth - Kaldi-based Korean ASR (한국어 음성인식) open-source project

auto-subtitle - Automatically generate and overlay subtitles for any video.

YouWhisper - Convert YouTube videos to text using openai/whisper

malayalam_english_subtitle_generator - Malayalam to English Subtitle Generator for audio files using OpenAI's Whisper.
