-
Presumably that used whisper's bundled tiny model, which is no better than YouTube's CC. A beef I have with whisper.cpp is that it totally outsources model management.
With mlx_whisper, I just have to tell it to use a model and it will download it if it's not already present: https://github.com/llimllib/yt-transcribe/blob/244841f83d833...
So if I add whisper.cpp as a dependency, I also have to add huggingface-cli or something similar just to fetch the models.
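For comparison, the mlx_whisper call is roughly this (a minimal sketch; the input file name and model repo are illustrative, not copied from the linked script):

    import mlx_whisper

    # mlx_whisper fetches the model from Hugging Face on first use
    # if it is not already cached locally -- no separate download step.
    result = mlx_whisper.transcribe(
        "episode.mp3",  # hypothetical input file
        path_or_hf_repo="mlx-community/whisper-large-v3-mlx",
    )
    print(result["text"])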
-
Not as convenient, but you could also have the user install the model manually, like whisper does.
Just forward the error message whisper outputs, or even print a more user-friendly error with instructions on how/where to download the models (see the sketch below).
whisper.cpp does provide a simple bash script to download models: https://github.com/ggerganov/whisper.cpp/blob/master/models/...
(As a Windows user, I can run bash scripts via Git Bash for Windows[1])
[1]: https://git-scm.com/download/win
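A rough sketch of that friendlier error, in Python; the model path assumes whisper.cpp's default models/ layout, and the exact wording is made up here:

    from pathlib import Path
    import sys

    # Assumed default layout: whisper.cpp's download script places
    # ggml models under its models/ directory.
    MODEL_PATH = Path("models/ggml-base.en.bin")

    if not MODEL_PATH.exists():
        sys.exit(
            f"Model not found at {MODEL_PATH}.\n"
            "Download it with whisper.cpp's helper script, e.g.\n"
            "  ./models/download-ggml-model.sh base.en\n"
            "(on Windows, run the script via Git Bash)"
        )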
-
Well, thanks to you I found out whisper can generate decent audio transcriptions with a local model, (relatively) easily, even on my 6+ year-old laptop.
(I used to upload videos to YouTube just to get the auto captions.)
I did some investigation, and it would not be difficult to convert the whisper LRC subtitle output into the format my fork of oTranscribe expects.
I already made a simple tool to convert YouTube TTML/SBV subtitle output: https://github.com/Leftium/otrgen
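For what it's worth, the parsing half is only a few lines of Python. A rough sketch: it only reads LRC cues into (seconds, text) pairs, since the format my oTranscribe fork expects isn't shown here:

    import re

    # LRC cue lines look like "[01:23.45] some text"
    LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\]\s*(.*)")

    def parse_lrc(path):
        """Return a list of (seconds, text) cues from an LRC file."""
        cues = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                m = LRC_LINE.match(line.strip())
                if m:
                    minutes, seconds, text = m.groups()
                    cues.append((int(minutes) * 60 + float(seconds), text))
        return cues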
-