SpeechRecognition
whisper-diarization
| | SpeechRecognition | whisper-diarization |
|---|---|---|
| Mentions | 16 | 5 |
| Stars | 8,040 | 1,985 |
| Growth | - | - |
| Activity | 8.7 | 7.2 |
| Last commit | 11 days ago | about 2 months ago |
| Language | Python | Jupyter Notebook |
| License | BSD 3-clause "New" or "Revised" License | BSD 2-clause "Simplified" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
SpeechRecognition
- help with script (beginner)
Start and Stop Listening Example
- MacWhisper: Transcribe audio files on your Mac
There is a great library that supports not only OpenAI's Whisper but also many other engines that work offline too. https://github.com/Uberi/speech_recognition
- Unpopular Opinion: a lot of Obsidian community make Obsidian sound like something cringey/productivity guru-y
This is the library: https://github.com/Uberi/speech_recognition
- Nvim-VoiceRec : Add Speech-To-Text To Neovim! (useful for gpt)
It is a Python remote plugin that is a thin wrapper around the speech_recognition package.
- Speech-to-text software
- Voice commands in Doom Eternal possible?
I am less familiar with speech recognition myself. I have implemented something similar many years ago, back when Google had a REST API that allowed you to upload audio and they would respond with the recognized words/sentence. I think they still have the same API available, though. They limited how much you could send, but for voice commands it was pretty solid. However, SpeechRecognition looks like a library worth trying out for this, as that seems like it could do offline processing depending on the underlying library. They also have some examples to look at.
- Build Simple CLI-Based Voice Assistant with PyAudio, Speech Recognition, pyttsx3 and SerpApi
SpeechRecognition
- Need help with speech recognition
- Wiki for the podcast
I found this one here
- How to use my speaker as input and my mic as output?
https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst this might help. I guess your best bet is to rtfm.
whisper-diarization
- MacWhisper: Transcribe audio files on your Mac
https://github.com/MahmoudAshraf97/whisper-diarization
This project has been alright for transcribing audio with speaker diarization. A bit finicky. The OpenAI model is better than other paid products (Descript, Riverside), so I'm looking forward to trying MacWhisper.
- Faster Whisper Transcription with CTranslate2
The project page mentions whisper-diarization (speaker recognition) as a user of faster-whisper. I've been in the market for that, definitely going to try it out.
https://github.com/MahmoudAshraf97/whisper-diarization
- GitHub - MahmoudAshraf97/whisper-diarization: Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
- AI or technique for distinguishing between speakers for podcast?
Here
- Services for transcription
What are some alternatives?
pydub - Manipulate audio with a simple and easy high level interface
faster-whisper - Faster Whisper transcription with CTranslate2
pyAudioAnalysis - Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
whisper-youtube - YouTube Videos Transcription with OpenAI's Whisper
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
aeneas - aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
speechbrain - A PyTorch-based Speech Toolkit
speech-to-text-websockets-python
audapolis - an editor for spoken-word audio with automatic transcription
speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
tinydiarize - Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens