SpeechRecognition
pydub
Metric | SpeechRecognition | pydub
---|---|---
Mentions | 16 | 25
Stars | 7,976 | 8,262
Growth | - | -
Activity | 7.5 | 0.0
Latest commit | about 1 month ago | 4 days ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
SpeechRecognition
-
MacWhisper: Transcribe audio files on your Mac
There is a great library that supports not only OpenAI's Whisper but also many other engines that work offline. https://github.com/Uberi/speech_recognition
-
Nvim-VoiceRec : Add Speech-To-Text To Neovim! (useful for gpt)
It is a Python remote plugin that is a thin wrapper around the speech_recognition package.
-
Build Simple CLI-Based Voice Assistant with PyAudio, Speech Recognition, pyttsx3 and SerpApi
SpeechRecognition
-
BOUNTY OFFERED: Help Me Solve a Linux/AlsaMixer os.system() command issue
I found this github issue from a couple years ago: https://github.com/Uberi/speech_recognition/issues/78
-
Python and Speech recognition
I'm not who you replied to, but I saw the Sphinx integration has a keyword recognizer API: https://github.com/Uberi/speech_recognition/blob/master/examples/special_recognizer_features.py
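A minimal sketch of that keyword feature, assuming PocketSphinx is installed next to speech_recognition. The helper names, the WAV path, and the default sensitivity are hypothetical; `recognize_sphinx` takes `keyword_entries` as (keyword, sensitivity) pairs with sensitivity between 0 and 1.

```python
def build_keyword_entries(words, sensitivity=0.8):
    """Pair each keyword with one sensitivity value (hypothetical helper)."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return [(word, sensitivity) for word in words]


def spot_keywords(wav_path, words):
    """Run offline keyword spotting on a WAV file with the Sphinx backend."""
    # Imported lazily so the helper above works without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_sphinx(
            audio, keyword_entries=build_keyword_entries(words)
        )
    except sr.UnknownValueError:
        return ""  # Sphinx could not recognize anything
```

Higher sensitivity values catch more occurrences of a keyword at the cost of more false positives, so the default here is only a starting point.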
pydub
- Looking for help with a winamp project please.
-
ChatGPT and Whisper APIs
I doubt it will matter if you're breaking up mid sentence if you pass in the previous as a prompt and split words. This is how Whisper does it internally.
It's not absolutely perfect, but splitting on the word boundary is one line of code with the same package in their docs: https://github.com/jiaaro/pydub/blob/master/API.markdown#sil...
25MB is also a lot. That's 30 minutes to an hour on MP3 at reasonable compression. A 2 hour movie would have three splits.
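The arithmetic behind that estimate can be checked with a short sketch. It assumes constant-bitrate MP3 and reads "25MB" as 25 MiB; the function names and the two bitrates are illustrative choices, not figures from the comment.

```python
import math

LIMIT_BYTES = 25 * 1024 * 1024  # assumption: the 25 MB limit read as 25 MiB


def seconds_per_chunk(bitrate_kbps, limit_bytes=LIMIT_BYTES):
    """Seconds of constant-bitrate audio that fit under the size limit."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return limit_bytes / bytes_per_second


def chunks_needed(duration_s, bitrate_kbps, limit_bytes=LIMIT_BYTES):
    """Number of pieces a recording must be cut into to stay under the limit."""
    return math.ceil(duration_s / seconds_per_chunk(bitrate_kbps, limit_bytes))


print(round(seconds_per_chunk(128) / 60, 1))  # minutes per chunk at 128 kbps
print(round(seconds_per_chunk(64) / 60, 1))   # minutes per chunk at 64 kbps
print(chunks_needed(2 * 60 * 60, 64))         # pieces for a 2-hour file at 64 kbps
```

Under these assumptions a chunk holds roughly 27 minutes at 128 kbps and roughly 55 minutes at 64 kbps, and a 2-hour recording at 64 kbps needs three pieces.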
-
FFmpeg 6.0
Even given an option it can be difficult to find the corresponding documentation, if only because of the many different submodules and encoders and decoders and filters that have oh-so-slightly different options. That said, I've just switched from pydub to ffmpeg-python (due to memory issues of the former[1]) and judging from the Jupyter notebook[2] it seems a much more intuitive method of constructing ffmpeg pipelines.
[1] https://github.com/jiaaro/pydub/issues/135
[2] https://github.com/kkroening/ffmpeg-python/tree/master/examp...
-
Download & Trim MP3 from Youtube with Python
With the file downloaded, we're now going to arbitrarily slice it locally (you might have considered whether it is possible to simply download a clip from YouTube; all reliable methods I've found essentially boil down to downloading the whole file and then editing it locally). For that we'll use the pydub library:
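The linked post's own code is not reproduced here. As a hedged sketch of the same idea: pydub slices an `AudioSegment` by milliseconds, so trimming is a slice followed by an export. The function names and timestamp format below are assumptions, and ffmpeg must be available for MP3 decoding and encoding.

```python
def to_ms(timestamp):
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' into milliseconds."""
    total_seconds = 0
    for part in timestamp.split(":"):
        total_seconds = total_seconds * 60 + int(part)
    return total_seconds * 1000


def trim_mp3(src_path, dst_path, start, end):
    """Write the portion of an MP3 between start and end to dst_path."""
    # Imported lazily; pydub relies on ffmpeg to read and write MP3.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(src_path, format="mp3")
    clip = audio[to_ms(start):to_ms(end)]  # slicing is in milliseconds
    clip.export(dst_path, format="mp3")
```

For example, `trim_mp3("full.mp3", "clip.mp3", "1:30", "2:45")` would keep the 75 seconds starting at 1:30; both file names are placeholders.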
-
I made a cross-platform command-line app called maestro to play music!
Uses https://github.com/cheofusi/just_playback to play sound. It's actually surprising how hard it was to find a cross-platform Python module to play sound that doesn't require an external dependency like ffmpeg. Even then, modules like https://github.com/jiaaro/pydub don't support features like seeking/scrubbing, which was a must-have for my project.
- What library can be used to cut a mp3 by a certain time segment
-
Removing sections of video that matches a certain frame.
If there isn't sound during the black screen, you can use pydub's pydub.silence.detect_silence() function to find the start and end of each silent stretch in the video's audio track, then use a video editing module such as MoviePy to remove those sections.
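A rough sketch of that approach, assuming ffmpeg can decode the video's audio track. `detect_silence` returns `[start_ms, end_ms]` pairs; the helper names and the threshold values are hypothetical and would need tuning per recording. The spans to keep are simply the gaps between the silent spans.

```python
def invert_ranges(silent_ranges, total_ms):
    """Turn [start, end] silent spans into the spans that should be kept."""
    keep, cursor = [], 0
    for start, end in sorted(silent_ranges):
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_ms:
        keep.append((cursor, total_ms))
    return keep


def find_keep_ranges(media_path, min_silence_ms=1000, thresh_dbfs=-50):
    """Return the non-silent (start_ms, end_ms) spans of a media file."""
    # Imported lazily; pydub hands decoding off to ffmpeg.
    from pydub import AudioSegment
    from pydub.silence import detect_silence

    audio = AudioSegment.from_file(media_path)
    silent = detect_silence(
        audio, min_silence_len=min_silence_ms, silence_thresh=thresh_dbfs
    )
    return invert_ranges(silent, len(audio))  # len() is the duration in ms
```

The resulting spans, converted to seconds, could then be passed to a video editor to cut and concatenate the matching clips.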
-
Looking for an audio/music lib
At the moment Pydub looks like the closest match, but if it requires an ffmpeg install it's going to be a problem for pyinstaller building .exes/.apps. Same for some PortAudio wrappers I've found.
-
My first impressions of Pedalboard
Oh, there are plenty of people like me with elaborate systems to do manipulation of audio as numpy arrays, but in terms of modules that people use, the top module for audio processing is probably https://github.com/jiaaro/pydub, which has its own interface and then there are a couple of wrappers for ffmpeg and libav, also not numpy.
-
2021 Apr 5 Stickied HELPDESK thread - Boot problems? Display problems? Networking problems? Need ideas? Get help with these and other questions! LOOK HERE FIRST
I'm working on a project that will have the pi download some snippets of audio from dropbox then stitch them together by crossfading them in and out to create a longer track and maybe also add some synth lines then reupload to dropbox. I've looked at some frameworks that could help like the pydub and ffmpeg for audio and dropbox-uploader. Wondering if there are other frameworks out there that might be better suited. Would prefer Java but not opposed to Python. Also if there are any relevant precedents/projects and tutorials please lmk. Thanks!
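For the crossfade step described in that post, a minimal Python sketch with pydub might look like the following. The function names, paths, and 2-second crossfade are assumptions, and the Dropbox transfer is left out. `AudioSegment.append` overlaps the two segments by `crossfade` milliseconds, so each join shortens the total by about that amount.

```python
def crossfaded_length_ms(durations_ms, crossfade_ms):
    """Approximate length of the stitched track; each join overlaps once."""
    if not durations_ms:
        return 0
    return sum(durations_ms) - crossfade_ms * (len(durations_ms) - 1)


def stitch(paths, out_path, crossfade_ms=2000):
    """Crossfade the given audio files in order and export one MP3."""
    # Imported lazily; pydub relies on ffmpeg for most formats.
    from pydub import AudioSegment

    track = AudioSegment.from_file(paths[0])
    for path in paths[1:]:
        snippet = AudioSegment.from_file(path)
        # Each snippet must be at least crossfade_ms long or append() raises.
        track = track.append(snippet, crossfade=crossfade_ms)
    track.export(out_path, format="mp3")
    return len(track)  # duration in milliseconds
```

Three snippets of 10, 8 and 12 seconds with a 2-second crossfade would give a track of about 26 seconds.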
What are some alternatives?
librosa - Python library for audio and music analysis
pyAudioAnalysis - Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
ffmpeg-python - Python bindings for FFmpeg - with complex filtering support
mutagen - Python module for handling audio metadata
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
speech-to-text-websockets-python
aeneas - aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
audioread - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
Watson Developer Cloud Python SDK - Client library to use the IBM Watson services in Python and available in pip as watson-developer-cloud