pyAudioAnalysis
SpeechRecognition
| | pyAudioAnalysis | SpeechRecognition |
|---|---|---|
| Mentions | 11 | 16 |
| Stars | 5,668 | 8,040 |
| Growth | - | - |
| Activity | 5.0 | 8.7 |
| Latest commit | 26 days ago | 8 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pyAudioAnalysis
- How would I compare two voice recordings of the same sentence and advise one speaker how to get closer to the second?
I actually came up with an el cheapo version of what I want to accomplish. It isn't perfect, but I can implement it without any research, and it may actually prove useful to language learners. PM me if you're interested in hearing it and critiquing it. I can share here that I'm using this guy's multiple repos, though: https://github.com/tyiannak/pyAudioAnalysis
- How do I run code only when an audio file has bass
- A Python library for audio feature extraction, classification, segmentation and applications
- Phonetic search for audio files
Update: From one researcher to another. I was referred to a Python audio AI project. Once I determine exactly which module to use, it should be smooth sailing. I'll send more updates soon.
- Clustering songs with different lengths
Hey folks, I'm looking into clustering audio files with features extracted by pyAudioAnalysis. However, every feature I'm interested in (MFCCs, spectral centroid and spread, and BPM) is extracted per frame of the song (0.05 s by default), except BPM, which relates to the whole track, so tracks with different lengths produce feature arrays with different shapes.
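One common workaround for the different-shapes problem above is to collapse the per-frame feature matrix into a fixed-length vector with summary statistics (e.g. per-feature mean and standard deviation), so every track yields a vector of the same size regardless of duration. A minimal numpy sketch (the `(n_features, n_frames)` layout is an assumption based on how pyAudioAnalysis returns short-term features; `summarize_features` is a hypothetical helper, not part of the library):

```python
import numpy as np

def summarize_features(frame_features: np.ndarray) -> np.ndarray:
    """Collapse a (n_features, n_frames) matrix of per-frame features
    into a fixed-length vector by taking each feature's mean and
    standard deviation across all frames."""
    means = frame_features.mean(axis=1)
    stds = frame_features.std(axis=1)
    return np.concatenate([means, stds])

# Two "songs" with different numbers of frames but the same feature count:
short_song = np.random.rand(3, 100)   # 3 features, 100 frames
long_song = np.random.rand(3, 400)    # 3 features, 400 frames

# Both summaries have shape (6,), so they can be fed to any clustering
# algorithm (k-means, hierarchical, etc.) side by side.
a = summarize_features(short_song)
b = summarize_features(long_song)
```

Whole-track features like BPM can then simply be appended to this vector, since they are already one number per song.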
- AUDIO ANALYSIS WITH LIBROSA
To learn more about pyAudioAnalysis, here you go.
- Creating Audio Features with PyAudio Analysis
Humans are great at classifying noises. We can hear a chirp and surmise that it belongs to a bird, or hear an abstract noise and classify it as speech with a particular meaning and definition. This relationship between humans and audio classification forms the basis of speech and human communication as a whole. Translating this incredible ability to computers, on the other hand, can be a difficult challenge, to say the least. While we can naturally decompose signals, how do we teach computers to do this, and how do we show which parts of the signal matter and which parts are irrelevant or noisy? This is where pyAudioAnalysis comes in. pyAudioAnalysis is an open-source Python project by Theodoros Giannakopoulos, a Principal Researcher in multimodal machine learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL). The package aims to simplify the feature extraction and classification process by providing a number of helpful tools that can sift through the signal and create relevant features. These features can then be used to train models for classification tasks.
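The frame-based decomposition described above can be illustrated with a self-contained sketch: slide a window over the signal and compute a simple feature per frame. This uses plain numpy and two toy features (energy and zero-crossing rate) rather than the library's real extractors, which compute dozens of features including MFCCs; `short_term_features` is a hypothetical stand-in, not the pyAudioAnalysis API.

```python
import numpy as np

def short_term_features(signal: np.ndarray, frame_len: int, step: int):
    """Slide a window over the signal and compute two simple per-frame
    features: average energy and zero-crossing rate. This mirrors the
    frame-based approach pyAudioAnalysis takes to feature extraction."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, step):
        frame = signal[start:start + frame_len]
        energies.append(np.sum(frame ** 2) / frame_len)
        # Count sign changes between consecutive samples, normalized.
        zcrs.append(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)
    return np.array(energies), np.array(zcrs)

fs = 8000                                  # sample rate in Hz
t = np.arange(fs) / fs                     # one second of audio
tone = np.sin(2 * np.pi * 440 * t)         # a pure 440 Hz "chirp"
energy, zcr = short_term_features(tone, frame_len=400, step=400)
```

With non-overlapping 50 ms frames (400 samples at 8 kHz), one second of audio yields 20 feature values per feature type, which is exactly the kind of per-frame output a classifier would then be trained on.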
- [P] Feature extraction for acoustic signals
This might be relevant, which has a set of feature extraction methods implemented: https://github.com/tyiannak/pyAudioAnalysis/wiki/3.-Feature-Extraction
- Hacker News top posts: Dec 11, 2021
A library for audio feature extraction, regression, classification, segmentation (2 comments)
- Audio feature extraction, classification, segmentation and applications
SpeechRecognition
- Help with script (beginner)
Start and Stop Listening Example
- MacWhisper: Transcribe audio files on your Mac
There is a great library that supports not only OpenAI's Whisper but many other engines that also work offline: https://github.com/Uberi/speech_recognition
- Unpopular Opinion: a lot of Obsidian community make Obsidian sound like something cringey/productivity guru-y
This is the library: https://github.com/Uberi/speech_recognition
- Nvim-VoiceRec: Add Speech-To-Text To Neovim! (useful for gpt)
It is a Python remote plugin that is a thin wrapper around the speech_recognition package.
- Speech-to-text software
- Voice commands in Doom Eternal possible?
I am less familiar with speech recognition myself. I implemented something similar many years ago, back when Google had a REST API that let you upload audio and would respond with the recognized words/sentence. I think they still have the same API available, though. They limited how much you could send, but for voice commands it was pretty solid. However, SpeechRecognition looks like a library worth trying out for this, as it seems able to do offline processing depending on the underlying engine. They also have some examples to look at.
- Build Simple CLI-Based Voice Assistant with PyAudio, Speech Recognition, pyttsx3 and SerpApi
SpeechRecognition
- Need help with speech recognition
- Wiki for the podcast
I found this one here
- How to use my speaker as input and my mic as output?
https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst this might help. I guess your best bet is to rtfm.
What are some alternatives?
librosa - Python library for audio and music analysis
pydub - Manipulate audio with a simple and easy high level interface
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
pyAcoustics - A collection of python scripts for extracting and analyzing acoustics from audio files.
aeneas - aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
mingus - Mingus is a music package for Python
speech-to-text-websockets-python
Watson Developer Cloud Python SDK - :snake: Client library to use the IBM Watson services in Python and available in pip as watson-developer-cloud
speechpy - :speech_balloon: SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/