Why, in 2022, is there no high quality method for voice control of a PC?

kaldi-active-grammar

10 329 0.0 Python

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

"Everything other than talon has terrible latency": False! I develop kaldi-active-grammar (https://github.com/daanzu/kaldi-active-grammar), a free and open source speech recognition backend, which has extremely low latency. You can adjust how aggressive the VAD (voice activity detection) is to suit your preference, but the speech engine latency can be almost negligible, especially for voice commands (vs prose dictation). However, I agree that "most existing speech recognition engines were not designed with the kind of latency you want for quick one syllable commands", and that low latency is pivotal to being productive with voice commands. I also agree with your other points.
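To make the VAD trade-off mentioned above concrete, here is a minimal sketch of an energy-based voice activity detector with a tunable "hangover" (the number of silent frames tolerated before an utterance is closed). This is not kaldi-active-grammar's implementation or API; the function names, the `threshold` value, and the `hangover_frames` parameter are assumptions made purely for illustration. A smaller hangover is more aggressive: the utterance is finalized sooner (lower latency), at the risk of splitting speech at short pauses.

```python
from typing import List, Sequence, Tuple


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def detect_utterances(
    frames: Sequence[Sequence[float]],
    threshold: float = 0.01,      # hypothetical energy threshold
    hangover_frames: int = 3,     # hypothetical aggressiveness knob
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of detected utterances, end exclusive.

    An utterance is closed once `hangover_frames` consecutive silent
    frames have been seen, so the end-of-speech decision is delayed by
    roughly that many frames.
    """
    utterances: List[Tuple[int, int]] = []
    start = None
    silent_run = 0
    for i, frame in enumerate(frames):
        if frame_energy(frame) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= hangover_frames:
                # End at the first silent frame after the last voiced one.
                utterances.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Audio ended while an utterance was still open.
        utterances.append((start, len(frames) - silent_run))
    return utterances
```

With a short pause in the middle of the audio, a hangover of 3 frames keeps the speech together as one utterance, while a hangover of 1 frame splits it in two but decides earlier. Real engines typically use a trained VAD model rather than raw energy; the latency-versus-robustness trade-off is the same.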

cursorless-talon

1 99 7.7 Python

The cursor never loved you anyway (by cursorless-dev)

nerd-dictation

28 1,158 3.6 Python

Simple, hackable offline speech to text - using the VOSK-API.

I’ve been messing around with https://github.com/ideasman42/nerd-dictation which, with the big model, gives surprisingly accurate local recognition. It's definitely more DIY/hacker-focused than a finished solution, though.

Voice-Recognition-using-Deep-Learning

1 1 10.0 Python

Voice Recognition using Deep Learning

Now top that off with accents, like a Hispanic person, or regional slang. Deep Learning kits like https://github.com/FreddieAbad/Voice-Recognition-using-Deep-... are making headway but are still far from general voice recognition.

cursorless

22 1,069 9.5 TypeScript

Don't let the cursor slow you down

Thanks jiehong!
The reason that the hats are always present is that the way to code faster by voice than by keyboard is to speak fluently, minimising pauses, the way we speak regular human languages. If we had to say a command and then wait for the hats to appear, that would break the chain.
Re mapping, we use something called the "Command server", which allows us to use file-based RPC to run commands in VSCode. That way it is easy to send more complex commands, which are required by Cursorless.
IntelliJ support is definitely one of the most requested features; once I'm done rewriting some of the core engine I'll probably take a swing at that. Here's the issue that tracks extracting cursorless into a node.js server so that it can be used by other editors: https://github.com/pokey/cursorless-vscode/issues/435
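As a rough illustration of the file-based RPC idea mentioned above, the sketch below has a client write a JSON request file into a shared directory, and a server read it, run the named command, and write a JSON response file. This is not the actual Command server protocol: the file names (`request.json`, `response.json`), the JSON fields (`id`, `command`, `args`, `result`, `error`), and all function names are hypothetical and chosen only for this example.

```python
import json
import uuid
from pathlib import Path
from typing import Any, Callable, Dict, List


def _write_atomically(path: Path, payload: Dict[str, Any]) -> None:
    """Write to a temp file, then rename, so readers never see partial JSON."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(path)


def write_request(rpc_dir: Path, command: str, args: List[Any]) -> str:
    """Client side: drop a request file and return its id."""
    request_id = str(uuid.uuid4())
    _write_atomically(
        rpc_dir / "request.json",
        {"id": request_id, "command": command, "args": args},
    )
    return request_id


def serve_once(rpc_dir: Path, handlers: Dict[str, Callable[..., Any]]) -> None:
    """Server side: consume one request and write the matching response."""
    request_path = rpc_dir / "request.json"
    request = json.loads(request_path.read_text())
    request_path.unlink()  # consume the request so it is not run twice
    handler = handlers.get(request["command"])
    if handler is None:
        response = {"id": request["id"], "error": "unknown command"}
    else:
        response = {"id": request["id"], "result": handler(*request["args"])}
    _write_atomically(rpc_dir / "response.json", response)


def read_response(rpc_dir: Path, request_id: str) -> Dict[str, Any]:
    """Client side: read the response and check it answers our request."""
    response = json.loads((rpc_dir / "response.json").read_text())
    if response["id"] != request_id:
        raise ValueError("response does not match request")
    return response
```

Because the payload is an arbitrary JSON document rather than a simulated keystroke, structured arguments of any size can be passed, which fits the stated need to send more complex commands. A real setup would also need some way to tell the server that a request is waiting, which this sketch leaves out.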

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Launch HN: Aqua Voice (YC W24) – Voice-driven text editor
3 projects | news.ycombinator.com | 26 Mar 2024
Cursorless: Voice Coding at the Speed of Thought
1 project | news.ycombinator.com | 31 Jan 2024
Cursorless is alien magic from the future – Xe Iaso
4 projects | news.ycombinator.com | 9 Nov 2023
Best Emacs tools and set ups for RSI…??
1 project | /r/emacs | 30 Sep 2023
Hands-Free Coding (2020)
1 project | news.ycombinator.com | 19 May 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Voice kaldi-asr VSCode speech-recognition vscode-extension
Post date: 28 Jan 2022
