Show HN: Willow – Open-Source Privacy-Focused Voice Assistant Hardware

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • willow

    Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

  • The Home Assistant project (as part of the "year of voice") is working on wake word, etc for Raspberry Pi from what I understand. However, as someone who's tried to do exactly this on a Raspberry Pi before, I can say that supporting wake word and getting clean audio from 25 feet away with background noise, acoustic echo, etc with a random collection of software and hardware is very challenging. I have an entire graveyard of mic arrays from seeed and others myself :).

    Espressif really did us all a solid with this hardware and their ADF and SR frameworks.

    Whether it's cost, being fully assembled and ready to go, or even wake word, AEC, AGC, BSS, NS, etc, at least as of now the ESP BOX is essentially impossible to compete with in terms of hardware in the open ecosystem.

    I talk about this and more on our wiki pages[0] (check out "Hardware" and "Home Assistant"). In short, the Espressif frameworks we use /technically/ support the "regular" ESP32 but it's so limited (and the ESP BOX/ESP S3 is so cheap) we're not super interested in supporting it.

    We're aiming for an end-user experience that's competitive with Echo, Google Home, etc in every possible way - quality, reliability, functionality, and cost.

    [0] - https://github.com/toverainc/willow/wiki/

  • esp-box

    The ESP-BOX is a new generation AIoT development platform released by Espressif Systems.

  • As the Home Assistant project says, it's the year of voice!

    I love Home Assistant and I've always thought the ESP BOX[0] hardware is cool. I finally got around to starting a project to use the ESP BOX hardware with Home Assistant and other platforms. Why?

    - It's actually "Alexa/Echo competitive". Wake word detection, voice activity detection, echo cancellation, automatic gain control, and high quality audio for $50 means with Willow and the support of Home Assistant there are no compromises on looks, quality, accuracy, speed, and cost.

    - It's cheap. With a touch LCD display, dual microphones, speaker, enclosure, buttons, etc it can be bought today for $50 all-in.

    - It's ready to go. Take it out of the box, flash with Willow, put it somewhere.

    - It's not creepy. Voice is either sent to a self-hosted inference server or commands are recognized locally on the ESP BOX.

    - It doesn't hassle or try to sell you. If I hear "Did you know?" one more time from Alexa I think I'm going to lose it.

    - It's open source.

    - It's capable. This is the first "release" of Willow and I don't think we've even begun scratching the surface of what the hardware and software components are capable of.

    - It can integrate with anything. Simple on-the-wire format - speech output text is sent via HTTP POST to whatever URI you configure. Send it anywhere, and do anything!

    - It still does cool maker stuff. With 16 GPIOs exposed on the back of the enclosure there are all kinds of interesting possibilities.

    This is the first (and VERY early) release but we're really interested to hear what HN thinks!

    [0] - https://github.com/espressif/esp-box
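
The "HTTP POST to whatever URI you configure" integration described in the comment above can be sketched as a small receiver. This is a minimal sketch, not Willow's documented interface: the comment does not specify the payload, so the code assumes the body is either plain UTF-8 text or a JSON object with a hypothetical "text" field. The names extract_text, SpeechTextHandler, and serve are made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_text(body: bytes, content_type: str = "") -> str:
    """Return the recognized speech text from a POST body.

    Assumption: the body is plain UTF-8 text, or a JSON object with a
    hypothetical "text" field. Willow's real format may differ.
    """
    decoded = body.decode("utf-8", errors="replace").strip()
    if "json" in content_type.lower():
        try:
            payload = json.loads(decoded)
        except json.JSONDecodeError:
            return decoded  # fall back to the raw text
        if isinstance(payload, dict) and "text" in payload:
            return str(payload["text"]).strip()
    return decoded


class SpeechTextHandler(BaseHTTPRequestHandler):
    received = []  # commands collected so far, shared across requests

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        text = extract_text(body, self.headers.get("Content-Type", ""))
        self.received.append(text)
        # Acknowledge the request; a real integration would act on `text`.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        pass  # keep the example quiet


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), SpeechTextHandler).serve_forever()
```

Calling serve() and pointing the configured URI at that host and port would collect each recognized phrase in SpeechTextHandler.received.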

  • piper

    A fast, local neural text to speech system (by rhasspy)

  • esp-sr

    Speech recognition

  • For wake word and voice activity detection, audio processing, etc we use the ESP SR (speech recognition) framework from Espressif[0].

    For speech to text there are two options and more to come:

    1) Completely on device command recognition using the ESP SR Multinet 6 model. Willow will (currently) pull your light and switch entities from Home Assistant and generate the grammar and command definition required by Multinet. We want to develop a Willow Home Assistant component that will provide tighter Willow integration with HA and allow users to do this point and click with dynamic updates for new/changed entities, different kinds of entities, etc all in the HA dashboard/config.

    The only "issue" with Multinet is that it only supports 400 defined commands. You're not going to get something like "What's the weather like in $CITY?" out of it.

    For that we have:

    2-?) Our own highly optimized inference server using Whisper, LLaMA/Vicuna, and SpeechT5 from transformers (more to come soon). We're open sourcing it next week. Willow streams audio after wake in realtime, gets the STT output, and sends it wherever you want. With the Willow Home Assistant component (doesn't exist yet) it will sit in between our inference server implementation doing STT/TTS or any other STT/TTS implementation supported by Home Assistant and handle all of this for you.

    [0] - https://github.com/espressif/esp-sr
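
The first option above (building a fixed command list from Home Assistant light and switch entities, within the 400-command limit) can be sketched roughly as follows. This is only an illustration under stated assumptions: the (entity_id, friendly_name) input shape, the "TURN ON/OFF" phrasing, and the letters-only normalization are hypothetical, and the output is not the actual grammar format Multinet expects.

```python
MAX_COMMANDS = 400  # limit mentioned in the comment above


def build_commands(entities, max_commands=MAX_COMMANDS):
    """Generate on/off phrases for light and switch entities.

    `entities` is a list of (entity_id, friendly_name) pairs, which is a
    hypothetical shape; Home Assistant itself returns richer objects.
    Returns a list of (phrase, entity_id, action) tuples.
    """
    commands = []
    seen = set()
    for entity_id, name in entities:
        domain = entity_id.split(".", 1)[0]
        if domain not in ("light", "switch"):
            continue  # only lights and switches, as in the comment
        # Assumed normalization: uppercase letters and single spaces only.
        cleaned = "".join(c if c.isalpha() else " " for c in name.upper())
        cleaned = " ".join(cleaned.split())
        if not cleaned:
            continue
        for action in ("on", "off"):
            phrase = f"TURN {action.upper()} {cleaned}"
            if phrase in seen:
                continue  # skip duplicate phrases from similar names
            if len(commands) >= max_commands:
                return commands  # stop once the limit is reached
            seen.add(phrase)
            commands.append((phrase, entity_id, action))
    return commands
```

Each entity costs two commands here, so a cap of 400 covers at most 200 lights and switches in this scheme.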

  • mycroft-core

    Mycroft Core, the Mycroft Artificial Intelligence platform.

  • This project reminds me of Mycroft: https://github.com/MycroftAI/mycroft-core

  • vosk-api

    Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

  • first, good initiative! thanks for sharing. i think you gotta be more diligent and careful with the problem statement.

    checking the weather in Sofia, Bulgaria requires cloud, current information. it's not "random speech". ESP SR capability issues don't mean that you cannot process it locally.

    the comment was on "voice processing" i.e. sending speech to the cloud, not sending a call request to get the weather information.

    besides local intent detection beyond 400 commands, there are great local STT options, working better than most cloud STTs for "random speech"

    https://github.com/alphacep/vosk-api

  • noise

    Go implementation of the Noise Protocol Framework

  • With regard to this:

    > - On the wire/protocol stuff. We're doing pretty rudimentary "open new connection, stream voice, POST somewhere". This adds extra latency and CPU usage because of repeated TLS handshakes, etc. We have plans to use Websockets and what-not to cut down on this.

    I've recently used the Noise protocol[1] to do some encrypted communication between two services I control but separated by the internet.

    It was surprisingly easy!

    [1]: https://noiseprotocol.org/

  • nixpkgs-esp-dev

    Nix flake and overlay for ESP8266 and ESP32 development.

  • If you are open to Nix, you can try https://github.com/mirrexagon/nixpkgs-esp-dev. I used it for a small project a while ago and the experience was pretty good.

  • esp-web-tools

    Open source tools to allow working with ESP devices in the browser

  • Some feedback to make your project easier to install and integrate better with Home Assistant (I'm the founder):

    Home Assistant is building a voice assistant as part of our Year of the Voice theme. https://www.home-assistant.io/blog/2023/04/27/year-of-the-vo...

    As part of our recent chapter 2 milestone, we introduced new Assist Pipelines. This allows users to configure multiple voice assistants. Your project is using the old "conversation" API. Instead it should use our new assist pipelines API. Docs: https://developers.home-assistant.io/docs/voice/pipelines/

    You can even off-load the STT and TTS fully to Home Assistant and only focus on wake words.

    You will see a much higher adoption rate if users can just buy the ESP BOX and install the software on it without installing/compiling stuff. That's exactly why we created ESP Web Tools. It allows projects to offer browser-based installation directly from their website. https://esphome.github.io/esp-web-tools/

    If you're going the ESP Web Tools route (and you should!), we've also created Improv Wi-Fi, a small protocol to configure Wi-Fi on the ESP device. This will allow ESP Web Tools to offer an onboarding wizard in the browser once the software has been installed. More info at https://www.improv-wifi.com/

    Good luck!
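
For the browser-based installation suggested above, ESP Web Tools reads a JSON manifest describing the firmware builds. The sketch below generates one. The field names reflect my understanding of the manifest format and should be checked against the ESP Web Tools documentation. The project name, version, and firmware path passed in are hypothetical placeholders, and build_manifest and manifest_json are made-up helper names.

```python
import json


def build_manifest(name, version, firmware_path, chip_family="ESP32-S3", offset=0):
    """Return an ESP Web Tools style manifest as a dict.

    Assumptions: one build with a single firmware image flashed at
    `offset`; real projects may need several parts (bootloader,
    partition table, application) at different offsets.
    """
    return {
        "name": name,
        "version": version,
        "new_install_prompt_erase": True,
        "builds": [
            {
                "chipFamily": chip_family,
                "parts": [{"path": firmware_path, "offset": offset}],
            }
        ],
    }


def manifest_json(**kwargs):
    """Serialize the manifest for hosting next to the firmware file."""
    return json.dumps(build_manifest(**kwargs), indent=2)
```

For example, manifest_json(name="Willow", version="0.1", firmware_path="willow.bin") produces the text of a manifest file that an install button on the project website could point at.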

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
