AI Runner: real time voice to text conversation with PC preview [video]

airunner

Stable Diffusion and LLMs offline on your own hardware

Hi all, I've been working on this application for quite some time. It started out as a Stable Diffusion art app and is now turning into a full-featured AI assistant of sorts.
In the video I talk to the computer using my headphones. The app records my speech, transcribes the audio to text, passes that text to the LLM, which generates a reply, and then uses another model for text-to-speech.
The video also shows me asking for an image, at which point the LLM generates a prompt. Stable Diffusion is then loaded and the prompt is passed to it to generate the image.
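To make the flow above concrete, here is a minimal sketch of one conversation turn. It is not taken from the AI Runner code: every name (`VoicePipeline`, `TurnResult`, `IMAGE_TAG`) is hypothetical, the models are passed in as plain callables, and the idea that the LLM marks an image request with a tag is purely an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: the LLM prefixes its reply with this tag when the user asked
# for an image. The real app may decide this in a completely different way.
IMAGE_TAG = "[IMAGE]"


@dataclass
class TurnResult:
    transcript: str                 # what the speech-to-text model heard
    reply: str                      # raw text produced by the LLM
    speech: Optional[bytes] = None  # synthesized audio, if the reply is spoken
    image: Optional[bytes] = None   # generated image, if one was requested


@dataclass
class VoicePipeline:
    # Each stage is a callable so real models (or test stubs) can be swapped in.
    stt: Callable[[bytes], str]      # audio -> text
    llm: Callable[[str], str]        # text -> reply text
    tts: Callable[[str], bytes]      # reply text -> audio
    txt2img: Callable[[str], bytes]  # image prompt -> image data

    def handle_turn(self, audio: bytes) -> TurnResult:
        transcript = self.stt(audio)
        reply = self.llm(transcript)
        if reply.startswith(IMAGE_TAG):
            # The LLM wrote an image prompt: hand it to the image model.
            prompt = reply[len(IMAGE_TAG):].strip()
            return TurnResult(transcript, reply, image=self.txt2img(prompt))
        # Otherwise speak the reply.
        return TurnResult(transcript, reply, speech=self.tts(reply))


# Stub stages standing in for the real models.
pipeline = VoicePipeline(
    stt=lambda audio: audio.decode("utf-8"),
    llm=lambda text: "[IMAGE] a red fox" if "draw" in text else "Hello!",
    tts=lambda text: text.encode("utf-8"),
    txt2img=lambda prompt: b"PNG:" + prompt.encode("utf-8"),
)
```

Keeping the stages as injected callables means the image model only has to be loaded when the image branch is actually taken, which matches the "Stable Diffusion is loaded" step in the description.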
The models I'm using:
- TTS: SpeechT5
- LLM: Mistral 7B
- Stable Diffusion: Turbo
- STT: whisper-tiny
- Vision: various, still in development
As I mentioned at the end, vision is still in development. I have a working prototype in which an image is captured every second, converted to a text description, and then passed to my chat prompt. It works OK but is often wrong.
The project is open source under GPL-3 and written in Python with PyQt6. You can find it here:
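A rough sketch of how such a loop could be wired up, again with hypothetical names (`VisionContext`, `build_prompt`) and stubbed-in capture and captioning functions rather than the prototype's real code. Keeping only the last few captions is my own assumption, made so the chat prompt does not grow without bound.

```python
import time
from collections import deque
from typing import Callable, Iterable


def build_prompt(user_text: str, captions: Iterable[str]) -> str:
    """Prepend the recent captions to the user's message."""
    seen = "\n".join(f"- {c}" for c in captions)
    return f"What the camera saw recently:\n{seen}\n\nUser: {user_text}"


class VisionContext:
    def __init__(self, capture: Callable[[], object],
                 caption: Callable[[object], str],
                 max_captions: int = 5) -> None:
        self.capture = capture    # grabs one frame, e.g. from a webcam
        self.caption = caption    # image-to-text model
        # Bounded buffer: the oldest captions fall off automatically.
        self.captions = deque(maxlen=max_captions)

    def tick(self) -> None:
        """Capture one frame and store its caption."""
        self.captions.append(self.caption(self.capture()))

    def run(self, ticks: int, interval: float = 1.0,
            sleep: Callable[[float], None] = time.sleep) -> None:
        """Capture `ticks` frames, pausing `interval` seconds after each."""
        for _ in range(ticks):
            self.tick()
            sleep(interval)


# Stubs standing in for the camera and the captioning model.
frames = iter(["frame-1", "frame-2", "frame-3"])
ctx = VisionContext(capture=lambda: next(frames),
                    caption=lambda frame: f"caption of {frame}",
                    max_captions=2)
ctx.run(ticks=3, interval=0.0, sleep=lambda seconds: None)
prompt = build_prompt("What do you see?", ctx.captions)
```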
https://github.com/Capsize-Games/airunner
The compiled stable version is available for download on itch, but it only includes image generation capabilities. Everything else is in the unreleased 3.0.0 version.
https://capsizegames.itch.io/ai-runner
