-
willow
Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
As soon as they release the API, we can build an AI "bartender". Combine the voice input and output with NeRF talking heads, such as those from Diarupt or https://github.com/harlanhong/awesome-talking-head-generatio....
You will now be able to feed it images and responses of the customers. Give it a function to call, e.g. complementaryDrink(customerId).
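A rough sketch of what that function hook could look like, for anyone curious. Only the name complementaryDrink(customerId) comes from the comment; the in-memory tab, the description text, and the dispatch helper are made up for illustration, and the tool description just follows the JSON-schema shape that OpenAI-style function calling expects. No API call is made here.

```python
import json

# Hypothetical in-memory tab; a real bar would talk to its POS system instead.
FREE_DRINKS: dict[str, int] = {}


def complementaryDrink(customerId: str) -> dict:
    """Record one free drink for the customer (function name taken from the comment)."""
    FREE_DRINKS[customerId] = FREE_DRINKS.get(customerId, 0) + 1
    return {"customerId": customerId, "freeDrinks": FREE_DRINKS[customerId]}


# Tool description in the JSON-schema shape used by OpenAI-style function calling.
# The description string is an assumption, not something from the thread.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "complementaryDrink",
            "description": "Give the customer one drink on the house.",
            "parameters": {
                "type": "object",
                "properties": {"customerId": {"type": "string"}},
                "required": ["customerId"],
            },
        },
    }
]

# Map of function names the model may request to local Python callables.
HANDLERS = {"complementaryDrink": complementaryDrink}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the function the model asked for and return a JSON result string."""
    if name not in HANDLERS:
        return json.dumps({"error": f"unknown function: {name}"})
    args = json.loads(arguments_json)  # the model sends arguments as a JSON string
    return json.dumps(HANDLERS[name](**args))
```

In practice you would pass TOOLS along with the chat request and call dispatch with the name and arguments the model returns, then feed the result string back as the function's output.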
OpenAI's ChatGPT seems to be stuck in a "Look, cool demo" mode.
1. According to the demo, they seem to pair voice input with TTS output. What if I want to use voice to describe a program I want it to write?
2. Furthermore, if you're going to do a voice assistant, why not go all the way with wake words and VAD?
3. Not releasing it to everyone is potentially a way to create a hype cycle before users discover that the multimodality is rather meh.
4. The bike demo could actually use visual feedback to show what it's talking about, à la Segment Anything. It's pretty confusing to get a paragraph of explanation about which tool to pick.
In my project, https://chatcraft.org, we added voice incrementally, so I can swap between typing and voice. We can also combine it with function calling, etc. We also use the OpenAI APIs, except in our case there is no weird waitlist. You pop in your API key and get access to voice input immediately.
Also curious to hear about your setup. Are you using Whisper too? When I was experimenting with it, hallucinations were still a real annoyance, and I was hard-coding rules like "if the last phrase is 'thanks for watching', ignore the last phrase".
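For reference, that kind of hard-coded filter can be sketched in a few lines. This is only an illustration of the workaround described above, not anyone's actual code: the function names are hypothetical, and only the first phrase in the list comes from the comment (the others are assumed examples of the same failure).

```python
import re

# Phrases Whisper tends to emit on silence or noise. Only the first one is from
# the comment above; the rest are assumed examples.
HALLUCINATED_TAILS = (
    "thanks for watching",
    "thank you for watching",
    "please subscribe",
)


def _normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before comparing."""
    stripped = re.sub(r"[^a-z0-9' ]+", "", text.lower())
    return " ".join(stripped.split())


def drop_hallucinated_tail(phrases: list[str]) -> list[str]:
    """Return the phrases with any trailing known-hallucination phrases removed."""
    cleaned = list(phrases)  # copy so the caller's list is left untouched
    while cleaned and _normalize(cleaned[-1]) in HALLUCINATED_TAILS:
        cleaned.pop()
    return cleaned
```

Matching whole phrases only (rather than substrings) keeps a genuine sentence such as "Thanks for watching the demo" from being dropped, at the cost of missing variants that are not in the list.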
I was just googling a bit to see what's out there now for whisper/llama combos and came across this: https://github.com/yacineMTB/talk
There's a demo linked on the GitHub page that seems relatively fast at responding conversationally, though it still takes maybe 1-2 seconds at times. Impressive that it's entirely offline.
Here's a link to a project that claims half-second latency for the transcription part: https://github.com/gaborvecsei/whisper-live-transcription