Metric | awesome-talking-head-generation | talk
---|---|---
Mentions | 2 | 3
Stars | 1,153 | 557
Growth | - | -
Activity | 6.8 | 8.1
Last commit | 10 days ago | 7 months ago
Language | - | TypeScript
License | - | -
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-talking-head-generation
- Ask HN: How does Heygen AI video generation works?
  I assume it's a SOTA version of "talking head generation" or something related.
  https://paperswithcode.com/task/talking-head-generation
  https://github.com/harlanhong/awesome-talking-head-generatio...
- ChatGPT can now see, hear, and speak – openai.com
  As soon as they release the API, we can build an AI "bartender". Combine the voice output and input with NeRF talking heads such as from Diarupt or https://github.com/harlanhong/awesome-talking-head-generatio....
  You will now be able to feed it images and responses of the customers. Give it a function to call complementaryDrink(customerId).
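The "give it a function to call" idea in the comment above could look roughly like the sketch below. It is only an illustration under stated assumptions: the `ToolCall` shape, the `dispatchToolCall` helper, the `servedDrinks` store, and the body of `complementaryDrink` are all hypothetical, and none of it is taken from an actual vendor API. Only the function name and its `customerId` parameter come from the comment.

```typescript
// Hypothetical shape of a function call emitted by a model (assumption, not a vendor API).
interface ToolCall {
  name: string;
  arguments: string; // assumed to be a JSON-encoded object
}

// Illustrative in-memory record of drinks handed out per customer (assumption).
const servedDrinks: Record<string, number> = {};

// Function named in the comment above; the body is invented for illustration.
function complementaryDrink(customerId: string): string {
  const count = (servedDrinks[customerId] ?? 0) + 1;
  servedDrinks[customerId] = count;
  return `Served complimentary drink #${count} to ${customerId}`;
}

// Route a model-issued call to the local function, rejecting unknown names and malformed arguments.
function dispatchToolCall(call: ToolCall): string {
  if (call.name !== "complementaryDrink") {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  const args: unknown = JSON.parse(call.arguments);
  if (
    typeof args !== "object" ||
    args === null ||
    typeof (args as { customerId?: unknown }).customerId !== "string"
  ) {
    throw new Error("complementaryDrink expects { customerId: string }");
  }
  return complementaryDrink((args as { customerId: string }).customerId);
}
```

Validating the arguments before calling the function matters here because the model's output is untrusted text; the dispatcher refuses anything that is not the one known function with a string `customerId`.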
talk
- ChatGPT can now see, hear, and speak – openai.com
  Also curious to hear about your setup. Using whisper too? When I was experimenting with it there was still a lot of annoyance about hallucinations and I was hard coding some "if last phrase is 'thanks for watching', ignore last phrase"
  I was just googling a bit to see what's out there now for whisper/llama combos and came across this: https://github.com/yacineMTB/talk
  There's a demo linked on the github page that seems relatively fast at responding conversationally, but still maybe 1-2 seconds at times. Impressive it's entirely offline.
- Is anyone doing always-on voice to text with a local llama at home?
- Giving LLM’s a <Backspace> Token
  Here’s a project attempting to do just this!
  https://github.com/yacineMTB/talk
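The hard-coded workaround quoted in the first mention above ("if last phrase is 'thanks for watching', ignore last phrase") might be sketched as follows. This is a minimal illustration, not code from talk or Whisper: the function names and the phrase list are assumptions, and only the "thanks for watching" entry comes from the comment.

```typescript
// Phrases treated as likely transcription hallucinations.
// Only "thanks for watching" comes from the comment; the list itself is an assumption.
const HALLUCINATED_PHRASES: readonly string[] = ["thanks for watching"];

// Lowercase, strip punctuation, and collapse whitespace so that
// "Thanks for watching!" compares equal to "thanks for watching".
function normalize(phrase: string): string {
  return phrase
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the transcript without its last phrase when that phrase matches a known hallucination.
function dropTrailingHallucination(phrases: string[]): string[] {
  if (phrases.length === 0) return phrases;
  const last = normalize(phrases[phrases.length - 1]);
  const isHallucination = HALLUCINATED_PHRASES.some((p) => p === last);
  return isHallucination ? phrases.slice(0, -1) : phrases;
}
```

Only the final phrase is checked, mirroring the heuristic as the commenter describes it; a phrase such as "thanks for watching" spoken mid-transcript is left alone.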
What are some alternatives?
chatcraft.org - Developer-oriented ChatGPT clone
llama_farm - Use local llama LLM or openai to chat, discuss/summarize your documents, youtube videos, and so on.
CVPR2022-DaGAN - Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
willow - Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
whisper-live-transcription - Live-Transcription (STT) with Whisper PoC
nerd-dictation - Simple, hackable offline speech to text - using the VOSK-API.