Do we have good, GPU-accelerated text-to-speech, speech-to-text, image/video-to-text, and face/object recognition that is open source and self-hosted?

Photonix

54 1,765 0.0 Python

A modern, web-based photo management server. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, face recognition, location awareness, color analysis and other ML algorithms.
CompreFace

28 4,057 7.7 Java

Leading free and open-source face recognition system
whisper

344 60,617 6.4 Python

Robust Speech Recognition via Large-Scale Weak Supervision
whisper.cpp

187 31,426 9.8 C

Port of OpenAI's Whisper model in C/C++
TTS

231 29,420 9.4 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

For text to speech I can recommend this: https://github.com/coqui-ai/TTS
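
As a rough illustration of the recommendation above, here is a minimal sketch of synthesizing speech to a WAV file through the Python API of coqui-ai/TTS. The helper name `synthesize`, the default output path, and the choice of the `tts_models/en/ljspeech/tacotron2-DDC` model are assumptions made for this example, not something stated in the thread. The `gpu` flag reflects the API as it looked around the time of the post; newer releases may prefer moving the model to a device explicitly.

```python
from pathlib import Path

# Assumption: one of the pretrained English models published by the project.
DEFAULT_MODEL = "tts_models/en/ljspeech/tacotron2-DDC"


def synthesize(text: str,
               out_path: str = "speech.wav",
               model_name: str = DEFAULT_MODEL,
               use_gpu: bool = False) -> Path:
    """Write `text` as spoken audio to `out_path` and return the path.

    `synthesize` is a hypothetical helper for this example, not part of
    the TTS package itself.
    """
    if not text.strip():
        raise ValueError("text must not be empty")

    # Imported lazily so the module loads even where TTS is not installed.
    from TTS.api import TTS

    # Loading downloads the model on first use; gpu=True requires CUDA.
    tts = TTS(model_name=model_name, progress_bar=False, gpu=use_gpu)
    tts.tts_to_file(text=text, file_path=out_path)
    return Path(out_path)
```

Calling `synthesize("Hello from a self-hosted server.", use_gpu=True)` should produce `speech.wav` in the working directory, assuming the package is installed (`pip install TTS`) and a CUDA-capable GPU is available.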

faster-whisper

23 8,899 8.1 Python

Faster Whisper transcription with CTranslate2

Or https://github.com/guillaumekln/faster-whisper
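
For the speech-to-text side, a minimal sketch of GPU transcription with faster-whisper might look like the following. The helper names `format_timestamp` and `transcribe`, the `small` model size, and the `float16` compute type are choices made for this example; the thread itself only links the repository.

```python
from typing import Iterator, Tuple


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def transcribe(audio_path: str,
               model_size: str = "small",
               device: str = "cuda",
               compute_type: str = "float16") -> Iterator[Tuple[str, str, str]]:
    """Yield (start, end, text) for each recognized segment.

    `transcribe` is a hypothetical wrapper for this example. Pass
    device="cpu" and compute_type="int8" on machines without a GPU.
    """
    # Imported lazily so the helper above works without the package.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    # transcribe() returns a lazy generator of segments plus audio info.
    segments, _info = model.transcribe(audio_path, beam_size=5)
    for seg in segments:
        yield (format_timestamp(seg.start),
               format_timestamp(seg.end),
               seg.text.strip())
```

Iterating with `for start, end, text in transcribe("audio.mp3"): print(start, end, text)` would print one line per segment. Because the segments are generated lazily, the actual decoding only happens while the loop runs.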

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

[D] What is the most efficient version of OpenAI Whisper?

7 projects | /r/MachineLearning | 12 Jul 2023
Does openai whisper works on termux ?

2 projects | /r/termux | 26 May 2023
Serverless voice chat with Vicuna-13B

9 projects | news.ycombinator.com | 25 Apr 2023
Show HN: Ermine.ai – Record and transcribe speech, 100% client-side (WASM)

5 projects | news.ycombinator.com | 4 Apr 2023
[P] rwkv.cpp: FP16 & INT4 inference on CPU for RWKV language model

10 projects | /r/MachineLearning | 2 Apr 2023

This page summarizes the projects mentioned and recommended in the original post on /r/selfhosted.
Tags: face-recognition, Python, Deep Learning, speech-to-text, Transformer
Post date: 25 Mar 2023