Do we have good, GPU-accelerated text-to-speech, speech-to-text, and image/video-to-text face/object recognition that is open source and self-hosted?

This page summarizes the projects mentioned and recommended in the original post on /r/selfhosted.

  • Photonix

    A modern, web-based photo management server. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, face recognition, location awareness, color analysis and other ML algorithms.

  • CompreFace

    Leading free and open-source face recognition system

  • whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • TTS

    πŸΈπŸ’¬ - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  • For text to speech I can recommend this: https://github.com/coqui-ai/TTS

  • faster-whisper

    Faster Whisper transcription with CTranslate2

  • Or https://github.com/guillaumekln/faster-whisper
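
As a rough illustration of the GPU-accelerated speech-to-text option above, here is a minimal sketch using faster-whisper's `WhisperModel`. The helper names `format_segments` and `transcribe`, the `"small"` model size, and the `float16` compute type are assumptions chosen for the example, not recommendations from the original post. It assumes the `faster-whisper` package is installed and a CUDA-capable GPU is available.

```python
def format_segments(segments):
    """Render segments (objects with .start, .end, .text) as timestamped lines."""
    return [f"[{s.start:.2f}s -> {s.end:.2f}s] {s.text.strip()}" for s in segments]


def transcribe(audio_path, model_size="small", device="cuda", compute_type="float16"):
    """Transcribe an audio file on the GPU (hypothetical helper, assumed defaults)."""
    # Imported lazily so format_segments can be used without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    # transcribe() returns a lazy iterator of segments plus detection info.
    segments, info = model.transcribe(audio_path, beam_size=5)
    return info.language, format_segments(segments)
```

On a machine without a GPU, passing `device="cpu"` and `compute_type="int8"` should work as well, at lower speed.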

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • [D] What is the most efficient version of OpenAI Whisper?

    7 projects | /r/MachineLearning | 12 Jul 2023
  • Does openai whisper works on termux ?

    2 projects | /r/termux | 26 May 2023
  • Serverless voice chat with Vicuna-13B

    9 projects | news.ycombinator.com | 25 Apr 2023
  • Show HN: Ermine.ai – Record and transcribe speech, 100% client-side (WASM)

    5 projects | news.ycombinator.com | 4 Apr 2023
  • [P] rwkv.cpp: FP16 & INT4 inference on CPU for RWKV language model

    10 projects | /r/MachineLearning | 2 Apr 2023