How can I create a dataset to refine Whisper AI from old videos with subtitles?

This page summarizes the projects mentioned and recommended in the original post on /r/OpenAI.

  • finetuner

    Discontinued. Task-oriented embedding tuning for BERT, CLIP, etc.

    You can try creating your own dataset. Get the audio data you want, preprocess it, and then build a custom dataset you can use to fine-tune. You could also use fine-tuners like these if you want.

  • community-events

    Place where folks can contribute to 🤗 community events

    For the training, I highly recommend checking out the Whisper Fine-Tuning Event. It has a Python script to train in one command, tons of tips, and even a walkthrough video.

  • mimic-recording-studio

    Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice with Mimic2

    I weirdly can't find a great off-the-shelf app for this. I'd love to know if anyone finds one. Most stuff seems to be for recording data for text-to-speech (going the other way). Mimic Recording Studio looks the best. Then there's Speech Training Recorder and TTS Dataset Creator (video). You don't have to worry about audio quality as much as they do.

  • speech-training-recorder

    Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.

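
For the case in the question (old videos that already have subtitles), the "preprocess it, then create a custom dataset" advice could start from the subtitle files themselves. The sketch below is only one possible approach, not something taken from the original post: it assumes the subtitles are in SRT format and that the audio has already been extracted to a file. The names `Segment`, `parse_srt`, and `to_manifest`, the JSON Lines field names, and the 30-second cap (chosen because Whisper works on 30-second audio windows) are all illustrative assumptions.

```python
import json
import re
from dataclasses import dataclass, asdict

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)
TAG = re.compile(r"<[^>]+>")  # inline markup such as <i>...</i>


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    end: float
    text: str


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text, max_len=30.0):
    """Turn SRT subtitle text into a list of timed transcript segments."""
    segments = []
    # Cues are separated by one or more blank lines.
    for cue in re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip()):
        lines = cue.split("\n")
        # Find the timing line; the cue index line before it is optional.
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                break
        else:
            continue  # no timing line in this cue: skip it
        g = match.groups()
        start, end = _seconds(*g[:4]), _seconds(*g[4:])
        # Join multi-line cues, drop markup, and collapse whitespace.
        text = " ".join(TAG.sub("", " ".join(lines[i + 1:])).split())
        # Assumed filter: drop empty, zero-length, or overly long cues.
        if text and 0 < end - start <= max_len:
            segments.append(Segment(start, end, text))
    return segments


def to_manifest(segments, audio_path):
    """Serialize segments as JSON Lines, one training example per line."""
    return "\n".join(
        json.dumps({"audio": audio_path, **asdict(seg)}, ensure_ascii=False)
        for seg in segments
    )
```

Each manifest line pairs an audio file with a time range and its transcript, so a later step can cut the audio at those timestamps. Subtitles are often loosely timed or paraphrased, so it is probably worth spot-checking a sample of segments against the audio before training on them.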
