distil-whisper
setfit
distil-whisper | setfit | |
---|---|---|
9 | 13 | |
3,225 | 2,014 | |
7.0% | 5.5% | |
8.9 | 9.2 | |
15 days ago | 20 days ago | |
Python | Jupyter Notebook | |
MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
distil-whisper
- FLaNK Stack 05 Feb 2024
-
Distil-Whisper: a distilled variant of Whisper that is 6x faster
Training code will be released in the Distil-Whisper repository this week, enabling anyone in the community to distill a Whisper model in their choice of language!
- FLaNK Stack Weekly 06 Nov 2023
-
AI — weekly megathread!
Hugging Face released Distil-Whisper, a distilled version of Whisper that is 6 times faster, 49% smaller, and performs within 1% word error rate (WER) on out-of-distribution evaluation sets [Details].
- Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller
- Distil-Whisper is up to 6x faster than Whisper while performing within 1% Word-Error-Rate on out-of-distribution eval sets
-
Distilling Whisper on 20,000 hours of open-sourced audio data
- GitHub page: https://github.com/huggingface/distil-whisper/tree/main
-
Talk-Llama
Is https://github.com/huggingface/distil-whisper on its way to whisper.cpp?
setfit
- FLaNK Stack 05 Feb 2024
- Smarter Summaries with Finetuning GPT-3.5 and Chain of Density
-
[Discussion] Convince me that this training set contamination is fine (or not)
It did, sorry for the hasty edits! I removed that part b/c I realized that there isn't a compelling-enough reason for me to believe that text similarity is clearly inappropriate. In fact, you can train the Pr(condition | chat) classifier I suggested above using similarity training! Use SetFit for that. In the end you'll get a classifier and a similarity model.
-
Ask HN: What's the best framework for text classification (few-shot learning)?
[3] https://github.com/huggingface/setfit
-
Is it worth using LLMs like GPT-3 for text classification?
There's also kinda related approaches like SetFit which calculate embeddings from pretrained transformer models then then fit a classifier on top of the embeddings. I've yet to try it but it supposedly works well with very few labelled examples.
- LLMs for Text Classification (7B parameters)
- GPT-3 vs GPT-Neo / GPT-J for startup classification
-
Ideas on how to improve classification and scoring using Mean Pooled Sentence Embeddings
You could have a look at setfit.
-
SetFit (Sentence Transformer Fine-tuning) - Fewshot Learning without prompts [D]
Found relevant code at https://github.com/huggingface/setfit + all code implementations here
-
Most Popular AI Research Sept 2022 - Ranked Based On Total GitHub Stars
Efficient Few-Shot Learning Without Prompts https://github.com/huggingface/setfit https://arxiv.org/abs/2209.11055v1
What are some alternatives?
WhisperInput - Offline voice input panel & keyboard with punctuation for Android.
iris - Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.
pyvideotrans - Translate the video from one language to another and add dubbing. 将视频从一种语言翻译为另一种语言,并添加配音
whisper - Robust Speech Recognition via Large-Scale Weak Supervision
vid2cleantxt - Python API & command-line tool to easily transcribe speech-based video files into clean text
VToonify - [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
faster-whisper - Faster Whisper transcription with CTranslate2
motion-diffusion-model - The official PyTorch implementation of the paper "Human Motion Diffusion Model"
json-masker - High-performance JSON masker library in Java with no runtime dependencies
git-re-basin - Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries"
willow - Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
storydalle