Whisper vs Cgml

Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model (by Const-me)

Cgml

GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation. (by Const-me)

Project	Whisper	Cgml
Mentions	32	22
Stars	7,182	38
Growth	-	-
Activity	6.5	8.6
Latest Commit	7 months ago	4 months ago
Language	C++	C++
License	Mozilla Public License 2.0	GNU Lesser General Public License v3.0 only

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

Whisper

Posts with mentions or reviews of Whisper. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-17.

Nvidia Speech and Translation AI Models Set Records for Speed and Accuracy
1 project | news.ycombinator.com | 18 Apr 2024

I've been using WhisperDesktop ( https://github.com/Const-me/Whisper ) with great success on a 3090 for fast & accurate transcription of often poor quality euro-english hours long multispeaker audio files. If there's an easy way to compare I'm certainly going to give this a try.
AMD's CDNA 3 Compute Architecture
7 projects | news.ycombinator.com | 17 Dec 2023

Why would you want OpenCL? Pretty sure D3D11 compute shaders gonna be adequate for a Torch backend, and they even work on Linux with Wine: https://github.com/Const-me/Whisper/issues/42 Native Vulkan compute shaders would be even better.
Why would you want unified address space? At least in my experience, it’s often too slow to be useful. DMA transfers (CopyResource in D3D11, copy command queue in D3D12, transfer queue in VK) are implemented by dedicated hardware inside GPUs, and are way more efficient.
Amazon Bedrock Is Now Generally Available
2 projects | news.ycombinator.com | 28 Sep 2023

https://github.com/ggerganov/whisper.cpp
https://github.com/Const-me/Whisper
I had fun with both of these. They will both do realtime transcription. But you will have to download the training data sets…
Why Nvidia Keeps Winning: The Rise of an AI Giant
3 projects | news.ycombinator.com | 6 Jul 2023

Gamers don’t care about FP64 performance, and it seems nVidia is using that for market segmentation. The FP64 performance for RTX 4090 is 1.142 TFlops, for RTX 3090 Ti 0.524 TFlops. AMD doesn’t do that, FP64 performance is consistently better there, and has been this way for quite a few years. For example, the figure for 3090 Ti (a $2000 card from 2022) is similar to Radeon RX Vega 56, a $400 card from 2017 which can do 0.518 TFlops.
And another thing: nVidia forbids usage of GeForce cards in data centers, while AMD allows that. I don’t know how specifically they define datacenter, whether it’s enforceable, or whether it’s tested in courts of various jurisdictions. I just don’t want to find out answers to these questions at the legal expenses of my employer. I believe they would prefer to not cut corners like that.
I think nVidia only beats AMD due to the ecosystem: for GPGPU that’s CUDA (and especially the included first-party libraries like BLAS, FFT, DNN and others), also due to the support in popular libraries like TensorFlow. However, it’s not that hard to ignore the ecosystem, and instead write some compute shaders in HLSL. Here’s a non-trivial open-source project unrelated to CAE, where I managed to do just that with decent results: https://github.com/Const-me/Whisper That software even works on Linux, probably due to Valve’s work on DXVK 2.0 (a compatibility layer which implements D3D11 on top of Vulkan).
Ask HN: What is your recommended speech to text/audio transcription tool?
1 project | news.ycombinator.com | 12 Jun 2023

Currently, I use a GUI for Whisper AI (https://github.com/Const-me/Whisper) to upload MP3s of interviews to get text transcripts. However, I'm hoping to find another tool that would recognize and split out the text per speaker.
Does such a thing exist?
From audio to text, any advice?
1 project | /r/Universitaly | 8 Jun 2023
Ask HN: Any recommendations for cheap, high-quality transcription software
2 projects | news.ycombinator.com | 29 May 2023

I just used Whisper over the weekend to transcribe 5 hours of meeting, worked nicely and it can be run on a single GPU locally. https://github.com/ggerganov/whisper.cpp
There are a few wrappers available with GUI like https://github.com/Const-me/Whisper
Voice recognition software for German
2 projects | /r/software | 20 May 2023
Const-me/Whisper: High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
1 project | /r/thirdbrain | 15 May 2023
I built a massive search engine to find video clips by spoken text
3 projects | /r/videos | 10 May 2023

Cgml

Posts with mentions or reviews of Cgml. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-30.

Asynchronous Programming in C#
9 projects | news.ycombinator.com | 30 Apr 2024

> Meant no offense
None taken.
> computer vision project in C#
Yeah, for CV applications nuget.org is indeed not particularly great. Very few people are using C# for these things, people typically choose something else like Python and OpenCV.
BTW, same applies to ML libraries, most folks are using Python/Torch/CUDA stack. For that hobby project https://github.com/Const-me/Cgml/ I had to re-implement the entire tech stack in C#/C++/HLSL.
Groq CEO: 'We No Longer Sell Hardware'
2 projects | news.ycombinator.com | 7 Apr 2024

> If there is a future with this idea, its gotta be just shipping the LLM with game right?
That might be a nice application for this library of mine: https://github.com/Const-me/Cgml/
That’s an open source Mistral ML model implementation which runs on GPUs (all of them, not just nVidia), takes 4.5GB on disk, uses under 6GB of VRAM, and is optimized for the interactive single-user use case. Probably fast enough for that application.
You wouldn’t want in-game dialogues with the original model though. Game developers would need to finetune, retrain and/or do something else with these weights and/or my implementation.
Ask HN: How to get started with local language models?
6 projects | news.ycombinator.com | 17 Mar 2024

If you just want to run Mistral on Windows, you could try my port: https://github.com/Const-me/Cgml/tree/master/Mistral/Mistral...
The setup is relatively easy: install .NET runtime, download 4.5 GB model file from BitTorrent, unpack a small ZIP file and run the EXE.
OpenAI postmortem – Unexpected responses from ChatGPT
1 project | news.ycombinator.com | 22 Feb 2024

Speaking about random sampling during inference, most ML models are doing it rather inefficiently.
Here’s a better way: https://github.com/Const-me/Cgml/blob/master/Readme.md#rando...
My HLSL is easily portable to CUDA, which has `__syncthreads` and `atomicInc` intrinsics.
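For readers unfamiliar with the step being discussed, here is a minimal CPU-side sketch of what "random sampling during inference" means: picking one token index from a probability distribution by walking the cumulative sum. This is the plain sequential baseline, not the GPU method described in the linked Readme; the function name `sampleToken` and the convention that the caller supplies the uniform random number `u` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the first index whose cumulative probability
// exceeds u. `probs` is assumed to be non-negative and to sum to about 1;
// `u` is a uniform random number in [0, 1) supplied by the caller.
inline std::size_t sampleToken( const std::vector<float>& probs, float u )
{
    assert( !probs.empty() );
    float cumulative = 0.0f;
    for( std::size_t i = 0; i < probs.size(); i++ )
    {
        cumulative += probs[ i ];
        if( u < cumulative )
            return i;
    }
    // Rounding can leave the total slightly below u; fall back to the last token.
    return probs.size() - 1;
}
```

The loop is linear in the vocabulary size and inherently serial, which is presumably why a GPU version would lean on synchronization and atomic intrinsics like the ones mentioned above.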
Nvidia's Chat with RTX is a promising AI chatbot that runs locally on your PC
7 projects | news.ycombinator.com | 13 Feb 2024
AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source
23 projects | news.ycombinator.com | 12 Feb 2024

I did a few times with Direct3D 11 compute shaders. Here’s an open-source example: https://github.com/Const-me/Cgml
Pretty sure Vulkan gonna work equally well, at the very least there’s an open source DXVK project which implements D3D11 on top of Vulkan.
Brave Leo now uses Mixtral 8x7B as default
7 projects | news.ycombinator.com | 27 Jan 2024

Here’s an example of a custom 4 bits/weight codec for ML weights:
https://github.com/Const-me/Cgml/blob/master/Readme.md#bcml1...
llama.cpp does it slightly differently but still, AFAIK their quantized data formats are conceptually similar to my codec.
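To make the idea of a 4 bits/weight codec concrete, here is a minimal C++ sketch of block-wise quantization. It is a generic illustration, not the format from the linked Readme and not any llama.cpp format: the block size of 32, the per-block `scale` and `minimum` header, and the names `Block4`, `quantizeBlock`, and `dequantizeBlock` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 32 weights share one scale and one minimum,
// and each weight is stored as a 4-bit index, two indices per byte.
constexpr std::size_t kBlockSize = 32;

struct Block4
{
    float scale;   // step between adjacent quantization levels
    float minimum; // value represented by index 0
    std::array<std::uint8_t, kBlockSize / 2> packed;
};

// Maps one weight to the nearest of the 16 levels in the block.
inline std::uint8_t quantizeOne( float x, float minimum, float invScale )
{
    const long v = std::lround( ( x - minimum ) * invScale );
    return static_cast<std::uint8_t>( std::min( std::max( v, 0L ), 15L ) );
}

inline Block4 quantizeBlock( const std::array<float, kBlockSize>& w )
{
    const auto mm = std::minmax_element( w.begin(), w.end() );
    Block4 b{};
    b.minimum = *mm.first;
    b.scale = ( *mm.second - *mm.first ) / 15.0f;
    // A constant block has zero scale; every index then becomes 0.
    const float inv = ( b.scale > 0.0f ) ? 1.0f / b.scale : 0.0f;
    for( std::size_t i = 0; i < kBlockSize; i += 2 )
    {
        const std::uint8_t lowNibble = quantizeOne( w[ i ], b.minimum, inv );
        const std::uint8_t highNibble = quantizeOne( w[ i + 1 ], b.minimum, inv );
        b.packed[ i / 2 ] = static_cast<std::uint8_t>( lowNibble | ( highNibble << 4 ) );
    }
    return b;
}

inline std::array<float, kBlockSize> dequantizeBlock( const Block4& b )
{
    std::array<float, kBlockSize> w{};
    for( std::size_t i = 0; i < kBlockSize; i += 2 )
    {
        const std::uint8_t byte = b.packed[ i / 2 ];
        w[ i ] = b.minimum + b.scale * static_cast<float>( byte & 0x0F );
        w[ i + 1 ] = b.minimum + b.scale * static_cast<float>( byte >> 4 );
    }
    return w;
}
```

With two 32-bit floats in the header, this sketch actually spends 6 bits per weight (24 bytes per 32 weights). A practical codec would shrink the header, for example by storing it in half precision, to get closer to 4 bits per weight.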
Efficient LLM inference solution on Intel GPU
3 projects | news.ycombinator.com | 20 Jan 2024
Vcc – The Vulkan Clang Compiler
9 projects | news.ycombinator.com | 9 Jan 2024

> the API was high-friction due to the shader language, and the glue between shader and CPU
Direct3D 11 compute shaders share these things with Vulkan, yet D3D11 is relatively easy to use. For example, see that library which implements ML-targeted compute shaders for C# with minimal friction: https://github.com/Const-me/Cgml The backend implemented in C++ is rather simple, just binds resources and dispatches these shaders.
I think the main usability issue with Vulkan is API design. Vulkan was only designed with AAA game engines in mind. The developers of these game engines have borderline unlimited budgets, and their requirements are very different from ordinary folks who want to leverage GPU hardware.
I made an app that runs Mistral 7B 0.2 LLM locally on iPhone Pros
12 projects | news.ycombinator.com | 7 Jan 2024

Minor update https://github.com/Const-me/Cgml/releases/tag/1.1a Can’t edit that comment anymore, too late.

What are some alternatives?

When comparing Whisper and Cgml you can also consider the following projects:

whisper.cpp - Port of OpenAI's Whisper model in C/C++

PowerInfer - High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

whisper - Robust Speech Recognition via Large-Scale Weak Supervision

ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.

TransformerEngine - A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

mlx - MLX: An array framework for Apple silicon

just-an-email - App to share files & texts between your devices without installing anything

EmotiVoice - EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

ggml - Tensor library for machine learning

llamafile - Distribute and run LLMs with a single file.

beaker - An experimental peer-to-peer Web browser

clspv - Clspv is a compiler for OpenCL C to Vulkan compute shaders