Top 5 gemm Open-Source Projects
-
laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers (by mratsim)
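Project mention: Creating Automatic Subtitles for Videos with Python, Faster-Whisper, FFmpeg, Streamlit, Pillow (original title: "Creando Subtítulos Automáticos para Vídeos con Python, Faster-Whisper, FFmpeg, Streamlit, Pillow") | dev.to | 2024-04-29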
    git clone https://github.com/CNugteren/CLBlast.git
    cd CLBlast
    cmake .
    cmake --build . --config Release
    mkdir install
    cmake --install . --prefix ~/CLBlast/install
    cp libclblast.so* $PREFIX/lib
    cp ./include/clblast.h ../llama.cpp
It depends.
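You need 2 to 3 accumulators to saturate instruction-level parallelism in a sum reduction. The compiler won't introduce them on its own for floats, because it may only reorder the additions when the operation is associative, i.e. (a+b)+c = a+(b+c), which holds for integers but not for floating-point numbers.
The escape hatch is -ffast-math, which allows the compiler to reassociate floating-point operations, at the cost of results that can differ in the last bits.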
I have extensive benches on this here: https://github.com/mratsim/laser/blob/master/benchmarks%2Ffp...
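To make the accumulator point in the comment above concrete, here is a minimal C sketch of a sum reduction written with one accumulator and with three. It is not taken from the laser benchmarks; the function names `sum_naive` and `sum_3acc` are hypothetical and chosen only for illustration. The choice of three accumulators follows the "2 to 3" figure in the comment, and the best count on a given CPU is an assumption to verify by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: a single accumulator. Each addition depends on the previous
   one, so the loop is bound by the latency of a floating-point add. */
float sum_naive(const float *x, size_t n) {
    assert(x != NULL || n == 0);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i];
    }
    return acc;
}

/* Hypothetical illustration: three independent accumulators.
   The three dependency chains can be in flight at the same time.
   This reorders the additions by hand, which is exactly what the
   compiler is not allowed to do for floats without -ffast-math. */
float sum_3acc(const float *x, size_t n) {
    assert(x != NULL || n == 0);
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f;
    size_t i = 0;
    /* Main loop: consume three elements per iteration. */
    for (; i + 3 <= n; i += 3) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
    }
    /* Tail: the 0 to 2 leftover elements. */
    for (; i < n; ++i) {
        a0 += x[i];
    }
    return (a0 + a1) + a2;
}
```

Because the additions happen in a different order, `sum_3acc` may differ from `sum_naive` in the last bits on arbitrary inputs; the two agree exactly only when every intermediate sum is exactly representable.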
Index
What are some of the best open-source gemm projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | CTranslate2 | 2,825 |
2 | how-to-optimize-gemm | 1,618 |
3 | CLBlast | 997 |
4 | blislab | 416 |
5 | laser | 261 |