oneDNN
oneMKL
Metric | oneDNN | oneMKL
---|---|---
Mentions | 5 | 2
Stars | 3,456 | 565
Growth | 2.5% | 3.7%
Activity | 10.0 | 8.5
Last commit | 3 days ago | 7 days ago
Language | C++ | C++
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
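The description above can be illustrated with a small sketch. The site's actual formula is not given here, so everything in it is an assumption: the exponential decay, the 30-day half-life, and the helper names `weighted_commit_score` and `activity` are hypothetical. It only shows how a recency-weighted score can be turned into a relative 0-10 number where 9.0 corresponds to the top 10% of tracked projects.

```python
from bisect import bisect_left


def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights that halve every `half_life_days` (assumed decay)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity(score, all_scores):
    """Map a raw score to a 0-10 scale by its rank among all tracked projects."""
    ranked = sorted(all_scores)
    below = bisect_left(ranked, score)  # number of projects scoring strictly lower
    return round(10.0 * below / len(ranked), 1)


# Hypothetical raw scores for ten tracked projects.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(activity(10, scores))  # 9.0: the best of ten projects is in the top 10%
```

Because the result is a rank rather than a raw count, a project's activity can change even when its own commit rate stays the same, as other tracked projects speed up or slow down.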
oneDNN
- Blaze: A High Performance C++ Math library
If you are talking about non-small matrix multiplication in MKL, it is now open source as part of oneDNN. It has literally the same code as MKL (you can see this by inspecting constants or running high-precision benchmarks).
For small matmul there is libxsmm. It may take tremendous effort to make something faster than oneDNN and libxsmm, because the JIT-based approach of https://github.com/oneapi-src/oneDNN/blob/main/src/gpu/jit/g... is so flexible: if someone finds a better sequence, oneDNN can reuse it without a major change of design.
But MKL is not limited to matmul, I understand it...
- Arc & Deep Learning Frameworks
For completeness, it looks like this question was posted to the oneDNN GitHub repo, and the response was to stay tuned for updates.
- Keeping POWER relevant in the open source world
- Intel oneDNN 2.5 released with experimental RISC-V support
From the release note of oneDNN v2.5:
- Is gpu hardware tied to cpu ISA ?
Intel are trying to support their oneAPI compute framework on Arm, IBM POWER and z/Architecture (s390x), but since they have only ever released a single discrete GPU with the Xe architecture, it's unclear whether they'll support Xe GPU compute on e.g. Arm: https://github.com/oneapi-src/oneDNN
oneMKL
- Stable Diffusion on AMD RDNA™ 3 Architecture
I think there's already been work done to use Intel MKL on any device: https://github.com/oneapi-src/oneMKL
- Developing in heterogeneous environment with the best HPC libraries
What are some alternatives?
CTranslate2 - Fast inference engine for Transformer models
kokkos-kernels - Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels
oneDPL - oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html
peakperf - Achieve peak performance on x86 CPUs and NVIDIA GPUs
highway - Performance-portable, length-agnostic SIMD with runtime dispatch
nekRS - our next generation fast and scalable CFD code
asmjit - Low-latency machine code generation
ArrayFire - ArrayFire: a general purpose GPU library.
librealsense - Intel® RealSense™ SDK
monolish - monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Reloaded-II - Next Generation Universal .NET Core Powered Mod Loader compatible with anything X86, X64.
LSQR-CUDA - This is an LSQR-CUDA implementation written by Lawrence Ayers under the supervision of Stefan Guthe of the GRIS institute at the Technische Universität Darmstadt. The LSQR library was authored by Chris Paige and Michael Saunders.