| | clvk | VkFFT |
|---|---|---|
| Mentions | 4 | 37 |
| Stars | 315 | 1,443 |
| Growth | - | - |
| Activity | 8.8 | 8.1 |
| Last commit | 7 days ago | about 1 month ago |
| Language | C++ | C++ |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
clvk
- LangChain / LlamaCpp on M1 GPU (MPS)?
  I tried something very similar. My goal was to run llama-cpp-python with CLBlast GPU acceleration via clvk on the Vulkan SDK on my M1 Max computer. I downloaded the Vulkan SDK for macOS and cloned clvk (https://github.com/kpet/clvk) and CLBlast. The build was successful, but there is a problem: when clCreateCommandQueue was called with the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE option (in ggml-opencl.c of llama.cpp), an error occurred, and I do not know how to handle it.
- Resources for Vulkan GPGPU searched
- Cross Platform GPU-Capable Framework?
  OpenCL really is your best bet for a cross-platform GPU-capable framework. OpenCL 3.0 cleared out a lot of the cruft from OpenCL 2.x, so it's seeing a lot more adoption. The most cross-platform solution is still OpenCL 1.2, largely because of macOS, but OpenCL 3.0 is becoming more and more common on Windows and Linux and across multiple devices. Even on platforms without native OpenCL support, there are compatibility layers that implement OpenCL on top of DirectX (OpenCLOn12) or Vulkan (clvk and clspv).
- How does GPU programming work?
  What we really need is clvk, but it seems pretty limited. Personally, I'd prefer a Clang-based compiler that can accept OpenCL C++, because a brand new compiler is not ideal.
VkFFT
- [P] - VkFFT now supports quad precision (double-double) FFT computation on GPU
  Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL/Level Zero and Metal. In the latest update, I have added support for quad-precision double-double emulation for FFT calculation on most modern GPUs. I understand that modern ML is going in the opposite, low-precision direction, but I still think that it may be useful to have this functionality, at least for some prototyping and development of concepts.
- VkFFT now supports quad precision (double-double) FFT computation on GPU
- VkFFT: Vulkan/CUDA/Hip/OpenCL/Level Zero/Metal Fast Fourier Transform Library
  Not quite what I asked for, but close enough for now...
  https://github.com/DTolm/VkFFT/discussions/126
- Implementing complex numbers (and FFT) elegantly with just algebraic datatypes (no machine floats)
  Source - I have made a somewhat functional-programming-like FFT library (https://github.com/DTolm/VkFFT/tree/develop) which also operates on abstract data containers. Maybe it will be interesting to you from the algorithmic point of view.
- how does Vulkan compare to CUDA?
  VkFFT is a use case I've heard of where Vulkan compute is faster than its CUDA and OpenCL counterparts: https://github.com/DTolm/VkFFT
- VkFFT now supports Apple Metal API - M1 Pro GPU FFT benchmarking
  Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL and Level Zero. In the latest update, I have added support for the Apple Metal API, which will allow VkFFT to run natively on modern Apple SoCs. I have tested it on a MacBook Pro with an M1 Pro SoC (8-core CPU, 14-core GPU) in single precision on a 1D batched FFT test of all systems from 2 to 4096. Achieved bandwidth is calculated as 2*system size divided by the time taken per FFT, where 2*system size is the minimum memory that has to be transferred between DRAM and GPU.
- Any good compute shader tutorials?
  Another possible project to look at is https://github.com/DTolm/VkFFT
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks: Part 2
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks
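The quad-precision post above mentions "double-double" emulation. As a rough illustration of the general idea (a textbook sketch, not VkFFT's actual implementation), a value can be stored as an unevaluated sum of two doubles, with an error-free addition step recovering the bits that a plain double addition would round away. The names `DoubleDouble`, `two_sum`, and `dd_add` are hypothetical, and the sketch assumes strict IEEE 754 double arithmetic (no `-ffast-math`).

```cpp
// Hypothetical illustration type: the value represented is hi + lo,
// where lo holds the part of the number that does not fit into hi.
struct DoubleDouble {
    double hi;
    double lo;
};

// Knuth's TwoSum: s is the rounded sum of a and b, e is the exact rounding
// error, so that a + b == s + e holds exactly in real arithmetic.
DoubleDouble two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;                      // the part of b that made it into s
    double e = (a - (s - bb)) + (b - bb);   // what was lost to rounding
    return {s, e};
}

// Simple ("sloppy") double-double addition: add the high parts error-free,
// fold in the low parts, then renormalize so that hi carries the leading bits.
DoubleDouble dd_add(DoubleDouble x, DoubleDouble y) {
    DoubleDouble s = two_sum(x.hi, y.hi);
    double lo = s.lo + (x.lo + y.lo);
    double hi = s.hi + lo;
    lo = lo - (hi - s.hi);                  // fast renormalization step
    return {hi, lo};
}
```

For example, adding `1e-20` to `1.0` in plain double arithmetic returns `1.0`, while `dd_add` keeps the small term in `lo`, so subtracting `1.0` again gives back `1e-20`. Multiplication and the FFT butterflies built on top of it need further error-free building blocks that this sketch leaves out.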
What are some alternatives?
clspv - Clspv is a compiler for OpenCL C to Vulkan compute shaders
wgpu - Cross-platform, safe, pure-rust graphics api.
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
alpaka - Abstraction Library for Parallel Kernel Acceleration 🦙
rust-gpu - 🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧
vuh - Vulkan compute for people
cuda-samples - Samples for CUDA Developers which demonstrates features in CUDA Toolkit
GLSL - GLSL Shading Language Issue Tracker
rocFFT - Next generation FFT implementation for ROCm
ocl - OpenCL for Rust
xNVMe - Portable and high-performance libraries and tools for NVMe devices as well as support for traditional/legacy storage devices/interfaces.