VkFFT vs clspv
Metric | VkFFT | clspv
---|---|---
Mentions | 37 | 8
Stars | 1,432 | 568
Growth | - | 1.9%
Activity | 8.1 | 9.0
Latest commit | 6 days ago | 5 days ago
Language | C++ | LLVM
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
VkFFT
- VkFFT: Vulkan/CUDA/Hip/OpenCL/Level Zero/Metal Fast Fourier Transform Library
Not quite what I asked for, but close enough for now...
- VkFFT now supports Apple Metal API - M1 Pro GPU FFT benchmarking
Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL and Level Zero. In the latest update, I have added support for the Apple Metal API, which allows VkFFT to run natively on modern Apple SoCs. I tested it on a MacBook Pro with an M1 Pro SoC (8-core CPU, 14-core GPU) in single precision, using a 1D batched FFT test of all system sizes from 2 to 4096. Achieved bandwidth is calculated as 2 * system size divided by the time taken per FFT, where 2 * system size is the minimum amount of memory that has to be transferred between DRAM and the GPU.
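The bandwidth formula in the post can be sketched in a few lines of Python. This is only an illustration, not VkFFT's benchmark code: the function name, the reading of "system size" as FFT length times batch count times bytes per element, and the 8-byte element size (single-precision complex) are all assumptions, and the numbers in the usage line are made up rather than measured.

```python
def achieved_bandwidth_gb_s(fft_size, batch, seconds_per_run, bytes_per_element=8):
    """Achieved bandwidth in GB/s for one batched FFT run.

    Assumption: "system size" is the size of the data buffer in bytes,
    i.e. fft_size * batch * bytes_per_element, with 8 bytes per element
    for single-precision complex values (two 32-bit floats).
    """
    system_bytes = fft_size * batch * bytes_per_element
    # Every element is read once and written once, hence the factor of 2.
    return 2 * system_bytes / seconds_per_run / 1e9


# Hypothetical numbers for illustration only, not a measurement.
print(f"{achieved_bandwidth_gb_s(4096, 4096, 2e-3):.1f} GB/s")  # prints 134.2 GB/s
```

Because the numerator is the minimum traffic a transform needs, the result is a lower bound on the memory traffic actually generated, which makes it convenient to compare against the peak DRAM bandwidth of the device.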
- Any good compute shader tutorials?
Another possible project to look at is https://github.com/DTolm/VkFFT
- [R] Differentiable Conv Layer using FFT
Source: I have some of these things implemented in VkFFT that confirm the mentioned scaling of execution times.
- Resources for Vulkan GPGPU searched
- Where to Learn Vulkan for parallel computation (with references to porting from CUDA)
https://github.com/DTolm/VkFFT is a project to look at.
- The AMD “Aldebaran” GPU That Won Exascale
Incorrect. Vulkan has compute shaders[1], which are generally usable. Libraries like VkFFT[2] demonstrate basic signal processing in Vulkan. This can certainly be expanded upon, & there are numerous other non-graphical uses.
There is a Vulkan ML TSG (Technical Subgroup), which is supposed to be working on ML. Even Nvidia is participating, with extensions like VK_NV_cooperative_matrix, which specifically target the ML tensor cores.
There are people who could probably say this better or more specifically, but I'll give it a try: Vulkan is, above all, a general standard for dispatching & orchestrating work, usually on a GPU. Right now that work is most often graphics, but that is far from a limit.
SYCL is, imo, the opposite of where we need to go. It's the old historical legacy that CUDA has, of writing really dumb ignorant code & hoping the tools can make it run well on a GPU. Vulkan, on the other hand, asks us to consider deeply what near-to-the-metal resources we are going to need, and demands that we define, dispatch, & complete the actual processing engines on the GPU that will do the work. It's a much much much harder task, but it invites in fantastic levels of close optimization & tuning, allows for far more advanced pipelining & possibilities. If the future is good, it should abandon SYCL and CUDA, and bother to get good at Vulkan.
[1] https://vkguide.dev/docs/gpudriven/compute_shaders/
[2] https://github.com/DTolm/VkFFT
[3] https://www.khronos.org/assets/uploads/developers/presentati...
- VkFFT now supports Discrete Cosine Transforms
VkFFT supports convolution calculations - see samples 7, 8 and 9 in the VkFFT repository.
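For readers unfamiliar with why an FFT library can offer convolutions at all, the underlying idea is the convolution theorem: transform both inputs, multiply them element by element, and transform back. The following pure-Python sketch illustrates that idea only. It does not use VkFFT's API or reproduce its samples, and the helper names (`fft`, `circular_convolve`, `direct_circular_convolve`) are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.

    The inverse transform is left unnormalized (no division by n).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_convolve(a, b):
    """Circular convolution of two equal-length real sequences via the FFT."""
    n = len(a)
    product = [p * q for p, q in zip(fft(a), fft(b))]
    # Divide by n to normalize the unnormalized inverse transform.
    return [v.real / n for v in fft(product, inverse=True)]


def direct_circular_convolve(a, b):
    """Reference O(n^2) circular convolution, used to check the FFT version."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


a = [1, 2, 3, 4]
b = [0, 1, 0, 0]  # convolving with a shifted unit impulse rotates a by one
fast = circular_convolve(a, b)
slow = direct_circular_convolve(a, b)
print(all(abs(x - y) < 1e-9 for x, y in zip(fast, slow)))  # prints True
```

The FFT route costs O(n log n) instead of O(n^2), which is why convolution is a natural feature for a GPU FFT library to provide.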
clspv
- Vcc – The Vulkan Clang Compiler
See https://github.com/google/clspv for an OpenCL implementation on Vulkan Compute. There are plenty of quirks involved because the two standards use different varieties of SPIR-V ("kernels" vs. "shaders") and provide different guarantees (Vulkan Compute doesn't care much about numerical accuracy). The Mesa folks are also looking into this as part of their RustiCL (a modern OpenCL implementation) and Zink (implementing OpenGL and perhaps OpenCL itself on Vulkan) projects.
- AMD's CDNA 3 Compute Architecture
Vulkan Compute backends for numerical compute (as typified by both OpenCL and SYCL) are challenging; you can look at Google's clspv project (https://github.com/google/clspv) for the nitty-gritty details. The lowest-effort path is actually via some combination of ROCm (for hardware that AMD bothers to support themselves) and the Mesa project's Rusticl backend (for everything else).
- WSL with CUDA Support
D3D12 has more compute features than Vulkan has. It works out for DXVK because games often don’t use those, but it’ll cause many more issues with CLon12.
By the way, if you are ready to have a _limited_ implementation without a full feature set because of Vulkan API limitations, clvk is a thing. The list of limitations of that approach is at https://github.com/google/clspv/blob/master/docs/OpenCLCOnVu...
tldr: Vulkan and OpenCL SPIR-V dialects are different, and the former has significant limitations affecting this use case
- Resources for Vulkan GPGPU searched
- Low overhead C++ interface for Apple's Metal API
For OpenCL on DX12, the test suite doesn't pass yet. Every Khronos OpenCL 1.2 CTS test passes on at least one hardware driver, but no single driver passes them all. That is why CLon12 hasn't been submitted to Khronos's list of conformant products yet.
The pointer semantics that Vulkan has aren't very amenable to building a compliant OpenCL implementation on top of it. There are also some other limitations: https://github.com/google/clspv/blob/master/docs/OpenCLCOnVu....
- [Hardware Unboxed] - Apple M1 Pro Review - Is It Really Faster than Intel/AMD?
Vulkan is much more limited, notably because of Vulkan's SPIR-V dialect limitations. That makes a compliant OpenCL 1.2 impl on top of Vulkan impossible. (see: https://github.com/google/clspv/blob/master/docs/OpenCLCOnVulkan.md)
- Cross Platform GPU-Capable Framework?
OpenCL really is your best bet for a cross-platform GPU-capable framework. OpenCL 3.0 cleared out a lot of the cruft from OpenCL 2.x, so it's seeing a lot more adoption. The most cross-platform solution is still OpenCL 1.2, largely because of macOS, but OpenCL 3.0 is becoming more and more common on Windows and Linux and across multiple devices. Even on platforms without native OpenCL support, there are compatibility layers that implement OpenCL on top of DirectX (OpenCLOn12) or Vulkan (clvk and clspv).
What are some alternatives?
wgpu - Cross-platform, safe, pure-rust graphics api.
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
OpenCLOn12 - The OpenCL-on-D3D12 mapping layer
rust-gpu - 🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧
cuda-samples - Samples for CUDA Developers which demonstrates features in CUDA Toolkit
rocFFT - Next generation FFT implementation for ROCm
GLSL - GLSL Shading Language Issue Tracker
alpaka - Abstraction Library for Parallel Kernel Acceleration :llama:
xNVMe - Portable and high-performance libraries and tools for NVMe devices as well as support for traditional/legacy storage devices/interfaces.
MoltenVK - MoltenVK is a Vulkan Portability implementation. It layers a subset of the high-performance, industry-standard Vulkan graphics and compute API over Apple's Metal graphics framework, enabling Vulkan applications to run on macOS, iOS and tvOS.
ROCm - AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]