VkFFT
shaders
VkFFT
-
[P] - VkFFT now supports quad precision (double-double) FFT computation on GPU
Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL/Level Zero and Metal. In the latest update, I have added support for quad-precision FFT calculation via double-double emulation on most modern GPUs. I understand that modern ML is going in the opposite, low-precision direction, but I still think this functionality may be useful, at least for prototyping and developing concepts.
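For readers unfamiliar with the term, double-double emulation stores each value as an unevaluated sum of two doubles (hi + lo), giving roughly 106 bits of significand rather than the 113 of true IEEE quad. The sketch below shows only the basic addition step, using the standard error-free two-sum transformation. It is an illustration of the general technique, not VkFFT's code: the function names (two_sum, quick_two_sum, dd_add) are made up for this example.

```python
# Minimal double-double addition sketch (illustrative, not VkFFT's implementation).
# A double-double number is a pair (hi, lo) whose exact value is hi + lo.

def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b == s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def quick_two_sum(a, b):
    """Same as two_sum, but assumes |a| >= |b| (or a == 0)."""
    s = a + b
    err = b - (s - a)
    return s, err

def dd_add(x, y):
    """Add two double-double numbers x = (hi, lo) and y = (hi, lo)."""
    s, e = two_sum(x[0], y[0])   # add the high parts, keep the rounding error
    e += x[1] + y[1]             # fold in the low parts
    return quick_two_sum(s, e)   # renormalise into (hi, lo)

if __name__ == "__main__":
    tiny = 1e-20
    # Plain double arithmetic loses the small term entirely.
    print((1.0 + tiny) - 1.0)
    # Double-double keeps it in the low word and recovers it after subtraction.
    total = dd_add((1.0, 0.0), (tiny, 0.0))
    print(dd_add(total, (-1.0, 0.0)))
```

A full FFT would also need double-double multiplication (typically built on fused multiply-add) and twiddle factors computed to the same precision, which this sketch leaves out.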
-
VkFFT: Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform Library
Not quite what I asked for, but close enough for now...
https://github.com/DTolm/VkFFT/discussions/126
-
Implementing complex numbers (and FFT) elegantly with just algebraic datatypes (no machine floats)
Source - I have made a somewhat functional programming-like FFT library (https://github.com/DTolm/VkFFT/tree/develop) which also operates on abstract data containers. Maybe it can be interesting to you from the algorithmic point of view.
-
how does Vulkan compare to CUDA?
VkFFT is a use case I've heard of where Vulkan compute is faster than its CUDA and OpenCL counterparts: https://github.com/DTolm/VkFFT
-
VkFFT now supports Apple Metal API - M1 Pro GPU FFT benchmarking
Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL and Level Zero. In the latest update, I have added support for the Apple Metal API, which allows VkFFT to run natively on modern Apple SoCs. I have tested it on a MacBook Pro with an M1 Pro SoC (8-core CPU, 14-core GPU) in single precision, on a 1D batched FFT test of all system sizes from 2 to 4096. Achieved bandwidth is calculated as 2*system size divided by the time taken per FFT, where 2*system size is the minimum amount of memory that has to be transferred between DRAM and GPU.
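The bandwidth metric described above can be written as a small helper. This is a sketch under assumptions: the function name and the figures in the usage example are hypothetical and are not measurements from the post. "System size" is taken here to mean the size in bytes of the buffer being transformed, counted once for the read and once for the write.

```python
# Sketch of the "achieved bandwidth" metric: 2 * system size / time per FFT.
# Names and example numbers are illustrative assumptions, not benchmark results.

def achieved_bandwidth_gb_s(system_bytes, seconds_per_fft):
    """Return achieved bandwidth in GB/s (1 GB = 1e9 bytes).

    system_bytes    -- size of the transformed buffer in bytes
    seconds_per_fft -- measured wall time of one FFT over that buffer
    """
    if seconds_per_fft <= 0:
        raise ValueError("seconds_per_fft must be positive")
    # The data is read from DRAM once and written back once, hence the factor 2.
    return 2 * system_bytes / seconds_per_fft / 1e9

if __name__ == "__main__":
    # Hypothetical case: a 1e9-byte buffer transformed in 20 ms.
    print(achieved_bandwidth_gb_s(1e9, 0.02))
```

Comparing this figure against the device's peak memory bandwidth shows how close a run is to being purely memory-bound.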
-
Any good compute shader tutorials?
Another possible project to look at is https://github.com/DTolm/VkFFT
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks: Part 2
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks
shaders
-
Adding HLSL and DirectX Support to Clang and LLVM
It may be close to a technical impossibility, but the Circle compiler by Sean Baxter is attempting it. That's based on an aggressive "de-pointerization" (see [1] in particular for details). There's also academic work[2] to compile C++ to shaders. I agree that it's an open question how well that will work out.
Also as pointed out elsethread, now that buffer device address is starting to land, the friction to compile pointer-intense C++ code should decrease even more. These are exciting times!
[1]: https://github.com/seanbaxter/shaders#approaching-circle-sha...
[2]: https://arxiv.org/abs/2109.14682
-
Writing Vulkan SPIR-V shaders in C++?
You can use Circle C++ shaders https://github.com/seanbaxter/shaders but it's limited to Linux afaik?
-
Where to Learn Vulkan for parallel computation (with references to porting from CUDA)
First we have Circle C++ shaders, which would tick pretty much all the boxes. The problem is that it's closed source and only compiles host code on Linux. Being closed source isn't the biggest issue in itself, but it prevents anyone else from fixing the developers' problem with interfacing with the Windows ABI and getting it working on Windows (which they can't fix themselves either, because Windows doesn't provide the documentation needed to work with its ABI). However, you could use it separately to compile your SPIR-V for Windows, since SPIR-V itself doesn't care about the platform.
-
Has anyone seriously considered C++AMP? Thoughts / Experiences?
Yes, Vulkan GPU source is split, though technically in a way that makes it more similar to CUDA. Vulkan uses an intermediate format instead of consuming text code directly, meaning new features are easier to add and frontend code doesn't need to be passed to the vendor's driver compiler. SPIR-V is like DXIL, or PTX for CUDA: basically LLVM IR for GPUs. The CUDA compiler compiles your device code into PTX, and that's what enables you to have "non-split" source code. There's even an option to have separate PTX code in CUDA. There are a few projects that aim to bring Vulkan SPIR-V into the source, including Rust GPU for Rust (though it will still have to be in a separate file) and Circle C++ shaders for C++.
-
Circle, the C++ Automation Language
My favorite use is putting user-defined attributes on data members, and using reflection to generate a UI to manipulate those values. I do it with these shadertoys:
https://github.com/seanbaxter/shaders#reflection-and-attribu...
Just mark your declarations up with custom attributes:
-
Unified Shader Programming in C++
I'm confused about what is novel in this paper. We already have unified shader programming with Circle C++, with way more features, and instead of writing a SPIR-V compiler, they made a source-to-source compiler... We have quite a few of those.
I think shader specialisation is handled pretty well in Circle. Since you can essentially run arbitrary C++ code at compile time, selection and specialisation of a shader can even depend on hardware-specific benchmarks. There is an extensive repo with examples here: https://github.com/seanbaxter/shaders. One example decodes a sprite sheet stored as a PNG at compile time and creates a specialised compute shader for it. You can also easily implement a control UI based on reflection of uniform shader parameters.
-
Embark Studios has rewritten all their renderer's shader code from GLSL to Rust
There's a project doing something similar for C++ called Circle, which is pretty incredible. At its core, Circle is an extension of standard C++ that adds a ton of metaprogramming facilities and other productivity-enhancing features, things the base language sorely lacks: full compile-time execution of regular C++ code, which lets you do at compile time anything you can normally do at runtime (including file I/O and networking), reflection, typed enums, pattern matching, hygienic macros, list comprehensions and language-native ranges, first-class parameter packs and much more.
-
Code generation using attributes
I use them to automatically generate an ImGui interface for controlling a shadertoy here: https://github.com/seanbaxter/shaders/blob/master/README.md#user-attributes-and-dear-imgui
What are some alternatives?
wgpu - Cross-platform, safe, pure-rust graphics api.
rust-gpu - 🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
meta
bgfx - Cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library.
cuda-samples - Samples for CUDA Developers which demonstrates features in CUDA Toolkit
circle - The compiler is available for download. Get it!
rocFFT - Next generation FFT implementation for ROCm
magnum - Lightweight and modular C++11 graphics middleware for games and data visualization
xNVMe - Portable and high-performance libraries and tools for NVMe devices as well as support for traditional/legacy storage devices/interfaces.
processing - Source code for the Processing Core and Development Environment (PDE)