are-we-learning-yet
VkFFT
| | are-we-learning-yet | VkFFT |
|---|---|---|
| Mentions | 5 | 37 |
| Stars | 422 | 1,440 |
| Growth | - | - |
| Activity | 4.9 | 8.1 |
| Latest Commit | 3 days ago | 27 days ago |
| Language | Rust | C++ |
| License | Creative Commons Attribution 4.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
are-we-learning-yet
-
This year I tried solving AoC using Rust, here are my impressions coming from Python!
Also http://arewelearningyet.com
-
[D] Is Rust stable/mature enough to be used for production ML? Is making Rust-based python wrappers a good choice for performance heavy uses and internal ML dependencies in 2021?
Hey OP, you might want to check this site out: http://arewelearningyet.com
-
Is rust good for mathematical computing?
Note that you can update the page (adding packages or updating descriptions) via those github issues: https://github.com/anowell/are-we-learning-yet/issues
-
I wanted to share my experience of Rust as a deep learning researcher
Not sure if you’ve encountered it, but you should be aware of http://arewelearningyet.com
-
Announcing neuronika 0.1.0, a deep learning framework in Rust
Just make a PR: https://github.com/anowell/are-we-learning-yet
VkFFT
-
VkFFT: Vulkan/CUDA/Hip/OpenCL/Level Zero/Metal Fast Fourier Transform Library
Not quite what I asked for, but close enough for now...
-
VkFFT now supports Apple Metal API - M1 Pro GPU FFT benchmarking
Hello, I am the creator of VkFFT - a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL and Level Zero. In the latest update, I have added support for the Apple Metal API, which allows VkFFT to run natively on modern Apple SoCs. I have tested it on a MacBook Pro with an M1 Pro SoC (8-core CPU / 14-core GPU), running a single-precision 1D batched FFT test over all system sizes from 2 to 4096. Achieved bandwidth is calculated as 2 × system size divided by the time taken per FFT, i.e. the minimum amount of memory that has to be transferred between DRAM and the GPU per transform.
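The bandwidth metric described above can be sketched as follows; this is an illustrative calculation with hypothetical timing numbers, not part of the VkFFT benchmark code itself:

```python
# Sketch of the achieved-bandwidth metric described above (hypothetical numbers).
# A complex single-precision element is 8 bytes (2 x float32). The FFT must at
# minimum read the input buffer once and write the output buffer once, so the
# minimum DRAM traffic per transform is 2 * buffer size.

def achieved_bandwidth_gb_s(system_size, bytes_per_element, time_per_fft_s):
    """Achieved bandwidth = 2 * buffer size / time per FFT, in GB/s."""
    buffer_bytes = system_size * bytes_per_element
    return 2 * buffer_bytes / time_per_fft_s / 1e9

# Example: a 4096-point complex float32 FFT taking 1 microsecond
bw = achieved_bandwidth_gb_s(4096, 8, 1e-6)
print(f"{bw:.2f} GB/s")  # 65.54 GB/s
```

Plotting this number against system size shows where a transform is memory-bound: once achieved bandwidth saturates near the device's peak DRAM bandwidth, the FFT is limited by memory traffic rather than compute.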
-
Any good compute shader tutorials?
Another possible project to look at is https://github.com/DTolm/VkFFT
-
[R] Differentiable Conv Layer using FFT
Source: I have some of these things implemented in VkFFT that confirm the mentioned scaling of execution times.
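The scaling the comment refers to comes from the convolution theorem: convolving in the spatial domain equals pointwise multiplication in the frequency domain, replacing an O(n·k) direct convolution with an O(n log n) FFT round-trip. A minimal NumPy sketch of the equivalence (not VkFFT code):

```python
import numpy as np

# Convolution theorem: conv(signal, kernel) == IFFT(FFT(signal) * FFT(kernel)),
# provided both transforms are zero-padded to the full output length.
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
kernel = rng.standard_normal(16)

direct = np.convolve(signal, kernel, mode="full")  # O(n*k) direct form

n = len(signal) + len(kernel) - 1                  # full convolution length
fft_conv = np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(kernel, n), n)

assert np.allclose(direct, fft_conv)
```

For large kernels the FFT path wins decisively, which is why FFT-based conv layers become attractive as kernel sizes grow.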
-
Looking for Vulkan GPGPU resources
-
Where to Learn Vulkan for parallel computation (with references to porting from CUDA)
https://github.com/DTolm/VkFFT is a project to look at.
-
The AMD “Aldebaran” GPU That Won Exascale
Incorrect. Vulkan has compute shaders[1], which are generally usable. Libraries like VkFFT[2] demonstrate basic signal processing in Vulkan. This can certainly be expanded upon, & there are numerous other non-graphical uses.
There is a Vulkan ML TSG (Technical Subgroup), which is supposed to be working on ML. Even Nvidia is participating, with extensions like VK_NV_cooperative_matrix, which specifically target the ML tensor cores.
Other people could probably say this better or more specifically, but I'll give it a try: Vulkan is, above all, a general standard for dispatching & orchestrating work, usually on a GPU. Right now that work is most often graphics, but that is far from a limit.
SYCL is, imo, the opposite of where we need to go. It's the old historical legacy that CUDA has, of writing really dumb ignorant code & hoping the tools can make it run well on a GPU. Vulkan, on the other hand, asks us to consider deeply what near-to-the-metal resources we are going to need, and demands that we define, dispatch, & complete the actual processing engines on the GPU that will do the work. It's a much much much harder task, but it invites in fantastic levels of close optimization & tuning, allows for far more advanced pipelining & possibilities. If the future is good, it should abandon SYCL and CUDA, and bother to get good at Vulkan.
[1] https://vkguide.dev/docs/gpudriven/compute_shaders/
[2] https://github.com/DTolm/VkFFT
[3] https://www.khronos.org/assets/uploads/developers/presentati...
-
VkFFT now supports Discrete Cosine Transforms
VkFFT supports convolution calculations - see samples 7, 8 and 9 in the VkFFT repository.
What are some alternatives?
wgpu - Cross-platform, safe, pure-Rust graphics API.
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
rust-gpu - 🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧
cuda-samples - Samples for CUDA developers that demonstrate features in the CUDA Toolkit
neuronika - Tensors and dynamic neural networks in pure Rust.
rocFFT - Next generation FFT implementation for ROCm
book - The Rust Programming Language
rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
xNVMe - Portable and high-performance libraries and tools for NVMe devices as well as support for traditional/legacy storage devices/interfaces.
ROCm - AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]