| | cuda-samples | VkFFT |
|---|---|---|
| Mentions | 15 | 37 |
| Stars | 5,348 | 1,443 |
| Growth | 3.7% | - |
| Activity | 5.0 | 8.1 |
| Last commit | 22 days ago | about 1 month ago |
| Language | C | C++ |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
cuda-samples
- Is anyone successfully using an RTX 3000-series under WSL2?
  Installing, building, and running WSL CUDA examples from https://github.com/nvidia/cuda-samples
- Updated Install Instructions Dec 2022
  After which nvcc should be accessible to new sessions, and you can build C++ CUDA stuff like cuda-samples. Python packages like PyTorch should also see CUDA and be able to use it.
- Virtual Memory Management APIs for NVIDIA GPUs on Windows
  I haven't found any note that these APIs do not support Windows, and it also seems that the memMapIPCDrv CUDA sample supports Windows.
- ROS with CUDA on windows
  I installed nvidia-cuda-toolkit and tried to build https://github.com/NVIDIA/cuda-samples but I'm getting stupid errors... it installed nvcc at /usr/bin/nvcc and the samples expect /usr/local/cuda/bin/nvcc... symlinking it to that location and it dies with
- Script to install nvidia drivers, cuda/nvcc, gcc11 and setup on Fedora 36
  Can build the cuda-samples, then you have a working nvcc.
- Can't get some CUDA Samples to work
  I have installed CUDA and cuDNN, and was testing the installation with the cuda-samples, as the Arch Wiki suggested. But I am not able to get samples like nbody, smokeparticles, Mandelbrot, etc. to run. deviceQuery works fine, though, and I get the expected output, so I don't think there is a problem with my CUDA installation.
- Cuda application question
  Hi, I don't have much experience with Nvidia Jetsons. You can find some examples on GitHub (here: https://github.com/NVIDIA/cuda-samples). You can find CUDA implementations of most functions on the internet, though; you just have to search for the specific thing you need. CUDA kernels are not platform specific. They should work on GPUs and embedded developer boards without problems as long as you respect the limits imposed by the "compute capability" of your device; you just have to compile your code using the right architecture flag. The biggest limit you have to deal with when developing for the Jetson Nano is the low amount of memory.
- My GPU-accelerated raytracing renderer
- Tutorial for ubuntu 20.04
  -> git clone https://github.com/NVIDIA/cuda-samples.git -> cd cuda-samples/Samples/1_Utilities/deviceQuery/ -> make -> ./deviceQuery (Result=pass?good) -> cd ~/ -> wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh
- cuda_kde_depth_packet_processor.cu:39:10: fatal error: helper_math.h: File or directory not found
  Is this the source code that you are talking about: https://github.com/NVIDIA/cuda-samples? I don't see any CMakeLists.txt inside...
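
One of the replies above says that CUDA code has to be compiled with the right architecture flag for the device's compute capability. A minimal sketch of that mapping follows, assuming the compute capability is already known (deviceQuery reports it). The helper names `nvcc_arch_flag` and `nvcc_command` and the file names `app.cu` and `app` are hypothetical and not part of cuda-samples; the script only assembles a command line and does not run nvcc.

```python
def nvcc_arch_flag(major: int, minor: int) -> str:
    """Build an nvcc -arch flag from a compute capability such as 5.3."""
    if major < 1 or not 0 <= minor <= 9:
        raise ValueError(f"unexpected compute capability {major}.{minor}")
    return f"-arch=sm_{major}{minor}"


def nvcc_command(source: str, output: str, major: int, minor: int) -> list[str]:
    """Assemble (but do not run) an nvcc command line for one source file."""
    return ["nvcc", nvcc_arch_flag(major, minor), "-o", output, source]


# The Jetson Nano has compute capability 5.3.
# "app.cu" and "app" are placeholder names for illustration only.
print(" ".join(nvcc_command("app.cu", "app", 5, 3)))
```

For an RTX 3000-series card (compute capability 8.6), the same helper would be called with `8, 6`.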
VkFFT
- [P] - VkFFT now supports quad precision (double-double) FFT computation on GPU
  Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL/Level Zero and Metal. In the latest update, I have added support for quad-precision double-double emulation for FFT calculation on most modern GPUs. I understand that modern ML is going in the opposite, low-precision direction, but I still think that it may be useful to have this functionality, at least for some prototyping and development of concepts.
- VkFFT now supports quad precision (double-double) FFT computation on GPU
- VkFFT: Vulkan/CUDA/Hip/OpenCL/Level Zero/Metal Fast Fourier Transform Library
  Not quite what I asked for, but close enough for now... https://github.com/DTolm/VkFFT/discussions/126
- Implementing complex numbers (and FFT) elegantly with just algebraic datatypes (no machine floats)
  Source - I have made a somewhat functional programming-like FFT library (https://github.com/DTolm/VkFFT/tree/develop) which also operates on abstract data containers. Maybe it can be interesting to you from the algorithmic point of view.
- how does Vulkan compare to CUDA?
  VkFFT is a use case I've heard of where Vulkan compute is faster than its CUDA and OpenCL counterparts: https://github.com/DTolm/VkFFT
- VkFFT now supports Apple Metal API - M1 Pro GPU FFT benchmarking
  Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL and Level Zero. In the latest update, I have added support for the Apple Metal API, which will allow VkFFT to run natively on modern Apple SoCs. I have tested it on a MacBook Pro with an M1 Pro 8c CPU/14c GPU SoC in single precision on a 1D batched FFT test of all system sizes from 2 to 4096. Achieved bandwidth is calculated as 2*system size divided by the time taken per FFT, which is the minimum memory that has to be transferred between DRAM and GPU:
- Any good compute shader tutorials?
  Another possible project to look at is https://github.com/DTolm/VkFFT
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks: Part 2
- VkFFT now supports Rader's algorithm - A100 and MI250 benchmarks
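
The Metal benchmarking post above defines achieved bandwidth as 2*system size divided by the time taken per FFT. A small worked sketch of that arithmetic follows. It assumes that "system size" means the total bytes of the batched input (8 bytes per single-precision complex element), and the batch count and timing are made-up illustrative values, not measurements from the post. The function names are hypothetical and are not VkFFT API.

```python
def system_size_bytes(fft_length: int, batch: int, bytes_per_element: int = 8) -> int:
    """Total bytes in a batched 1D system (assumption: complex64, 8 bytes each)."""
    return fft_length * batch * bytes_per_element


def achieved_bandwidth(size_bytes: float, seconds_per_fft: float) -> float:
    """Bytes per second, counting one read and one write of the whole system."""
    if seconds_per_fft <= 0:
        raise ValueError("seconds_per_fft must be positive")
    return 2 * size_bytes / seconds_per_fft


# Illustrative numbers only: 1024 batched FFTs of length 4096, 1 ms per batched FFT.
size = system_size_bytes(fft_length=4096, batch=1024)
bandwidth = achieved_bandwidth(size, seconds_per_fft=1e-3)
print(f"{size} bytes moved twice in 1 ms: {bandwidth / 1e9:.1f} GB/s")
```

The factor of 2 reflects the post's "minimum memory" reasoning: each element has to be read from DRAM once and written back once.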
What are some alternatives?
catboost - A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
wgpu - Cross-platform, safe, pure-rust graphics api.
geodesic_raytracing
kompute - General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases. Backed by the Linux Foundation.
hashcat - World's fastest and most advanced password recovery utility
rust-gpu - 🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧
nvidia-auto-installer-for-fedora-linux - A CLI tool which lets you install proprietary NVIDIA drivers and much more easily on Fedora Linux (32 or above and Rawhide)
rocFFT - Next generation FFT implementation for ROCm
RAJA - RAJA Performance Portability Layer (C++)
xNVMe - Portable and high-performance libraries and tools for NVMe devices as well as support for traditional/legacy storage devices/interfaces.
blender-cuda-subdivision-surface-gpu - A Blender 3.0.0 fork that will allow you to subdivide complex meshes using CUDA compatible GPUs. (WIP)
ROCm - AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]