Mentions | Stars | Description |
---|---|---|
10 | 19,286 | LLM training in simple, raw C/CUDA |
147 | 15,468 | Instant neural graphics primitives: lightning fast NeRF and more |
112 | 9,954 | A massively parallel, optimal functional runtime in Rust |
2 | 6,083 | Code and data for paper "Deep Painterly Harmonization": https://arxiv.org/abs/1804.03189 |
4 | 4,190 | The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation" |
1 | 3,323 | Squeeze-and-Excitation Networks |
6 | 1,593 | cuGraph - RAPIDS Graph Analytics Library |
5 | 1,309 | CUDA Library Samples |
5 | 1,115 | Tile primitives for speedy kernels |
2 | 1,056 | FSA/FST algorithms, differentiable, with PyTorch compatibility. |
1 | 1,008 | Efficient GPU kernels for block-sparse matrix multiplication and convolution |
1 | 902 | CUDA accelerated rasterization of gaussian splatting |
1 | 754 | Automatically exported from code.google.com/p/cuda-convnet2 |
1 | 700 | NCCL Tests |
1 | 646 | Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch |
3 | 627 | RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications. |
1 | 557 | Fast, GPU-based CSV parser |
12 | 493 | Instant neural graphics primitives: lightning fast NeRF and more |
1 | 461 | Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs |
2 | 443 | Flash Attention in ~100 lines of CUDA (forward pass only) |
Latest Mentions
Latest mentioned CUDA repos
Stars | Project |
---|---|
3 | llm.c |
19,286 | llm.c |
9,954 | HVM |
1,115 | ThunderKittens |
41 | jaxsplat |
25 | simpleGEMM |
195 | CGBN |
902 | gsplat |
154 | cuda-checkpoint |
6 | cuda-1brc |
294 | dietgpu |
443 | flash-attention-minimal |
0 | blog-code |
627 | raft |
5 | tuna |
298 | NATTEN |
6 | build-nccl-tests-with-pytorch |
24 | GPUODEBenchmarks |
195 | RWKV-CUDA |
187 | causal-conv1d |
Latest Discoveries
Latest discovered CUDA repos
Stars | Project |
---|---|
3 | llm.c |
41 | jaxsplat |
1,115 | ThunderKittens |
25 | simpleGEMM |
195 | CGBN |
902 | gsplat |
154 | cuda-checkpoint |
6 | cuda-1brc |
19,286 | llm.c |
443 | flash-attention-minimal |
0 | blog-code |
5 | tuna |
298 | NATTEN |
6 | build-nccl-tests-with-pytorch |
24 | GPUODEBenchmarks |
187 | causal-conv1d |
67 | ABMGPU |
7 | gpu-desktop-calculator |
57 | gdlog |
22 | Harmonia_for_B_plus_trees |
Recently updated posts
- Reproducing GPT-2 (124M) in llm.c in 90 minutes for $20
- Jaxsplat: 3D Gaussian Splatting for Jax
- Welcome to the Parallel Future of Computation
- Bend a Parallel Language
- Bend: A higher order language for the GPU