GPU-Puzzles
SmallPebble
| | GPU-Puzzles | SmallPebble |
|---|---|---|
| Mentions | 12 | 6 |
| Stars | 5,022 | 112 |
| Growth | - | - |
| Activity | 3.4 | 0.0 |
| Latest commit | 4 months ago | over 1 year ago |
| Language | Jupyter Notebook | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
GPU-Puzzles
- Solve Puzzles. Learn CUDA
- GPU Puzzles
- Understanding Automatic Differentiation in 30 lines of Python
- FlashAttention-2, 2x faster than FlashAttention
I found it helpful to start with CUDA on Numba, since it lets you write GPU kernels in Python. Assuming you're like most ML engineers and more familiar with Python than C++, this lets you learn CUDA concepts without having to learn C++ at the same time. There's also a set of GPU puzzles for beginners [1] for getting started with Numba CUDA.
[1] https://github.com/srush/GPU-Puzzles
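To make the Numba route concrete, here is a minimal sketch of what a CUDA kernel written with Numba looks like (an illustrative example, not taken from the GPU-Puzzles notebook): element-wise vector addition, including the thread-index guard the puzzles drill.

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)      # absolute index of this thread in the 1-D launch grid
    if i < x.size:        # guard: the grid may be larger than the array
        out[i] = x[i] + y[i]

n = 1 << 20
x = np.arange(n, dtype=np.float32)
y = 2.0 * x
out = np.zeros_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)  # Numba copies arrays to/from the device
```

The kernel body is plain Python, which is why this route separates learning the CUDA execution model from learning C++.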
- [Computer Science] srush/GPU-Puzzles: Solve puzzles. Learn CUDA.
- Build on AWS Weekly - S1 E2 - Breaking Blocks with Terraform
Are you having fun with Machine Learning? Go and teach yourself beginner GPU programming with this wonderful notebook: GitHub repo
- GPU-Puzzles: Solve Puzzles. Learn CUDA
- [D] What are some good resources to learn CUDA programming?
Practice puzzles: https://github.com/srush/GPU-Puzzles
- Learn GPU programming in an interactive fashion
SmallPebble
- Fastest Autograd in the West
You can implement autograd as a library. Just take a look at this:
https://github.com/sradc/SmallPebble
The first line of the description is:
> SmallPebble is a minimal automatic differentiation and deep learning library written from scratch in Python, using NumPy/CuPy.
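As a rough illustration of "autograd as a library", here is a minimal reverse-mode sketch (the idea only; SmallPebble's actual implementation differs and operates on nd-arrays, as discussed in the next mention):

```python
class Var:
    """A scalar that records how it was computed, so gradients can flow back."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # (parent Var, local derivative) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value, ((self, other.value), (other, self.value)))

def backward(output):
    """Push d(output)/d(node) back through the recorded graph."""
    def _backprop(node, upstream):
        node.grad += upstream
        for parent, local in node.parents:
            _backprop(parent, upstream * local)
    _backprop(output, 1.0)

a, b = Var(2.0), Var(3.0)
y = a * b + a          # dy/da = b + 1 = 4, dy/db = a = 2
backward(y)
print(a.grad, b.grad)  # 4.0 2.0
```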
- Compiling ML models to C for fun
Thanks for this. My approach to speeding up an autodiff system like this was to write it in terms of nd-arrays rather than scalars, using NumPy/CuPy [1]. But it's still slower than deep learning frameworks that compile/fuse operations. I'm wondering how it compares to the approach in this post. (Might try to benchmark it at some point.)
[1] https://github.com/sradc/SmallPebble
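A sketch of the nd-array idea the comment above describes (illustrative only, not SmallPebble's code): each operation consumes and produces whole arrays, and its backward rule is a vectorized NumPy expression rather than a chain of scalar nodes.

```python
import numpy as np

class Tensor:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents            # (parent Tensor, backward fn) pairs
        self.grad = np.zeros_like(value)

def matmul(a, b):
    # One graph node per matrix multiply, not one per scalar multiply-add.
    return Tensor(a.value @ b.value, parents=(
        (a, lambda g: g @ b.value.T),     # dL/da = dL/dout @ b^T
        (b, lambda g: a.value.T @ g),     # dL/db = a^T @ dL/dout
    ))

def backward(output):
    def _backprop(node, upstream):
        node.grad = node.grad + upstream
        for parent, fn in node.parents:
            _backprop(parent, fn(upstream))
    _backprop(output, np.ones_like(output.value))  # seed with d(sum(output))/d(output) = 1

a = Tensor(np.random.randn(2, 3))
b = Tensor(np.random.randn(3, 4))
backward(matmul(a, b))                    # a.grad and b.grad now hold full-array gradients
```

Swapping NumPy for CuPy gives the same code a GPU path, which is roughly the NumPy/CuPy trick the comment refers to; as the comment also notes, each op still launches separately, so nothing is fused.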
- Understanding Automatic Differentiation in 30 lines of Python
- [P] SmallPebble - minimal(/toy) deep learning framework written from scratch in Python, using NumPy/CuPy. <700 loc.
Located here: https://github.com/sradc/SmallPebble
- Show HN: I wrote a minimal(/toy) deep learning library from scratch in Python
- SmallPebble – Minimal automatic differentiation implementation in Python, NumPy
What are some alternatives?
vscode-infracost - See cost estimates for Terraform right in your editor💰📉
MyGrad - Drop-in autodiff for NumPy.
triton - Development repository for the Triton language and compiler
chainer - A flexible framework of neural networks for deep learning
cutlass - CUDA Templates for Linear Algebra Subroutines
memoized_coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers
carbon-lang - Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
Tensor-Puzzles - Solve puzzles. Improve your pytorch.
terraform-minecraft - A Terraform Script that can deploy Minecraft Servers
owl - Owl - OCaml Scientific Computing @ https://ocaml.xyz
mercury-ad - Mercury library for automatic differentiation