tiny-cuda-nn
vectorflow
Metric | tiny-cuda-nn | vectorflow
---|---|---
Mentions | 9 | 12
Stars | 3,397 | 1,290
Growth | 1.8% | 0.2%
Activity | 5.9 | 0.0
Last commit | about 1 month ago | 10 months ago
Language | C++ | D
License | GNU General Public License v3.0 or later | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tiny-cuda-nn
- [D] Have their been any attempts to create a programming language specifically for machine learning?
  In the opposite direction from your question is a very interesting project, TinyNN, all implemented as close to the metal as possible and very fast: https://github.com/NVlabs/tiny-cuda-nn
- A CUDA-free instant NGP renderer written entirely in Python: Support real-time rendering and camera interaction and consume less than 1GB of VRAM
  This repo implements only the rendering part of NGP, but it is simpler and has less code than the original (Instant-NGP and tiny-cuda-nn).
- Tiny CUDA Neural Networks: fast C++/CUDA neural network framework
- Making 3D holograms this weekend with the very “Instant” Neural Graphics Primitives by nvidia — made this volume from 100 photos taken with an old iPhone 7 Plus
- NVlabs/tiny-CUDA-nn: fast C++/CUDA neural network framework
- Small Neural networks in Julia 5x faster than PyTorch
  ...a C++ library with a CUDA backend. But these high-performance building blocks might only saturate the GPU fully if the data is large enough.
  I haven't looked at implementing these things, but I imagine that if you have smaller networks and thus less data, the large building blocks may not be optimal. You may, for example, want to fuse some operations to reduce memory latency from repeated memory access.
  In the PyTorch world, there are approaches for small networks as well, such as https://github.com/NVlabs/tiny-cuda-nn. As far as I understand from the first link in the README, it makes clever use of CUDA shared memory, which can hold all the weights of a tiny network (but not larger ones).
- [R] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding (Training a NeRF takes 5 seconds!)
- Tiny CUDA Neural Networks
- Real-Time Neural Radiance Caching for Path Tracing
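The comment above about fusing operations for small networks can be illustrated with a minimal sketch. This is plain Python, not tiny-cuda-nn code, and it cannot show GPU memory behaviour; it only contrasts a layer that writes every intermediate result to its own buffer with one that accumulates, adds the bias, and applies the activation in a single pass. The function names and the example values are hypothetical.

```python
def layer_unfused(x, weights, biases):
    # Step 1: matrix-vector product, written to a temporary list.
    products = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    # Step 2: bias add, written to a second temporary list.
    shifted = [p + b for p, b in zip(products, biases)]
    # Step 3: ReLU, written to a third list.
    return [max(0.0, s) for s in shifted]


def layer_fused(x, weights, biases):
    # One pass per output neuron: accumulate, add the bias, and apply ReLU
    # before anything is stored, so no intermediate lists are created.
    out = []
    for row, b in zip(weights, biases):
        acc = b
        for w, xi in zip(row, x):
            acc += w * xi
        out.append(max(0.0, acc))
    return out


# Hypothetical 3-input, 2-output layer with made-up values.
x = [1.0, 2.0, 3.0]
weights = [[0.5, 1.0, 0.25], [1.0, 1.0, -1.0]]
biases = [0.5, -0.5]

print(layer_unfused(x, weights, biases))
print(layer_fused(x, weights, biases))
```

Both functions return the same values; the fused version simply never materialises the intermediate results. On a GPU, the analogous saving is fewer round trips to global memory between steps.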
vectorflow
- Programming languages endorsed for server-side use at Meta
  >> Mozilla (of course)
  Mozilla is a C++ and JavaScript shop. What do they ship in Rust? How much of Firefox is written in Rust, for example?
  >> Microsoft, Meta, Google/Acrobat, Amazon
  Large firms have lots of devs and consequently lots of toy projects. Is their usage of Rust more significant than their use of D? I mean, Meta was churning out projects in D a while back (warp, flint, etc.) and looked like it might be going all in at one point (they even hired one of the leads on D lang).
  >> That's practically all of FAANG
  Who were we missing? Netflix; they’ve dabbled with D too: https://github.com/Netflix/vectorflow
  Don’t misunderstand my point: it’s not that D is more popular than Rust, it’s that Rust is not used for real work in any significant capacity yet.
  Where’s the big project written in Rust? Servo and the Rust compiler are the only two large Rust projects on GitHub.
- Cloud TPU VMs are generally available
  Thanks Zak, already applied.
  Just wondering, does TPU VM support Vectorflow?
  https://github.com/Netflix/vectorflow
- Vectorflow is a minimalist neural network library optimized for sparse data and single machine environments open sourced by Netflix (r/MachineLearning)
- [P] Vectorflow is a minimalist neural network library optimized for sparse data and single machine environments open sourced by Netflix
- Vectorflow is a minimalist neural network library optimized for sparse data and single machine environments open sourced by Netflix
- Vectorflow: Minimalist neural network library faster than TensorFlow in D
- Small Neural networks in Julia 5x faster than PyTorch
  A library I designed a few years ago (https://github.com/Netflix/vectorflow) is also much faster than PyTorch/TensorFlow in these cases.
  In "small" or "very sparse" setups, you're memory bound, not compute bound. TF and PyTorch are bad at that because they assume memory movements are worth it and do very few in-place operations.
  Different tools for different jobs.
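The point in the comment above about memory-bound sparse workloads and in-place operations can be sketched as follows. This is not vectorflow's API; it is a hypothetical single gradient step for a linear model, written in plain Python, that compares a version allocating dense buffers on every step with one that updates only the weights of the non-zero features in place. All names and values are assumptions made for illustration.

```python
def sgd_step_dense(weights, features, error, lr):
    # Allocating version: expands the sparse example into a dense gradient
    # and builds a brand-new weight list on every step.
    grad = [0.0] * len(weights)
    for index, value in features:
        grad[index] = error * value
    return [w - lr * g for w, g in zip(weights, grad)]


def sgd_step_sparse_inplace(weights, features, error, lr):
    # In-place version: touches only the weights whose features are
    # non-zero and allocates no new buffers.
    for index, value in features:
        weights[index] -= lr * error * value


# Hypothetical example: 8 weights, only 2 active features.
features = [(1, 2.0), (6, 4.0)]  # (index, value) pairs
error, lr = 0.5, 0.25

dense_result = sgd_step_dense([1.0] * 8, features, error, lr)

weights = [1.0] * 8
sgd_step_sparse_inplace(weights, features, error, lr)

print(dense_result)
print(weights)
```

The two versions produce the same weights, but the dense one reads and writes all 8 entries (plus a temporary gradient), while the in-place one touches 2. With very sparse inputs and many more weights, that difference in memory traffic is what dominates.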
What are some alternatives?
instant-ngp - Instant neural graphics primitives: lightning fast NeRF and more
dcompute - DCompute: Native execution of D on GPUs and other Accelerators
blis - BLAS-like Library Instantiation Software Framework
diffrax - Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. https://docs.kidger.site/diffrax/
LeNetTorch - PyTorch implementation of LeNet for fitting MNIST for benchmarking.
juliaup - Julia installer and version multiplexer
RecursiveFactorization - RecursiveFactorization.jl
ugrep - ugrep 5.1: A more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more