tiny-cuda-nn vs RecursiveFactorization

| | tiny-cuda-nn | RecursiveFactorization |
|---|---|---|
| Mentions | 9 | 3 |
| Stars | 3,397 | - |
| Growth | 1.8% | - |
| Activity | 5.9 | - |
| Last commit | about 1 month ago | - |
| Language | C++ | - |
| License | GNU General Public License v3.0 or later | - |
Stars: the number of stars a project has on GitHub. Growth: month-over-month growth in stars. Activity: a relative number indicating how actively a project is being developed, where recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects being tracked.
tiny-cuda-nn
- [D] Have there been any attempts to create a programming language specifically for machine learning?
In the opposite direction from your question is a very interesting project, tiny-cuda-nn, implemented as close to the metal as possible and very fast: https://github.com/NVlabs/tiny-cuda-nn
- A CUDA-free Instant NGP renderer written entirely in Python: supports real-time rendering and camera interaction and consumes less than 1 GB of VRAM
This repo implements only the rendering part of NGP, but it is simpler and has much less code than the originals (Instant-NGP and tiny-cuda-nn).
- Tiny CUDA Neural Networks: fast C++/CUDA neural network framework
- Making 3D holograms this weekend with the very “Instant” Neural Graphics Primitives by nvidia — made this volume from 100 photos taken with an old iPhone 7 Plus
- NVlabs/tiny-cuda-nn: fast C++/CUDA neural network framework
- Small Neural networks in Julia 5x faster than PyTorch
...a C++ library with a CUDA backend. But these high-performance building blocks only saturate the GPU fully if the data is large enough.
I haven't looked at implementing these things, but I imagine that if you have smaller networks and thus less data, the large building blocks may not be optimal. You may, for example, want to fuse some operations to reduce the memory latency of repeated memory access.
In the PyTorch world there are approaches for small networks as well; there is https://github.com/NVlabs/tiny-cuda-nn, which, as far as I understand from the first link in its README, makes clever use of CUDA shared memory, which can hold all the weights of a tiny network (but not larger ones).
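To make the fusion point concrete, here is a toy CPU sketch in Julia (my own illustration, not tiny-cuda-nn's actual GPU kernels, which keep the weights resident in shared memory): evaluating a tiny two-layer MLP as separate library calls materializes the hidden activations in memory, while an in-place evaluation over preallocated buffers keeps them in cache.

```julia
# Toy illustration of fusing a tiny 2-layer MLP evaluation (CPU sketch;
# tiny-cuda-nn does the GPU analogue with weights in shared memory).
using LinearAlgebra

W1 = rand(Float32, 64, 64)
W2 = rand(Float32, 64, 64)
x  = rand(Float32, 64)

# Unfused: separate library calls; the hidden vector and the ReLU
# result are each allocated and written to memory, then re-read.
unfused(x) = W2 * max.(W1 * x, 0f0)

# "Fused" in spirit: one pass over preallocated buffers, so the hidden
# activations never leave cache and nothing is allocated per call.
function fused!(y, h, x)
    mul!(h, W1, x)        # h = W1 * x
    @. h = max(h, 0f0)    # ReLU in place
    mul!(y, W2, h)        # y = W2 * h
    return y
end

y, h = similar(x), similar(x)
fused!(y, h, x) ≈ unfused(x)  # true
```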
- [R] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding (Training a NeRF takes 5 seconds!)
- Tiny CUDA Neural Networks
- Real-Time Neural Radiance Caching for Path Tracing
RecursiveFactorization
- Can Fortran survive another 15 years?
What about the other benchmarks on the same site? https://docs.sciml.ai/SciMLBenchmarksOutput/stable/Bio/BCR/ BCR takes about a hundred seconds and is pretty indicative of systems-biology models: it comes from 1,122 ODEs with 24,388 terms that describe a stiff chemical reaction network modeling the BCR signaling network from Barua et al. Or the discrete diffusion models https://docs.sciml.ai/SciMLBenchmarksOutput/stable/Jumps/Dif..., which are the justification behind the claims in https://www.biorxiv.org/content/10.1101/2022.07.30.502135v1 that the O(1)-scaling methods scale better than O(log n)-scaling methods for large enough models?
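For readers unfamiliar with what these benchmarks exercise, here is a minimal sketch of a stiff system of the same flavor, using the classic Robertson chemical kinetics problem (3 equations, a toy stand-in for BCR's 1,122) and a stiff solver from OrdinaryDiffEq.jl; the problem and solver choice here are my illustration, not one of the benchmarked models.

```julia
# Minimal stiff-ODE sketch: the Robertson chemical kinetics problem,
# a tiny stand-in for stiff reaction networks like BCR, solved with
# a stiff Rosenbrock method.
using OrdinaryDiffEq

function rober!(du, u, p, t)
    y1, y2, y3 = u
    k1, k2, k3 = p
    du[1] = -k1 * y1 + k3 * y2 * y3
    du[2] =  k1 * y1 - k2 * y2^2 - k3 * y2 * y3
    du[3] =  k2 * y2^2
end

prob = ODEProblem(rober!, [1.0, 0.0, 0.0], (0.0, 1e5), (0.04, 3e7, 1e4))
sol  = solve(prob, Rodas5(); abstol = 1e-8, reltol = 1e-8)
```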
> If you use special routines (BLAS/LAPACK, ...), use them everywhere as the respective community does.
It tests with and without BLAS/LAPACK (which isn't always helpful, as of course you'd see from the benchmarks if you read them). One of the key differences, though, is that there are some pure-Julia tools like https://github.com/JuliaLinearAlgebra/RecursiveFactorization... which outperform the respective OpenBLAS/MKL equivalents in many scenarios, and that's one noted factor in the performance boost (and it is not trivial to wrap into the interface of the other solvers, so it's not done). There are other benchmarks showing that the comparison is not apples to apples and is instead conservative in many cases: for example, https://github.com/SciML/SciPyDiffEq.jl#measuring-overhead shows that calling SciPy through SciPyDiffEq with the Julia JIT optimizations gives lower overhead than direct SciPy+Numba, so we use the lower-overhead numbers in https://docs.sciml.ai/SciMLBenchmarksOutput/stable/MultiLang....
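As a concrete illustration of that claim, a minimal benchmark sketch, assuming RecursiveFactorization.jl's `lu!` mirrors `LinearAlgebra.lu!` as its README describes; moderate sizes like this are where the reported wins over OpenBLAS/MKL tend to show up.

```julia
# Minimal sketch: time the OpenBLAS-backed LU against the pure-Julia
# RecursiveFactorization.jl LU on a moderate-size matrix. Assumes the
# lu! API mirrors LinearAlgebra's, as the package README describes.
using LinearAlgebra, BenchmarkTools
import RecursiveFactorization

A = rand(200, 200)

@btime LinearAlgebra.lu!(B) setup=(B = copy($A)) evals=1          # OpenBLAS getrf!
@btime RecursiveFactorization.lu!(B) setup=(B = copy($A)) evals=1 # pure Julia
```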
> you must compile/write whole programs in each of the respective languages to enable full compiler/interpreter optimizations
You do realize that calling into a .so has lower overhead from a JIT-compiled language than from a statically compiled language like C, because some of the binding can be optimized away at runtime, right? https://github.com/dyu/ffi-overhead is a measurement of that, and you see LuaJIT and Julia come out faster than C and Fortran there. This shouldn't be surprising once it's clear how that works.
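For the flavor of what that measurement exercises, a minimal sketch of Julia's FFI: `ccall` is compiled down to an ordinary call instruction into the shared library, with no wrapper layer left to pay for at runtime.

```julia
# Sketch of Julia's FFI: `ccall` lowers to a direct call into the C
# runtime after JIT compilation; there is no per-call wrapper overhead.
# strlen is resolved from the already-loaded C library.
len = ccall(:strlen, Csize_t, (Cstring,), "hello FFI")  # Csize_t(9)
```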
I mean, yes, someone can always ask for more benchmarks, but now we have a site that auto-updates tons and tons of ODE benchmarks, with ODE systems ranging in size from 2 to the thousands, covering as many methods as we can wrap in as many scenarios as we can. And we don't even "win" all of our benchmarks, because unlike for you, these benchmarks aren't for winning but for tracking development (somehow Hacker News folks ignore the utility part and go straight to language wars...).
If you have a concrete change you think can improve the benchmarks, then please share it at https://github.com/SciML/SciMLBenchmarks.jl. We'll be happy to make and maintain another.
- Yann LeCun: ML would have advanced if other languages had been adopted versus Python
- Small Neural networks in Julia 5x faster than PyTorch
Ask them to download Julia and try it, and file an issue if it is not fast enough. We try to have the latest available.
See for example: https://github.com/JuliaLinearAlgebra/RecursiveFactorization...
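A minimal session of the kind being suggested, assuming `RecursiveFactorization.lu` returns a standard `LinearAlgebra.LU` factorization (as the package README indicates):

```julia
# Quick try-it check: the pure-Julia LU should reconstruct A up to
# row pivoting. Assumes lu returns a LinearAlgebra.LU object.
import Pkg; Pkg.add("RecursiveFactorization")
using LinearAlgebra
import RecursiveFactorization

A = rand(50, 50)
F = RecursiveFactorization.lu(A)
norm(F.L * F.U - A[F.p, :])  # ≈ 0
```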
What are some alternatives?
instant-ngp - Instant neural graphics primitives: lightning fast NeRF and more
diffrax - Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. https://docs.kidger.site/diffrax/
blis - BLAS-like Library Instantiation Software Framework
vectorflow
LeNetTorch - PyTorch implementation of LeNet for fitting MNIST for benchmarking.
juliaup - Julia installer and version multiplexer
KiteSimulators.jl - Simulators for kite power systems
RecursiveFactorization.jl
SciPyDiffEq.jl - Wrappers for the SciPy differential equation solvers for the SciML Scientific Machine Learning organization