shumai vs tinygrad

| | shumai | tinygrad |
|---|---|---|
| Mentions | 15 | 58 |
| Stars | 1,122 | 17,800 |
| Growth | 0.2% | - |
| Activity | 2.2 | 9.7 |
| Latest Commit | 9 months ago | 10 months ago |
| Language | TypeScript | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
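The exact weighting behind the activity score isn't published here. As a minimal sketch of one plausible scheme, assuming exponential decay of commit age (the half-life and the example numbers are made up for illustration):

```ts
// Hypothetical activity score: recent commits weigh more than old ones.
// The decay constant and normalization are illustrative assumptions,
// not the tracker's actual formula.
function activityScore(commitAgesInDays: number[], halfLifeDays = 30): number {
  const lambda = Math.log(2) / halfLifeDays
  // Each commit contributes exp(-lambda * age); a commit today counts as 1.
  return commitAgesInDays.reduce((sum, age) => sum + Math.exp(-lambda * age), 0)
}

// Five commits in the last week score far higher than
// five commits spread over the past year.
console.log(activityScore([1, 2, 3, 5, 7]))           // ~4.6
console.log(activityScore([30, 90, 180, 270, 360]))   // ~0.6
```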
shumai
- PyTorch Primitives in WebGPU for the Browser
https://github.com/tensorflow/tfjs/tree/master/tfjs-backend-...
([...], tflite-support, tflite-micro)
From facebookresearch/shumai (a JS tensor library) https://github.com/facebookresearch/shumai/issues/122 :
> It doesn't make sense to support anything besides WebGPU at this point. WASM + SIMD is around 15-20x slower on my machine[1]. Although WebGL is more widely supported today, it doesn't have the compute features needed for efficient modern ML (transformers etc) and will likely be a deprecated backend for other frameworks when WebGPU comes online.
TensorFlow Rust has a struct.Tensor:
- Why do people curse JS so much, but also say it's better than Python
JS for ML actually does exist: https://github.com/facebookresearch/shumai
- Breaking Up with Python
> It's really a shame that data science, ML, and notebooks are so wrapped up in it. Otherwise we could jettison the whole thing into space
Although I personally feel Python has its place, I contribute to a project that hopes to diversify the ML/scientific computing space with a TypeScript tensor lib called Shumai: https://github.com/facebookresearch/shumai
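For a flavor of what such a lib looks like, a sketch in the style of the project's README; every method name here (randn, requireGrad, matmul, sub, mul, mean, backward, grad) is written from memory and should be checked against the current docs:

```ts
// Sketch of a tiny autograd pass with shumai under the Bun runtime.
// API names are unverified assumptions based on the README.
import * as sm from '@shumai/shumai'

const X = sm.randn([32, 8])            // fake inputs
const Ytrue = sm.randn([32, 1])        // fake targets
const W = sm.randn([8, 1]).requireGrad()

const diff = X.matmul(W).sub(Ytrue)
const loss = diff.mul(diff).mean()     // mean squared error
loss.backward()

console.log(W.grad.shape)              // gradient of the loss w.r.t. W
```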
- Tinygrad: A simple and powerful neural network framework
Doesn’t really matter for large batch/large model training on GPUs that don’t need much coordination.
But Python speed is one of the main motivations for a JS/TS based ML lib I’m working on: https://github.com/facebookresearch/shumai
- [D] Using JavaScript for ML Training/Research (not in the browser)
As a hedge against CPython never becoming fast, we're creating a project called Shumai that attempts to deeply integrate with a new JavaScript runtime (Bun[3]).
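Bun's bun:ffi module is the integration point being referred to. A minimal sketch of the mechanism; the library name and symbol below are hypothetical stand-ins, not shumai's actual Flashlight bindings:

```ts
// Calling a native symbol from Bun via bun:ffi.
// "libdemo" and "tensor_add" are made-up placeholders.
import { dlopen, FFIType, suffix } from 'bun:ffi'

const lib = dlopen(`libdemo.${suffix}`, {
  tensor_add: {
    args: [FFIType.ptr, FFIType.ptr, FFIType.u64],
    returns: FFIType.ptr,
  },
})

// lib.symbols.tensor_add(aPtr, bPtr, numElements) now calls straight
// into native code with no serialization layer in between.
```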
- Python 3.11 is much faster than 3.8
You can expose objects. Here's how it is done in Bun: https://github.com/facebookresearch/shumai/blob/main/shumai/...
We've been using this feature heavily in Shumai.
I think you're vastly overestimating the complexity associated with this (user-exposed ref-counting/garbage collection) and may not be totally up to date on what's implemented.
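The truncated link above points at shumai's object-exposure code, which isn't reproduced here. The general JS idiom for tying native memory to garbage collection is FinalizationRegistry; a sketch with a hypothetical freeNative FFI call, not shumai's actual implementation:

```ts
// General pattern for freeing native memory when a JS wrapper is
// collected. freeNative() is a hypothetical FFI call; this shows the
// idiom, not shumai's code.
declare function freeNative(ptr: number): void

const registry = new FinalizationRegistry((ptr: number) => {
  freeNative(ptr) // runs after the wrapper becomes unreachable
})

class NativeTensor {
  constructor(readonly ptr: number) {
    registry.register(this, ptr)
  }
}
```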
- Shumai: Fast Differentiable Tensor Library in TypeScript with Bun and Flashlight
- Shumai: A fast differentiable tensor library for research in TypeScript and JavaScript
- 7% Speedup from Switch to and
This thought is pretty much the exact motivation behind a recent effort I'm helping out with: https://github.com/facebookresearch/shumai
tinygrad
- tinygrad: extreme simplicity, easiest framework to add new accelerators to
- GGML – AI at the Edge
Might be a silly question but is GGML a similar/competing library to George Hotz's tinygrad [0]?
[0] https://github.com/geohot/tinygrad
- Render neural network into CUDA/HIP code
At first glance I thought it might be like tinygrad, but it looks like it has many more ops than tinygrad, though most map to underlying hardware-provided ops?
I wonder how well tinygrad's approach will work out; op fusion sounds easy: just walk the graph, pattern-match it, and lower to hardware-provided ops?
Anyway, if anyone wants to understand the philosophy behind tinygrad, this file is a great start: https://github.com/geohot/tinygrad/blob/master/docs/abstract...
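As a toy rendering of the "walk a graph, pattern-match, lower" idea from the comment above (illustrative only, unrelated to tinygrad's real IR):

```ts
// Toy op fusion: walk a tiny expression graph and rewrite
// add(mul(a, b), c) into a single fused multiply-add "hardware op".
type Expr =
  | { op: 'input'; name: string }
  | { op: 'mul' | 'add' | 'fma'; inputs: Expr[] }

function fuse(e: Expr): Expr {
  if (e.op === 'input') return e
  const inputs = e.inputs.map(fuse) // rewrite children first
  const [lhs, rhs] = inputs
  if (e.op === 'add' && lhs.op === 'mul') {
    // add(mul(a, b), c) -> fma(a, b, c): one op for the hardware
    return { op: 'fma', inputs: [...lhs.inputs, rhs] }
  }
  return { ...e, inputs }
}

const a: Expr = { op: 'input', name: 'a' }
const b: Expr = { op: 'input', name: 'b' }
const c: Expr = { op: 'input', name: 'c' }
console.log(fuse({ op: 'add', inputs: [{ op: 'mul', inputs: [a, b] }, c] }))
// -> { op: 'fma', inputs: [a, b, c] }
```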
- llama.cpp now officially supports GPU acceleration.
There are currently at least 3 ways to run llama on M1 with GPU acceleration:
- mlc-llm (pre-built, only 1 model has been ported)
- tinygrad (very memory efficient, not that easy to integrate into other projects)
- llama-mps (original llama codebase + llama adapter support)
- George Hotz building an AMD competitor to Nvidia.
- George Hotz ROCm adventures
Hopefully we will now see full support for AMD hardware in https://github.com/geohot/tinygrad. You can read more about it at https://tinygrad.org/
- The Coming of Local LLMs
tinygrad: https://github.com/geohot/tinygrad/tree/master/accel/ane
But I have not tested it on Linux since Asahi has not yet added support.
llama.cpp runs at 18ms per token (7B) and 200ms per token (65B) without quantization.
- Everything we know about Apple's Neural Engine
- Everything we know about the Apple Neural Engine (ANE)
- How 'Open' Is OpenAI, Really?
What are some alternatives?
rosettaboy - A gameboy emulator in several different languages
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
jittor - Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
llama.cpp - LLM inference in C/C++
openpilot - openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
devdocs - API Documentation Browser
llama - Inference code for Llama models
FrameworkBenchmarks - Source for the TechEmpower Framework Benchmarks project
tensorflow_macos - TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ