loop_tool
tinygrad
Metric | loop_tool | tinygrad
---|---|---
Mentions | 4 | 58
Stars | 145 | 17,800
Growth | - | -
Activity | 0.0 | 9.7
Last commit | over 1 year ago | 10 months ago
Language | C++ | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
loop_tool
-
Tinygrad: A simple and powerful neural network framework
I've done some work in the past in representations and you actually can represent Conv and MatMul in more primitive ways. I ended up writing an IR called loop_tool that exposes this stuff pretty nicely:
https://github.com/facebookresearch/loop_tool/blob/main/pyth...
The idea is basically this: https://news.ycombinator.com/item?id=28883086
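To make the point about more primitive representations concrete, here is a minimal sketch in plain Python. It does not use loop_tool's API; the function names `matmul_loops` and `conv1d_loops` are hypothetical and chosen only for illustration. Both operations come out as the same pattern, a multiply inside a sum-reduction loop, and differ only in how the loop indices address the inputs.

```python
def matmul_loops(a, b):
    """Matrix multiply written as an explicit loop nest over (m, n, k)."""
    M, K, N = len(a), len(b), len(b[0])
    out = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            for k in range(K):  # reduction loop
                out[m][n] += a[m][k] * b[k][n]
    return out


def conv1d_loops(x, w):
    """1-D 'valid' convolution (cross-correlation, no kernel flip)."""
    out_len = len(x) - len(w) + 1
    out = [0.0] * out_len
    for i in range(out_len):
        for k in range(len(w)):  # reduction loop over the kernel window
            out[i] += x[i + k] * w[k]  # input index is a sum of two loop indices
    return out


if __name__ == "__main__":
    print(matmul_loops([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(conv1d_loops([1, 2, 3, 4], [1, 0, -1]))
```

Seen this way, the only structural difference is the indexing expression `x[i + k]` versus `a[m][k] * b[k][n]`, which is why a loop-level IR can treat both with the same set of transformations.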
-
Interactive Loop Optimization
I just finished adding a basic WASM[1] backend + basic JavaScript frontend[2]. I'm in the process of adding in-browser optimization[3] and hope to have a demo some time this week!
[1] https://github.com/facebookresearch/loop_tool/blob/main/src/...
[2] https://github.com/facebookresearch/loop_tool/blob/main/java...
[3] https://github.com/facebookresearch/loop_tool/blob/main/test...
- Loop_tool: A toolkit for loop-based computation
- Loop_tool tutorial – a lazy symbolic linear algebra toolkit
tinygrad
- tinygrad: extreme simplicity, easiest framework to add new accelerators to
-
GGML – AI at the Edge
Might be a silly question but is GGML a similar/competing library to George Hotz's tinygrad [0]?
[0] https://github.com/geohot/tinygrad
-
Render neural network into CUDA/HIP code
At first glance I thought maybe it's like tinygrad, but it looks like it has many more ops than tinygrad, though most of them map to ops provided by the underlying hardware?
I wonder how well tinygrad's approach will work out. Op fusion sounds easy: just walk the graph, pattern match it, and lower to hardware-provided ops?
Anyway, if anyone wants to understand the philosophy behind tinygrad, this file is a great start: https://github.com/geohot/tinygrad/blob/master/docs/abstract...
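The "walk the graph, pattern match, lower" idea from the comment above can be sketched as a toy rewrite pass. This is not tinygrad's implementation: the `Node` class, the op names, and the fused `fma` op are all hypothetical, and the pass only handles the single pattern `add(mul(a, b), c)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str           # "input", "mul", "add", or the fused "fma" (all hypothetical)
    args: tuple = ()  # child Nodes, or a single name string for inputs


def fuse(node):
    """Walk the graph bottom-up and rewrite add(mul(a, b), c) into fma(a, b, c)."""
    if node.op == "input":
        return node
    args = tuple(fuse(a) for a in node.args)
    # Pattern match: only the left operand is checked, to keep the sketch short.
    if node.op == "add" and args[0].op == "mul":
        a, b = args[0].args
        return Node("fma", (a, b, args[1]))
    return Node(node.op, args)


if __name__ == "__main__":
    x, y, z = (Node("input", (n,)) for n in "xyz")
    print(fuse(Node("add", (Node("mul", (x, y)), z))))
```

The simple case really is just a tree walk. What the sketch leaves out is where the difficulty tends to be: commuted operands, intermediate results consumed by more than one node, and deciding whether a given fusion is actually profitable on the target hardware.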
-
llama.cpp now officially supports GPU acceleration.
There are currently at least 3 ways to run llama on M1 with GPU acceleration:
  - mlc-llm (pre-built, only 1 model has been ported)
  - tinygrad (very memory efficient, not that easy to integrate into other projects)
  - llama-mps (original llama codebase + llama adapter support)
- George Hotz building an AMD competitor to Nvidia.
-
George Hotz ROCm adventures
Hopefully we will now see full support for AMD hardware in https://github.com/geohot/tinygrad. You can read more about it at https://tinygrad.org/
-
The Coming of Local LLMs
tinygrad
https://github.com/geohot/tinygrad/tree/master/accel/ane
But I have not tested it on Linux since Asahi has not yet added support.
llama.cpp runs at 18ms per token (7B) and 200ms per token (65B) without quantization.
- Everything we know about Apple's Neural Engine
- Everything we know about the Apple Neural Engine (ANE)
- How 'Open' Is OpenAI, Really?
What are some alternatives?
thinc - 🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
shumai - Fast Differentiable Tensor Library in JavaScript and TypeScript with Bun + Flashlight
llama.cpp - LLM inference in C/C++
black - The uncompromising Python code formatter
openpilot - openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
llama - Inference code for Llama models
jittor - Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
tensorflow_macos - TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.
nnabla - Neural Network Libraries
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ