Singeli
tinygrad
| | Singeli | tinygrad |
|---|---|---|
| Mentions | 7 | 17 |
| Stars | 92 | 23,864 |
| Growth | - | 5.8% |
| Activity | 9.1 | 10.0 |
| Latest commit | about 2 months ago | 2 days ago |
| Language | C | Python |
| License | ISC License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Singeli
- Singeli: High-level interface for low-level programming
-
YAML Parser for Dyalog APL
I don't put a lot of stock in the "write-only" accusation. I think it's mostly used by those who don't know APL because, first, it's clever, and second, they can't read the code. However, if I remember that I implemented something in J 10 years ago, I will definitely dig out the code, because that's by far the fastest way for me to remember how it works.
This project specifically looks to be done in a flat array style similar to Co-dfns[0]. It's not a very common way to use APL. However, I've maintained an array-based compiler [1] for several years, and don't find that reading is a particular difficulty. Debugging is significantly easier than with a scalar compiler, because the computation works on arrays drawn from the entire source code, and it's easy to inspect these and figure out what doesn't match expectations. I wrote most of [2] using a more traditional compiler architecture; it's easier to write and extend, but feels about the same for reading and small tweaks. See also my review [3] of the denser compiler and precursor Co-dfns.
As for being read by others, short snippets are definitely fine. Taking some from the last week or so in the APL Farm, {⍵÷⍨+/|-/¯9 ¯11+.○?2⍵2⍴0} and {(⍸⍣¯1+\⎕IO,⍺)⊂[⎕IO]⍵} seemed to be easily understood. Forum links at [4]; the APL Orchard is viewable without signup and tends to have a lot of code discussion. There are APL codebases with many programmers, but they tend to be very verbose with long names. Something like the YAML parser here with no comments and single-letter names would be hard to get into. I can recognize, say, that c⌿¨⍨←(∨⍀∧∨⍀U⊖)∘(~⊢∊LF⍪WS⍨)¨c trims leading and trailing whitespace from each string in a few seconds, but in other places there are a lot of magic numbers so I get the "what" but not the "why". Eh, as I look over it things are starting to make sense, could probably get through this in an hour or so. But a lot of APLers don't have experience with the patterns used here.
[0] https://github.com/Co-dfns/Co-dfns
[1] https://github.com/mlochbaum/BQN/blob/master/src/c.bqn
[2] https://github.com/mlochbaum/Singeli/blob/master/singeli.bqn
[3] https://mlochbaum.github.io/BQN/implementation/codfns.html
[4] https://aplwiki.com/wiki/Chat_rooms_and_forums
- Singeli: A DSL for building SIMD algorithms
-
Tolower() in Bulk at Speed
Here's an AVX-2 implementation that assumes it can read up to 31 bytes past the end of the input: https://godbolt.org/z/P7PP1MnK7
Requires -fno-unroll-loops, as otherwise clang gets overly unroll-y; the code is fast enough without it. The tail is dealt with by blending the originally read value with the new one.
(yes, that's autogenerated; from some https://github.com/mlochbaum/singeli code)
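To show the tail-handling mechanism without reading the intrinsics, here's a rough numpy emulation (an illustration of the idea only, not the generated AVX2 code): every 32-byte chunk is processed at full width, and in the final chunk the lanes past the end of the string are blended back to their original bytes before the store.

```python
import numpy as np

def tolower_bulk(buf: bytearray) -> None:
    """Emulate the 32-byte-chunk approach: lower-case full chunks, then
    blend lanes past the logical end back to their original bytes."""
    n = len(buf)
    pad = (-n) % 32  # the real SIMD code over-reads instead of padding
    a = np.frombuffer(bytes(buf) + b"\x00" * pad, dtype=np.uint8).copy()
    for i in range(0, len(a), 32):
        v = a[i:i + 32].copy()          # "load" one 32-byte vector
        upper = (v >= ord("A")) & (v <= ord("Z"))
        lowered = np.where(upper, v | 0x20, v)
        keep = np.arange(32) < (n - i)  # lanes still inside the string
        a[i:i + 32] = np.where(keep, lowered, v)  # blend tail, "store"
    buf[:] = a[:n].tobytes()

s = bytearray(b"Hello, WORLD! Mixed CASE input...")
tolower_bulk(s)
print(s.decode())  # hello, world! mixed case input...
```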
-
Jd
It's not ideal, but I've done this in BQN and it took about 15 lines. I didn't need to handle comments or escapes, which would add a little complexity. See functions ParseXml and ParseAttr here: https://github.com/mlochbaum/Singeli/blob/master/data/iintri...
XML is particularly simple, though; dealing with something like JPEG would be an entirely different experience.
tinygrad
-
AMD Unveils Ryzen 8000G Series Processors: Zen 4 APUs for Desktop with Ryzen AI
Not sure if I completely understand what "Ryzen AI" does, but Tinygrad, for example, has some limited support for RDNA3[0]. It isn't quite there yet in terms of performance, though, as you can read in the comments of that file.
There's also a small tutorial by AMD on how to use the WMMA intrinsic[1] with AMD's hipcc[2] compiler. Documentation is kinda sparse, but the instruction set is not huge. The RDNA3 ISA guide[3] might also be helpful (and only a fraction of its pages are relevant).
0. https://github.com/tinygrad/tinygrad/blob/master/extra/gemm/...
1. https://gpuopen.com/learn/wmma_on_rdna3/
2. https://github.com/ROCm/HIPCC
3. https://www.amd.com/content/dam/amd/en/documents/radeon-tech...
- Tinygrad 0.8.0 Release
-
Beyond Backpropagation - Higher Order, Forward and Reverse-mode Automatic Differentiation for Tensorken
This post describes how I added automatic differentiation to Tensorken. Tensorken is my attempt to build a fully featured yet easy-to-understand and hackable implementation of a deep learning library in Rust. It takes inspiration from the likes of PyTorch, Tinygrad, and JAX.
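The forward-mode half of that rests on dual numbers, which Tensorken (per the title) generalizes to higher-order derivatives. A minimal Python sketch of the dual-number idea, as an illustration only rather than Tensorken's actual Rust code:

```python
import math

# Illustration only: the dual-number trick behind forward-mode AD,
# not Tensorken's (Rust) implementation.
class Dual:
    """a + b*eps with eps**2 == 0; the eps coefficient carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (a + a'eps)(b + b'eps) = ab + (ab' + a'b)eps
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)
    __rmul__ = __mul__

    def sin(self):
        # chain rule: d(sin u) = cos(u) * du
        return Dual(math.sin(self.val), math.cos(self.val) * self.der)

def f(x):
    return x * x + 3.0 * x.sin()

x = Dual(2.0, 1.0)   # seed the derivative: dx/dx = 1
y = f(x)
print(y.val, y.der)  # f(2) and f'(2) = 2*2 + 3*cos(2)
```

Reverse mode instead runs the chain rule from outputs back to inputs, which is what makes training models with many parameters tractable; the post covers both directions.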
-
[D] What is a good way to maintain code readability and code quality while scaling up complexity in libraries like Hugging Face?
What do you think about tinygrad? I think it's a good example of a growing, well-written, (partially) well-documented library with many close-to-reference implementations.
-
AMD MI300 Performance – Faster Than H100, but How Much?
The idea of model architecture making fast hardware design easier is what makes https://github.com/tinygrad/tinygrad so interesting.
-
💻 7 Open-Source DevTools That Save Time You Didn't Know to Exist ⌛🚀
🌟 Support on GitHub. Website: https://tinygrad.org/
- Tinygrad
-
How to train an Iris dataset classifier with Tinygrad
Before we begin, make sure you have TinyGrad and the required dependencies installed. You can find the installation instructions here.
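For a sense of what such a classifier can look like, here is a minimal sketch, assuming the 0.8-era tinygrad API (Tensor, nn.Linear, nn.optim.SGD) and scikit-learn for the dataset; it's an illustration, not the tutorial's code:

```python
import numpy as np
from sklearn.datasets import load_iris
from tinygrad.tensor import Tensor
from tinygrad.nn import Linear
from tinygrad.nn.optim import SGD
from tinygrad.nn.state import get_parameters

# Hypothetical 4-16-3 MLP for Iris; not the tutorial's exact code.
class Net:
    def __init__(self):
        self.l1 = Linear(4, 16)
        self.l2 = Linear(16, 3)
    def __call__(self, x: Tensor) -> Tensor:
        return self.l2(self.l1(x).relu())

iris = load_iris()
X = Tensor(iris.data.astype(np.float32))
Y = Tensor(iris.target.astype(np.int32))

model = Net()
opt = SGD(get_parameters(model), lr=0.05)

Tensor.training = True  # tinygrad's optimizers expect this while training
for step in range(200):
    loss = model(X).sparse_categorical_crossentropy(Y)
    opt.zero_grad()
    loss.backward()
    opt.step()
Tensor.training = False

preds = model(X).numpy().argmax(axis=1)
print("train accuracy:", (preds == iris.target).mean())
```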
-
Decomposing Language Models into Understandable Components
Try to get something like tinygrad[1] running locally; that way you can tweak things a bit, run it again, and see how it performs. While doing this you'll pick up most of the concepts and get a feel for how things work. Also, take a look at projects like llama.cpp[2]; you don't have to fully understand what's going on there, though.
You may need some intermediate knowledge of linear algebra, plus this thing called "data science" nowadays, which is pretty much knowing how to wrangle data and visualize it.
Try creating a small model on your own; it doesn't have to be super fancy, just make sure it does something you want it to do. After that you can probably go on from there on your own (see the small example after the links below).
1: https://github.com/tinygrad/tinygrad
2: https://github.com/ggerganov/llama.cpp
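As a concrete version of "tweak things a bit and run it again", a tiny hypothetical snippet (assumes `pip install tinygrad`) for poking at tinygrad's autograd:

```python
from tinygrad.tensor import Tensor

# Toy forward + backward pass; change the ops below and rerun to see
# how the loss and gradients react.
x = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
w = Tensor([[0.5], [0.5]], requires_grad=True)
loss = x.matmul(w).relu().sum()
loss.backward()
print(loss.numpy())    # scalar loss
print(w.grad.numpy())  # gradient of the loss w.r.t. w
```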
- Tinygrad 0.7.0
What are some alternatives?
tinygrad - You like pytorch? You like micrograd? You love tinygrad! ❤️ [Moved to: https://github.com/tinygrad/tinygrad]
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
emojicode - 😀😜🔂 World’s only programming language that’s bursting with emojis
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
data_jd - Jd
llama.cpp - LLM inference in C/C++
rust - Empowering everyone to build reliable and efficient software.
llama - Inference code for Llama models
BQN-autograd - Autograd library in BQN using (generalized) dual numbers
openpilot - openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
CBQN - a BQN implementation in C
tensorflow_macos - TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.