BQNprop vs tinygrad
| | BQNprop | tinygrad |
|---|---|---|
| Mentions | 1 | 55 |
| Stars | 3 | 13,677 |
| Growth | - | - |
| Activity | 0.0 | 9.8 |
| Last commit | almost 2 years ago | about 5 hours ago |
| Language | - | Python |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
BQNprop
- Jd
Oh, thanks for clarifying, since it occurred to me that you might mean just the appeal to you, but not that you meant the field of programming! I'm no NN expert, but tinygrad looks very approachable in BQN. You might be interested in some other initial work along those lines: https://github.com/loovjo/BQN-autograd with automatic differentiation, and the smaller https://github.com/bddean/BQNprop using backprop.
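If the BQN is hard to follow, here is the same reverse-mode idea as a minimal Python sketch (illustrative code only, not taken from either repo):

```python
# Minimal reverse-mode autodiff on scalars: a Python sketch of the idea
# behind BQN-autograd / BQNprop / tinygrad. All names are illustrative.

class Value:
    def __init__(self, data, parents=()):
        self.data = data               # scalar payload
        self.grad = 0.0                # accumulated d(output)/d(self)
        self._parents = parents        # upstream nodes in the graph
        self._backward = lambda: None  # local chain-rule step

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad       # d(a+b)/da = 1
            other.grad += out.grad      # d(a+b)/db = 1
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad  # d(ab)/da = b
            other.grad += self.data * out.grad  # d(ab)/db = a
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then sweep gradients backward.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x          # z = x*y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0  (dz/dx = y + 1, dz/dy = x)
```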
tinygrad
- llama.cpp now officially supports GPU acceleration
There are currently at least three ways to run LLaMA on an M1 with GPU acceleration:
- mlc-llm (pre-built, only one model has been ported)
- tinygrad (very memory efficient, not that easy to integrate into other projects)
- llama-mps (original LLaMA codebase + LLaMA-Adapter support)
- The Coming of Local LLMs
tinygrad: https://github.com/geohot/tinygrad/tree/master/accel/ane
But I have not tested it on Linux since Asahi has not yet added support.
llama.cpp runs at 18ms per token (7B) and 200ms per token (65B) without quantization.
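As a sanity check, converting those quoted latencies into throughput (simple arithmetic, not a new benchmark):

```python
# Convert the quoted per-token latencies into rough throughput.
for model, ms_per_token in [("7B", 18), ("65B", 200)]:
    print(f"{model}: {1000 / ms_per_token:.1f} tokens/s")
# 7B: 55.6 tokens/s
# 65B: 5.0 tokens/s
```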
- Everything we know about the Apple Neural Engine (ANE)
- How 'Open' Is OpenAI, Really?
- LLaMA-7B in Pure C++ with full Apple Silicon support
Also in that realm is tinygrad by geohot, which has an open PR for integrating LLaMA support.
George Hotz already implemented LLaMA 7B and 13B on Twitch yesterday, running on GPU in the tinygrad llama branch:
https://github.com/geohot/tinygrad/tree/llama
The only problem is that it swaps on a 16GB MacBook, so in practice you need at least 24GB.
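A back-of-the-envelope estimate suggests why, assuming unquantized fp16 weights (2 bytes per parameter; the branch may use a different dtype):

```python
# Rough weight-memory estimate for LLaMA-7B, assuming fp16 (2 bytes/param).
# Activations, the KV cache, and the OS come on top, so a 16GB machine swaps.
params = 7e9
print(f"{params * 2 / 2**30:.1f} GiB of weights alone")  # ~13.0 GiB
```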
- Nvidia reported Q4 23 results today - Why is the stock up $18 (9%) AH?
- Why TensorFlow for Python is dying a slow death
While PyTorch is obviously the future in the short term, it will be interesting to see how this space evolves.
Before TensorFlow, people (myself included) were largely coding all of this stuff by hand, or with a zoo of incredibly clunky homemade libs.
Tensorflow and PyTorch made the whole situation far more accessible and sane. You can get a basic neural network working in a few lines of code. Magical.
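As a concrete illustration (a generic sketch, not code from the comment above), a small network and a single training step in PyTorch:

```python
import torch
from torch import nn

# A tiny two-layer network and one gradient step on random data.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(32, 4), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```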
But it's still early days. George Hotz, author of tinygrad[0], a PyTorch "competitor", made a really insightful comment -- we will look back on PyTorch & friends like we look back on FORTRAN and COBOL. Yes, they were far better than assembly. But they are really clunky compared to what we have today.
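tinygrad's interface is deliberately PyTorch-like; a gradient computation looks roughly like this (adapted from memory of the README example of the time, so treat the details as approximate):

```python
from tinygrad.tensor import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0, 0, -2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy
```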
What will we have in 20 years?
[0] https://github.com/geohot/tinygrad, https://tinygrad.org
- Ask HN: Strategies for working with engineers that are too smart?
- Ask HN: How to get back into AI?
Read all the leading papers, many times, to get a deep understanding. The writing quality is usually pretty low, but the information density can be very high; you'll probably miss the important details the first time.
Most medium and low-quality papers are full of errors and noise, but you can still learn from them.
Get your hands dirty with real code.
I would take a look at those:
What are some alternatives?
PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
llama.cpp - Port of Facebook's LLaMA model in C/C++
openpilot - openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for over 200 supported car makes and models.
tensorflow_macos - TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.
minikeyvalue - A distributed key value store in under 1000 lines. Used in production at comma.ai
text-generation-webui - A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
flameshot - Powerful yet simple to use screenshot software
llama - Inference code for LLaMA models
docs - Hardware and software docs / wiki
LevelDB - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
jsource - J engine source mirror
mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more