pykan
kan-gpt
pykan | kan-gpt | |
---|---|---|
3 | 2 | |
13,143 | 625 | |
- | - | |
9.3 | 8.8 | |
7 days ago | 16 days ago | |
Jupyter Notebook | Python | |
MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pykan
-
Kolmogorov-Arnold Networks
Update2: got it to 100% training accuracy, 99 test accuracy with (2, 2, 2) shape.
Changes:
1. Increased the training set from 1000 to 100k samples. This solved overfitting.
2. In the dataset generation, slightly reduced noise (0.1 -> 0.07) so that classes don't overlap. With an overlap, naturally, it's impossible to hit 100%.
3. Most important & specific to KANs: train for 30 steps with grid=5 (5 segments for each activation function), then 30 steps with grid=10 (and initializing from the previous model), and then 30 steps with grid=20. This is idiomatic to KANs and covered in the Example_1_function_fitting.ipynb: https://github.com/KindXiaoming/pykan/blob/master/tutorials/...
Overall, my impressions are:
- it works!
- the reference implementation is very slow. A GPU implementation is dearly needed.
- it feels like it's a bit too non-linear and training is not as stable as it's with MLP + ReLU.
- Scaling is not guaranteed to work well. Really need to see if MNIST is possible to solve with this approach.
I will definitely keep an eye on this development.
kan-gpt
- FLaNK-AIM Weekly 13 May 2024
-
Kolmogorov-Arnold Networks
- Training script
I am currently working on training it on the WebText dataset to compare it to the original gpt2. Facing a few out-of-memory issues at the moment. Perhaps the vocab size (50257) is too large?
I'm open to contributions and would love to hear your thoughts!
https://github.com/AdityaNG/kan-gpt
What are some alternatives?
efficient-kan - An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
FourierKAN