Kolmogorov-Arnold Networks

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • pykan

    Kolmogorov Arnold Networks

  • Update 2: got it to 100% training accuracy and 99% test accuracy with a (2, 2, 2) shape.

    Changes:

    1. Increased the training set from 1000 to 100k samples. This solved overfitting.

    2. In the dataset generation, slightly reduced noise (0.1 -> 0.07) so that classes don't overlap. With an overlap, naturally, it's impossible to hit 100%.

    3. Most important and specific to KANs: train for 30 steps with grid=5 (5 segments for each activation function), then 30 steps with grid=10 (initialized from the previous model), and then 30 steps with grid=20. This is idiomatic to KANs and is covered in Example_1_function_fitting.ipynb: https://github.com/KindXiaoming/pykan/blob/master/tutorials/...
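The grid schedule in step 3 can be illustrated without pykan. The sketch below is a simplified stand-in, not the reference implementation: it swaps the B-splines for piecewise-linear functions, and the names `make_grid`, `evaluate`, and `refine` are hypothetical. It shows only the key idea, that a finer grid can be initialized from a coarser one so that training resumes from the same function instead of starting over.

```python
from bisect import bisect_right


def make_grid(n_segments, lo=-1.0, hi=1.0):
    """Return n_segments + 1 evenly spaced knots on [lo, hi]."""
    step = (hi - lo) / n_segments
    return [lo + i * step for i in range(n_segments + 1)]


def evaluate(knots, values, x):
    """Piecewise-linear interpolation of (knots, values) at x."""
    # Clamp the segment index so that x == knots[-1] uses the last segment.
    i = min(max(bisect_right(knots, x) - 1, 0), len(knots) - 2)
    t = (x - knots[i]) / (knots[i + 1] - knots[i])
    return (1 - t) * values[i] + t * values[i + 1]


def refine(knots, values, n_segments):
    """Initialize a finer grid by sampling the coarse function at the new knots."""
    new_knots = make_grid(n_segments, knots[0], knots[-1])
    new_values = [evaluate(knots, values, x) for x in new_knots]
    return new_knots, new_values


# Stand-in for an activation function learned at grid=5 (arbitrary values).
knots = make_grid(5)
values = [0.3, -0.1, 0.8, 0.2, -0.5, 0.4]

history = [(knots, values)]
for n_segments in (10, 20):
    # A real schedule would train for about 30 steps before each refinement.
    knots, values = refine(knots, values, n_segments)
    history.append((knots, values))

# The refined function should match the coarse one before further training.
probe = [-1.0, -0.73, -0.2, 0.05, 0.5, 0.99, 1.0]
coarse = [evaluate(history[0][0], history[0][1], x) for x in probe]
fine = [evaluate(history[-1][0], history[-1][1], x) for x in probe]
max_gap = max(abs(a - b) for a, b in zip(coarse, fine))
print(len(history[-1][0]), max_gap)
```

Because each refinement here doubles the number of segments, the new knots contain the old ones and the initialization is exact up to rounding. With B-splines, or with grid sizes that do not nest, the coarse function would presumably have to be fitted approximately instead.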

    Overall, my impressions are:

    - it works!

    - the reference implementation is very slow. A GPU implementation is dearly needed.

    - it feels like it's a bit too non-linear, and training is not as stable as it is with MLP + ReLU.

    - Scaling is not guaranteed to work well. It remains to be seen whether MNIST can be solved with this approach.

    I will definitely keep an eye on this development.

  • efficient-kan

    An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

  • Hi Noesis, I just noticed that your implementation, combined with the efficientKAN by Blealtan (https://github.com/Blealtan/efficient-kan), results in a structure very similar to Siren (an MLP with sine activations). efficientKAN first computes the common basis functions for all the edge activations, and the output can then be calculated as a linear combination of the basis. If the basis functions are Fourier, then a KAN layer can be viewed as a linear layer with fixed weights + a sine activation + a linear layer with learnable weights, which is a special form of Siren.
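The equivalence described in this comment can be checked numerically. The sketch below is an illustration under stated assumptions, not code from either repository: the function names and the coefficient layout `a[j][i][k]`, `b[j][i][k]` are hypothetical. It relies on the identity cos(t) = sin(t + pi/2), so every Fourier basis function becomes a sine applied to a fixed affine function of one input.

```python
import math
import random


def direct_forward(x, a, b):
    """Sum over edges of a Fourier series: a*cos(f*x_i) + b*sin(f*x_i)."""
    return [
        sum(
            a[j][i][k] * math.cos((k + 1) * x[i])
            + b[j][i][k] * math.sin((k + 1) * x[i])
            for i in range(len(x))
            for k in range(len(a[j][i]))
        )
        for j in range(len(a))
    ]


def factored_forward(x, a, b):
    """Same computation as fixed linear layer -> sin -> learnable linear layer."""
    n_in, n_freq = len(x), len(a[0][0])
    # Stages 1 and 2: fixed weights (the frequencies) and fixed biases
    # (phase pi/2 for the cosine terms, 0 for the sine terms), then sin.
    hidden = []
    for i in range(n_in):
        for k in range(n_freq):
            for phase in (math.pi / 2, 0.0):
                hidden.append(math.sin((k + 1) * x[i] + phase))
    # Stage 3: the learnable coefficients act as an ordinary linear layer.
    out = []
    for j in range(len(a)):
        weights = []
        for i in range(n_in):
            for k in range(n_freq):
                weights.extend((a[j][i][k], b[j][i][k]))
        out.append(sum(w * h for w, h in zip(weights, hidden)))
    return out


rng = random.Random(0)
n_in, n_out, n_freq = 3, 2, 4
a = [[[rng.gauss(0, 1) for _ in range(n_freq)] for _ in range(n_in)] for _ in range(n_out)]
b = [[[rng.gauss(0, 1) for _ in range(n_freq)] for _ in range(n_in)] for _ in range(n_out)]
x = [0.3, -1.2, 2.5]

y_direct = direct_forward(x, a, b)
y_factored = factored_forward(x, a, b)
max_diff = max(abs(p - q) for p, q in zip(y_direct, y_factored))
print(max_diff)
```

Unlike a general Siren layer, the first linear layer here is fixed and sparse: each hidden unit reads exactly one input coordinate.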

  • FourierKAN

  • I quickly skimmed the paper, got inspired to simplify it, and created a PyTorch layer:

    https://github.com/GistNoesis/FourierKAN/

    The core is really just a few lines.

    In the paper they use spline interpolation to represent the 1D functions that they sum. Their code seemed aimed at smaller sizes. Instead I chose a different representation: Fourier coefficients, which are used to interpolate the functions of individual coordinates.

    It should give an idea of the representational power of Kolmogorov-Arnold networks. It should probably converge more easily than their spline version, but the spline version uses fewer operations.

    Of course, if my code doesn't work, it doesn't mean theirs doesn't.

    Feel free to experiment and publish a paper if you want.
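For readers who want to see the Fourier representation concretely, here is a minimal, dependency-free sketch of such a layer. It is not the code from the FourierKAN repository: the class name, argument names, and initialization scale are assumptions made for illustration, and it uses plain Python lists rather than PyTorch tensors. Each edge from input i to output j carries its own truncated Fourier series, and each output sums its edge functions.

```python
import math
import random


class FourierKANLayer:
    """Each edge (input i -> output j) is a truncated 1D Fourier series."""

    def __init__(self, n_in, n_out, n_freq, seed=0):
        rng = random.Random(seed)
        # Assumed scale, chosen only to keep outputs at a moderate magnitude.
        scale = 1.0 / math.sqrt(n_in * n_freq)

        def coefficients():
            return [
                [[rng.gauss(0, scale) for _ in range(n_freq)] for _ in range(n_in)]
                for _ in range(n_out)
            ]

        self.cos_coef = coefficients()
        self.sin_coef = coefficients()
        self.bias = [0.0] * n_out

    def edge(self, j, i, x):
        """Evaluate the learnable 1D function on edge (i -> j) at scalar x."""
        return sum(
            c * math.cos((k + 1) * x) + s * math.sin((k + 1) * x)
            for k, (c, s) in enumerate(zip(self.cos_coef[j][i], self.sin_coef[j][i]))
        )

    def forward(self, x):
        """Sum the edge functions over inputs, one sum per output unit."""
        return [
            self.bias[j] + sum(self.edge(j, i, xi) for i, xi in enumerate(x))
            for j in range(len(self.bias))
        ]


layer = FourierKANLayer(n_in=3, n_out=2, n_freq=4)
x = [0.1, -0.5, 2.0]
y = layer.forward(x)
# Shifting every input by a full period leaves the output unchanged.
y_shifted = layer.forward([xi + 2 * math.pi for xi in x])
print(y, y_shifted)
```

One consequence of this choice of basis is that every edge function is periodic with period 2*pi, which is not the case for the spline version, so inputs would likely need to be scaled into a single period.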

  • kan-gpt

    The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling

  • - Training script

    I am currently working on training it on the WebText dataset to compare it to the original gpt2. Facing a few out-of-memory issues at the moment. Perhaps the vocab size (50257) is too large?

    I'm open to contributions and would love to hear your thoughts!

    https://github.com/AdityaNG/kan-gpt
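The vocabulary-size question can be sanity-checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement of kan-gpt. It assumes a GPT-2 small embedding width of 768, a hypothetical batch size of 8 at a context length of 1024, float32 values, and roughly (grid + k) spline coefficients per edge if the output projection were a KAN layer. Whether kan-gpt actually uses a KAN layer for that projection is not stated in the post.

```python
def linear_head_params(n_embd, vocab_size):
    """Weights in a plain linear output projection (no bias)."""
    return n_embd * vocab_size


def kan_head_params(n_embd, vocab_size, grid, k):
    """Assumed estimate: about (grid + k) spline coefficients per edge."""
    return n_embd * vocab_size * (grid + k)


def logits_bytes(batch, seq_len, vocab_size, bytes_per_value=4):
    """Memory for one logits tensor of shape (batch, seq_len, vocab_size)."""
    return batch * seq_len * vocab_size * bytes_per_value


VOCAB_SIZE = 50257  # from the post
N_EMBD = 768        # GPT-2 small embedding width
GRID, K = 5, 3      # assumed spline settings, for illustration only

linear_params = linear_head_params(N_EMBD, VOCAB_SIZE)
kan_params = kan_head_params(N_EMBD, VOCAB_SIZE, GRID, K)
logits_mem = logits_bytes(batch=8, seq_len=1024, vocab_size=VOCAB_SIZE)

print(f"linear head: {linear_params / 1e6:.1f}M parameters")
print(f"KAN head:    {kan_params / 1e6:.1f}M parameters")
print(f"logits:      {logits_mem / 2**30:.2f} GiB per forward pass")
```

Under these assumptions the logits tensor alone is on the order of gigabytes before gradients and optimizer state are counted, so the vocabulary size is at least a plausible contributor to the out-of-memory errors.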
