NeuralFlow

Visualize the intermediate output of Mistral 7B (by valine)

NeuralFlow Alternatives

Similar projects and alternatives to NeuralFlow

NOTE: The number of mentions on this list combines mentions in common posts and user-suggested alternatives. Hence, a higher number means a better NeuralFlow alternative or a higher similarity.

NeuralFlow reviews and mentions

Posts with mentions or reviews of NeuralFlow. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-19.
  • FLaNK Stack Weekly 19 Feb 2024
    50 projects | dev.to | 19 Feb 2024
  • Show HN: NeuralFlow – Visualize the intermediate output of Mistral 7B
    3 projects | news.ycombinator.com | 14 Feb 2024
    A few days ago I saw a post using NeuralFlow to help explain the repetition problem.

    https://old.reddit.com/r/LocalLLaMA/comments/1ap8mxh/what_ca...

    > I’ve done some investigation into this. In a well trained model, if you plot the intermediate output for the last token in the sequence, you see the values update gradually layer to layer. In a model that produces repeating sequences I almost always see a sudden discontinuity at some specific layer. The residual connections are basically flooding the next layer with a distribution of values outside anything else in the dataset.

    > The discontinuity is pretty classic overfitting. You’ve both trained a specific token to attend primarily to itself and also incentivized that token to be sampled more often. The result is that if that token is ever included at the end of the context the model is incentivized to repeat it again.

    ...

    > Literally just plotting the output of the layer normalized between zero and one. For one token in mistral 7B it’s a 4096 dimension tensor. Because of the residual connections if you plot that graph for every layer you get a really nice visualization.

    > Edit: Here's my visualization. It’s a simple idea but I've never personally seen it done before. AFAIK this is a somewhat novel way to look at transformer layer output.

    > Initial output: https://imgur.com/sMwEFEw

    > Over-fit output: https://imgur.com/a0obyUj

    > Second edit: Code to generate the visualization: https://github.com/valine/NeuralFlow
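The idea described in the quoted comment (scale each layer's output for the last token to the range zero to one, stack the layers, and look for a sudden jump between neighbouring layers) can be sketched roughly as follows. This is not the NeuralFlow code: the function names are hypothetical, the tiny 4-value vectors stand in for the 4096-dimensional Mistral 7B outputs, and per-layer min-max scaling is only one plausible reading of "normalized between zero and one". With a real model the per-layer vectors would have to be collected from the model's hidden states first.

```python
from typing import List, Sequence


def normalize_layer(values: Sequence[float]) -> List[float]:
    """Min-max scale one layer's output vector into [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant vector: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def build_heatmap(layer_outputs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return one normalized row per layer, stacked in layer order."""
    return [normalize_layer(layer) for layer in layer_outputs]


def layer_jumps(layer_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute change of the raw values between consecutive layers."""
    jumps = []
    for prev, curr in zip(layer_outputs, layer_outputs[1:]):
        jumps.append(sum(abs(c - p) for p, c in zip(prev, curr)) / len(curr))
    return jumps


def largest_jump_layer(layer_outputs: Sequence[Sequence[float]]) -> int:
    """Index of the layer whose output differs most from the layer before it."""
    jumps = layer_jumps(layer_outputs)
    return jumps.index(max(jumps)) + 1


# Toy stand-in data (made up): layers 0-2 drift gradually, layer 3 jumps.
toy_layers = [
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.1, 3.1],
    [0.2, 1.2, 2.2, 3.2],
    [5.0, -4.0, 9.0, -7.0],
    [5.1, -3.9, 9.1, -6.9],
]

if __name__ == "__main__":
    for row in build_heatmap(toy_layers):
        print(" ".join(f"{v:.2f}" for v in row))
    print("largest jump at layer", largest_jump_layer(toy_layers))
```

Each row of the resulting grid could be drawn as one band of an image, which gives the layer-by-layer picture the comment describes. The jump measure here is deliberately simple; a norm or cosine distance between consecutive layers would be an equally reasonable choice.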

Stats

Basic NeuralFlow repo stats
4
263
7.2
3 months ago

valine/NeuralFlow is an open source project licensed under the GNU General Public License v3.0 only, which is an OSI-approved license.

The primary programming language of NeuralFlow is Python.

