| | reweight-gpt | repeng |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 51 | 573 |
| Growth | - | 3.7% |
| Activity | 6.3 | 5.6 |
| Last commit | over 1 year ago | 4 months ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
reweight-gpt
[Research] An alternative to self-attention mechanism in GPT
Instead of self-attention, I tried generating the attention matrix directly using lateral connections among the inputs. The method is like an LSTM, but it gates all the past inputs using a separate gate for each input (and it can be parallelized). It's very easy to add the method to current GPT architectures: you just remove the attention part and replace it with learnable weights. Here is a working implementation (around 100 lines!): Code: https://github.com/hunar4321/reweight-gpt In my experience, it learns very well and can surpass the self-attention mechanism when the number of parameters is matched. (I tested it on small datasets for next-character prediction. I haven't systematically compared the two methods yet.)
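To make the idea in the post more concrete, here is a minimal NumPy sketch of one possible reading of it: each token's embedding is projected straight to one gate logit per position, the logits are causally masked and normalized, and the result is used in place of the usual query-key attention matrix. This is not the code from the linked repository. The class name `ReweightHead`, the parameters `w_gate` and `w_value`, and the choice of softmax normalization are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax; masked entries (-inf) become exactly 0."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class ReweightHead:
    """Hypothetical single head: mixing weights come from a learned
    projection of each input, not from query-key dot products."""

    def __init__(self, n_embd, head_size, block_size, rng):
        scale = 1.0 / np.sqrt(n_embd)
        # Assumed parameterization: one gate logit per position in the block.
        self.w_gate = rng.normal(0.0, scale, (n_embd, block_size))
        self.w_value = rng.normal(0.0, scale, (n_embd, head_size))

    def forward(self, x):
        # x has shape (T, n_embd) with T <= block_size.
        T = x.shape[0]
        # Row t holds the gate logits that token t assigns to positions 0..T-1.
        logits = x @ self.w_gate[:, :T]
        # Causal mask: token t may only mix positions 0..t.
        mask = np.tril(np.ones((T, T), dtype=bool))
        logits = np.where(mask, logits, -np.inf)
        weights = softmax(logits, axis=-1)
        # Weighted sum of value vectors, as in ordinary attention.
        return weights @ (x @ self.w_value), weights


rng = np.random.default_rng(0)
head = ReweightHead(n_embd=16, head_size=8, block_size=32, rng=rng)
x = rng.normal(size=(10, 16))
out, weights = head.forward(x)
```

In this sketch the whole `(T, T)` weight matrix is produced by a single matrix product, which is what makes the gating parallelizable across positions. Whether this matches the repository's exact parameterization is not confirmed by the post.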
repeng
RWKV Language Model
I'm quite interested in repeng [0] (representation engineering) for steerability of (so far transformer-based) LLMs and was wondering if anyone had tried such methods on RWKV (or Mamba, for that matter). Maybe there is some low-hanging fruit there.
[0] https://github.com/vgel/repeng/issues
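For readers unfamiliar with the term, the general idea behind steering with representation engineering can be sketched in a few lines of NumPy: estimate a direction in hidden-state space from contrasting examples, then add a scaled copy of it to the hidden states at inference time. This is a toy illustration on synthetic arrays, not repeng's API or its exact method. The helper names `steering_direction` and `steer`, the difference-of-means estimator, and the synthetic data are all assumptions.

```python
import numpy as np


def steering_direction(pos_acts, neg_acts):
    """Unit-length difference of mean hidden states between two contrast sets
    (an assumed, simple estimator; real tools may use other methods)."""
    d = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return d / np.linalg.norm(d)


def steer(hidden, direction, strength):
    """Shift every hidden state along the direction by a fixed amount."""
    return hidden + strength * direction


rng = np.random.default_rng(0)
dim = 8

# Synthetic stand-ins for hidden states collected on contrasting prompts:
# the two sets differ mainly along the first coordinate.
concept = np.zeros(dim)
concept[0] = 1.0
pos = rng.normal(size=(64, dim)) + 2.0 * concept
neg = rng.normal(size=(64, dim)) - 2.0 * concept

direction = steering_direction(pos, neg)
hidden = rng.normal(size=(5, dim))
steered = steer(hidden, direction, strength=3.0)
```

Nothing in this sketch depends on the model being a transformer. It only needs access to a hidden state that can be read and modified, which is why the question about RWKV or Mamba is a natural one.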
What are some alternatives?
ai_story_scale - The AI story scale (AISS): A human rating scale for texts written with generative language models.
RWKV-Runner - A RWKV management and startup tool, full automation, only 8MB. And provides an interface compatible with the OpenAI API. RWKV is a large language model that is fully open source and available for commercial use.
ML-foundations - Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science
Promptify - Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
numerical-linear-algebra - Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
generative-ai-for-beginners - 21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/