LongNet vs long_llama

| | LongNet | long_llama |
|---|---|---|
| Mentions | 16 | 5 |
| Stars | 652 | 1,436 |
| Growth | - | - |
| Activity | 9.0 | 7.9 |
| Latest Commit | 4 months ago | 6 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
- Stars: the number of stars a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
LongNet
- Which features do you wish were added to Character AI?
I wish they would implement this into character.ai: github.com/kyegomez/LongNet
- Why AI will not replace programmers.
- LongLlama
If you want to talk immature-looking, LongNet wouldn't even compile. That's a big oof, considering it's Python, where usually even non-working code is enough to generate bytecode. (It also has a hard-coded dtype and device.)
- An open model that beats ChatGPT. We're seeing a real shift towards open-source models that will accelerate in the coming weeks.
When will open-source LLMs start using LongNet? https://github.com/kyegomez/LongNet https://arxiv.org/abs/2307.02486
- GitHub - kyegomez/LongNet: Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
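For context on what the linked paper proposes: LongNet replaces dense attention with dilated attention, which splits the sequence into fixed-length segments and attends only to every r-th position inside each segment, mixing several (segment length, dilation) pairs so that together they cover the whole sequence. Below is a minimal single-head, single-pair sketch of the idea, written for this comparison rather than taken from the kyegomez repo; it omits the causal mask and the multi-pair mixing of the paper, and, unlike the hard-coded dtype and device the commenter above complains about, it derives both from its inputs.

```python
import torch

def dilated_attention(q, k, v, segment_len=8, dilation=2):
    """Sketch of one (segment_len, dilation) pair of LongNet-style
    dilated attention. q, k, v: (batch, seq, dim); seq must be divisible
    by segment_len. No causal mask; the full model softmax-weights the
    outputs of several (w, r) pairs so every position is covered."""
    b, n, d = q.shape
    # split the sequence into independent segments
    qs, ks, vs = (t.view(b, n // segment_len, segment_len, d) for t in (q, k, v))
    # keep every `dilation`-th position inside each segment
    idx = torch.arange(0, segment_len, dilation, device=q.device)
    qd, kd, vd = qs[:, :, idx], ks[:, :, idx], vs[:, :, idx]
    # dense attention over the sparsified segment (cost drops ~dilation^2)
    scores = qd @ kd.transpose(-2, -1) / d ** 0.5
    od = torch.softmax(scores, dim=-1) @ vd
    # scatter outputs back to their original positions; skipped positions
    # would be handled by the other (w, r) pairs in the full model
    out = torch.zeros_like(qs)
    out[:, :, idx] = od
    return out.view(b, n, d)

x = torch.randn(2, 32, 16)
print(dilated_attention(x, x, x).shape)  # torch.Size([2, 32, 16])
```

Because each segment only attends within itself after sparsification, the per-pair cost is linear in sequence length, which is what lets the paper claim scaling to a billion tokens.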
long_llama
- LongLLaMA-Instruct v1.1 32K
- How does context extension work, in simple words?
At the same time, we also have charts like this showing that models with extended context work well on contexts longer than those they were trained on: https://github.com/CStanKonrad/long_llama
- Generating "stories" with smaller models
- DeepMind: Focused Transformer: Contrastive Training for Context Scaling
LongLLaMA: extending LLaMA's context length with FoT. One of the promises of our work is that FoT can be used to fine-tune already existing large models to extend their context length. In this section, we show that this is indeed the case. We use OpenLLaMA-3B and OpenLLaMA-7B models trained for 1T tokens as starting points and fine-tune them with FoT. We show that the resulting models, which we call LongLLaMAs, are capable of extrapolating beyond their training context length (even up to 256K) and retain their performance on short-context tasks. We release the inference code on GitHub: https://github.com/CStanKonrad/long_llama and the LongLLaMA-3B checkpoint on Hugging Face: https://huggingface.co/syzymon/long_llama_3b. We note that our checkpoint is backward compatible, i.e. it can be used with any existing LLaMA inference code (both in Hugging Face and other implementations), albeit without long-context capabilities.
- LongLlama
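The abstract above notes that the LongLLaMA-3B checkpoint is backward compatible with standard LLaMA inference code. Here is a minimal loading sketch using the stock Hugging Face transformers API and the checkpoint name from the abstract; the exact kwargs the repo recommends may differ:

```python
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b",
    torch_dtype=torch.float32,
    trust_remote_code=True,  # pulls in the FoT long-context code path;
                             # omit it to load as a plain LLaMA model
)

inputs = tokenizer("My favourite animal is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

To probe the extrapolation claim from the mentions above (models working beyond their training context), a common test is passkey retrieval: hide a short key in long filler text and ask the model to recall it. A hypothetical helper along those lines, with illustrative names and filler:

```python
import random

def passkey_prompt(passkey: str, n_filler: int = 400) -> str:
    """Bury a passkey inside repetitive filler text and ask for it back.
    Scaling n_filler past the training context probes extrapolation."""
    filler = "The grass is green. The sky is blue. The sun shines. " * n_filler
    cut = random.randrange(len(filler))
    hidden = f" The pass key is {passkey}. Remember it. "
    return (filler[:cut] + hidden + filler[cut:]
            + "\nWhat is the pass key? The pass key is")

prompt = passkey_prompt("71432", n_filler=2000)
print(len(tokenizer(prompt).input_ids))  # confirm the prompt really is long
```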
What are some alternatives?
Transformer-in-Transformer - An Implementation of Transformer in Transformer in TensorFlow for image classification, attention inside local patches
unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
PromptBroker - 🦊 The ONLY AI Prompts Broker you will ever need.
a-PyTorch-Tutorial-to-Transformers - Attention Is All You Need | a PyTorch Tutorial to Transformers
swarms - Orchestrate Swarms of Agents From Any Framework Like OpenAI, Langchain, and Etc for Real World Workflow Automation. Join our Community: https://discord.gg/DbjBMJTSWD
Play-Billing-v6-For-Unity - A Plugin for Unity which implements Google Play Billing Library v6.0.1 for in app products, made (mostly) by ChatGPT and GPT-4.
nn - 🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠