LongNet VS long_llama

Compare LongNet vs long_llama and see what their differences are.

LongNet

Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (by kyegomez)
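
A minimal sketch of the dilated-attention idea the repository packages, assuming a single segment length, a single dilation rate, and plain single-head attention; the function and argument names are illustrative, not the kyegomez API:

```python
# Illustrative sketch of LongNet-style dilated attention (not the kyegomez API).
# The sequence is split into segments; inside each segment only every r-th
# position attends, cutting the per-segment cost while widening coverage.
import torch


def dilated_attention(q, k, v, segment_len=2048, dilation=4):
    """q, k, v: (batch, seq_len, dim); seq_len is assumed to be a multiple of segment_len."""
    b, n, d = q.shape
    out = torch.zeros_like(q)
    for start in range(0, n, segment_len):
        # Keep every `dilation`-th position of the current segment.
        idx = torch.arange(start, start + segment_len, dilation, device=q.device)
        qs, ks, vs = q[:, idx], k[:, idx], v[:, idx]
        attn = torch.softmax(qs @ ks.transpose(-2, -1) / d ** 0.5, dim=-1)
        out[:, idx] = attn @ vs
    # The full method mixes several (segment_len, dilation) pairs across heads
    # so every position is covered; here the skipped positions simply stay zero.
    return out


if __name__ == "__main__":
    x = torch.randn(1, 8192, 64)
    print(dilated_attention(x, x, x).shape)  # torch.Size([1, 8192, 64])
```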

long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method. (by CStanKonrad)
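
A hedged usage sketch: the checkpoint linked from this project is normally loaded through Hugging Face transformers with trust_remote_code so the FoT memory layers are active. The model id comes from the project's Hugging Face page; the exact keyword arguments are assumptions and may differ between releases.

```python
# Sketch of loading LongLLaMA with its FoT (Focused Transformer) code path.
# Model id and trust_remote_code usage follow the project's published
# checkpoint; treat the details as assumptions, not a verified recipe.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b",
    torch_dtype=torch.float32,
    trust_remote_code=True,  # pulls in the custom LongLLaMA / FoT modeling code
)

inputs = tokenizer("My favourite part of a long document is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
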
                LongNet             long_llama
Mentions        16                  5
Stars           652                 1,436
Growth          -                   -
Activity        9.0                 7.9
Last commit     4 months ago        6 months ago
Language        Python              Python
License         Apache License 2.0  Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we track.

LongNet

Posts with mentions or reviews of LongNet. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-07.

long_llama

Posts with mentions or reviews of long_llama. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-07.
  • LongLLaMA-Instruct v1.1 32K
    1 project | /r/LocalLLaMA | 11 Aug 2023
  • How context extension works in simple words?
    1 project | /r/LocalLLaMA | 10 Jul 2023
    At the same time we also have a chart like this, which shows that models with extended context work well on contexts longer than those they were trained on: https://github.com/CStanKonrad/long_llama
  • Generating "stories" with smaller models
    1 project | /r/LocalLLaMA | 10 Jul 2023
  • Deepmind: Focused Transformer: Contrastive Training for Context Scaling
    1 project | /r/LocalLLaMA | 9 Jul 2023
    LONGLLAMA: extending LLaMA's context length with FoT. One of the promises of our work is that FoT can be used to fine-tune already existing large models to extend their context length. In this section, we show that this is indeed the case. We use OpenLLaMA-3B and OpenLLaMA-7B models trained for 1T tokens as starting points and fine-tune them with FoT. We show that the resulting models, which we call LONGLLAMAs, are capable of extrapolating beyond their training context length (even up to 256K) and retain the performance on short-context tasks. We release the inference code on GitHub: https://github.com/CStanKonrad/long_llama and the LONGLLAMA-3B checkpoint on Hugging Face: https://huggingface.co/syzymon/long_llama_3b. We note that our checkpoint is backward compatible, i.e. can be used with any existing LLaMA inference code (both in Hugging Face and other implementations), albeit without long-context capabilities. (A minimal loading sketch for this backward-compatible path follows the list below.)
  • LongLlama
    2 projects | /r/LocalLLaMA | 7 Jul 2023
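
As the excerpt above notes, the released checkpoint is backward compatible with ordinary LLaMA inference code, just without the long-context behaviour. A minimal sketch of that fallback path (model id from the quote, everything else assumed):

```python
# Fallback sketch: load the LongLLaMA checkpoint as a plain LLaMA model.
# Without trust_remote_code the FoT memory layers are not used, so the model
# behaves like a standard short-context LLaMA, as the paper excerpt states.
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b")
model = LlamaForCausalLM.from_pretrained("syzymon/long_llama_3b")

inputs = tokenizer("The Focused Transformer extends context by", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```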

What are some alternatives?

When comparing LongNet and long_llama you can also consider the following projects:

Transformer-in-Transformer - An Implementation of Transformer in Transformer in TensorFlow for image classification, attention inside local patches

unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

PromptBroker - 🦊 The ONLY AI Prompts Broker you will ever need.

a-PyTorch-Tutorial-to-Transformers - Attention Is All You Need | a PyTorch Tutorial to Transformers

swarms - Orchestrate Swarms of Agents From Any Framework (OpenAI, Langchain, etc.) for Real World Workflow Automation. Join our Community: https://discord.gg/DbjBMJTSWD

Play-Billing-v6-For-Unity - A Plugin for Unity which implements Google Play Billing Library v6.0.1 for in app products, made (mostly) by ChatGPT and GPT-4.

nn - 🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), GANs (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠