LongNet vs long_llama

| | LongNet | long_llama |
|---|---|---|
| Mentions | 16 | 5 |
| Stars | 652 | 1,436 |
| Growth | - | - |
| Activity | 9.0 | 7.9 |
| Latest Commit | 4 months ago | 6 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
- Stars: the number of stars a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
LongNet
- Which features do you wish were added to Character AI?
I wish they would implement this into character.ai: github.com/kyegomez/LongNet
- Why AI will not replace programmers.
- LongLlama
If you want to talk immature-looking, LongNet wouldn't even compile. That's a big oof, considering it's Python, where usually even non-working code is enough to generate bytecode. (It also has a hard-coded dtype and device.)
- An open model that beats ChatGPT. We're seeing a real shift towards open-source models that will accelerate in the coming weeks.
When will open-source LLMs start using LongNet? https://github.com/kyegomez/LongNet https://arxiv.org/abs/2307.02486
- GitHub - kyegomez/LongNet: Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
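For context on what the linked paper proposes: LongNet replaces dense attention with dilated attention, which splits the sequence into fixed-length segments and attends only to every r-th position inside each segment, mixing several (segment length, dilation) pairs so that together they cover the whole sequence. Below is a minimal single-head, single-pair sketch of the idea, written for this comparison rather than taken from the kyegomez repo; it omits the causal mask and the multi-pair mixing of the paper, and, unlike the hard-coded dtype and device the commenter above complains about, it derives both from its inputs.

```python
import torch

def dilated_attention(q, k, v, segment_len=8, dilation=2):
    """Sketch of one (segment_len, dilation) pair of LongNet-style
    dilated attention. q, k, v: (batch, seq, dim); seq must be divisible
    by segment_len. No causal mask; the full model softmax-weights the
    outputs of several (w, r) pairs so every position is covered."""
    b, n, d = q.shape
    # split the sequence into independent segments
    qs, ks, vs = (t.view(b, n // segment_len, segment_len, d) for t in (q, k, v))
    # keep every `dilation`-th position inside each segment
    idx = torch.arange(0, segment_len, dilation, device=q.device)
    qd, kd, vd = qs[:, :, idx], ks[:, :, idx], vs[:, :, idx]
    # dense attention over the sparsified segment (cost drops ~dilation^2)
    scores = qd @ kd.transpose(-2, -1) / d ** 0.5
    od = torch.softmax(scores, dim=-1) @ vd
    # scatter outputs back to their original positions; skipped positions
    # would be handled by the other (w, r) pairs in the full model
    out = torch.zeros_like(qs)
    out[:, :, idx] = od
    return out.view(b, n, d)

x = torch.randn(2, 32, 16)
print(dilated_attention(x, x, x).shape)  # torch.Size([2, 32, 16])
```

Because each segment only attends within itself after sparsification, the per-pair cost is linear in sequence length, which is what lets the paper claim scaling to a billion tokens.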
long_llama
- LongLLaMA-Instruct v1.1 32K
- How does context extension work, in simple words?
At the same time, we also have charts like this showing that models with extended context work well on contexts longer than those they were trained on: https://github.com/CStanKonrad/long_llama
- Generating "stories" with smaller models
- DeepMind: Focused Transformer: Contrastive Training for Context Scaling
LongLLaMA: extending LLaMA's context length with FoT. One of the promises of our work is that FoT can be used to fine-tune already existing large models to extend their context length. In this section, we show that this is indeed the case. We use OpenLLaMA-3B and OpenLLaMA-7B models trained for 1T tokens as starting points and fine-tune them with FoT. We show that the resulting models, which we call LongLLaMAs, are capable of extrapolating beyond their training context length (even up to 256K) and retain their performance on short-context tasks. We release the inference code on GitHub: https://github.com/CStanKonrad/long_llama and the LongLLaMA-3B checkpoint on Hugging Face: https://huggingface.co/syzymon/long_llama_3b. We note that our checkpoint is backward compatible, i.e. it can be used with any existing LLaMA inference code (both in Hugging Face and other implementations), albeit without long-context capabilities.
- LongLlama
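The abstract above notes that the LongLLaMA-3B checkpoint is backward compatible with standard LLaMA inference code. Here is a minimal loading sketch using the stock Hugging Face transformers API and the checkpoint name from the abstract; the exact kwargs the repo recommends may differ:

```python
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_3b")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b",
    torch_dtype=torch.float32,
    trust_remote_code=True,  # pulls in the FoT long-context code path;
                             # omit it to load as a plain LLaMA model
)

inputs = tokenizer("My favourite animal is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

To probe the extrapolation claim from the mentions above (models working beyond their training context), a common test is passkey retrieval: hide a short key in long filler text and ask the model to recall it. A hypothetical helper along those lines, with illustrative names and filler:

```python
import random

def passkey_prompt(passkey: str, n_filler: int = 400) -> str:
    """Bury a passkey inside repetitive filler text and ask for it back.
    Scaling n_filler past the training context probes extrapolation."""
    filler = "The grass is green. The sky is blue. The sun shines. " * n_filler
    cut = random.randrange(len(filler))
    hidden = f" The pass key is {passkey}. Remember it. "
    return (filler[:cut] + hidden + filler[cut:]
            + "\nWhat is the pass key? The pass key is")

prompt = passkey_prompt("71432", n_filler=2000)
print(len(tokenizer(prompt).input_ids))  # confirm the prompt really is long
```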
What are some alternatives?
Transformer-in-Transformer - An Implementation of Transformer in Transformer in TensorFlow for image classification, attention inside local patches
unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
PromptBroker - 🦊 The ONLY AI Prompts Broker you will ever need.
a-PyTorch-Tutorial-to-Transformers - Attention Is All You Need | a PyTorch Tutorial to Transformers
swarms - Orchestrate Swarms of Agents From Any Framework Like OpenAI, Langchain, and Etc for Real World Workflow Automation. Join our Community: https://discord.gg/DbjBMJTSWD
Play-Billing-v6-For-Unity - A Plugin for Unity which implements Google Play Billing Library v6.0.1 for in app products, made (mostly) by ChatGPT and GPT-4.
nn - 🧑🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠