Landmark-attention Alternatives
Similar projects and alternatives to landmark-attention
-
text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.
-
landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
landmark-attention reviews and mentions
-
LLMs use a surprisingly simple mechanism to retrieve some stored knowledge
It indeed is. An attention mechanism's key and value matrices grow linearly with context length. With PagedAttention[1], we could imagine an external service providing context. The hard part is the how, of course. We can't load our entire database in every conversation, and I suspect there are also training challenges (perhaps addressed via LandmarkAttention[2] and other mechanisms to efficiently retrieve additional key-value matrices).
To sustain 20-50 tokens/sec, the fetched key-value context must arrive within 50-20 ms per token (a rough back-of-envelope sketch follows below). Pausing the autoregressive transformer when it creates a Q vector stalls the batch, so we need a way to predict queries _ahead_ of where they'd be useful.
[1] https://arxiv.org/abs/2309.06180
[2] https://arxiv.org/abs/2305.16300
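A quick back-of-envelope check of that latency budget (the token rates are just the ones quoted in the comment above; nothing here comes from the paper or its code):

```python
# Back-of-envelope latency budget for fetching external KV context while
# decoding. Token rates are illustrative, taken from the comment above.

def per_token_budget_ms(tokens_per_sec: float) -> float:
    """Time available between emitted tokens, in milliseconds."""
    return 1000.0 / tokens_per_sec

for rate in (20, 50):
    print(f"{rate} tok/s -> {per_token_budget_ms(rate):.0f} ms to fetch retrieved KV blocks")

# At 50 tok/s the budget is only ~20 ms, which is why the comment argues
# that queries should be predicted and prefetched ahead of time rather than
# fetched synchronously when the Q vector is produced.
```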
- Which are the best LLMs that can explain code?
-
Landmark Attention Oobabooga Support + GPTQ Quantized Models!
Thanks again to the team who worked on the original landmark paper for making this possible: https://github.com/epfml/landmark-attention. They made an update to the repo, and the code I wrote four days ago is now marked legacy, so I'm in the process of updating it again...
- New OpenAI update: lowered pricing and a new 16k context version of GPT-3.5
-
Context tokens are the bane of all fun.
Implementing other solutions such as Landmark Attention to allow for much larger context windows. Landmark Attention basically creates new 'landmark tokens' that represent larger chunks of input tokens, and the language model is fine-tuned so the attention layer can access the relevant landmark tokens, effectively overcoming context window issues without relying on external retrieval processes like LangChain (a toy sketch of the idea follows below).
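A minimal, hypothetical numpy sketch of that block-selection idea. This is not the epfml/landmark-attention implementation: the mean-of-keys "landmark", the fixed block size, and the hard top-k cut here are simplifications of the trained landmark tokens and grouped softmax described in the paper.

```python
import numpy as np

def landmark_block_attention(q, K, V, block_size=4, top_k=2):
    """Toy landmark-style attention: score blocks via a landmark key per
    block, then attend only to tokens inside the top-k selected blocks.

    q: (d,) query vector; K, V: (n, d) key/value matrices.
    Simplification: each block's landmark key is the mean of its keys,
    whereas the real method trains a dedicated landmark token per block.
    """
    n, d = K.shape
    n_blocks = n // block_size
    K_blocks = K[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    V_blocks = V[: n_blocks * block_size].reshape(n_blocks, block_size, d)

    landmarks = K_blocks.mean(axis=1)              # (n_blocks, d) landmark keys
    block_scores = landmarks @ q / np.sqrt(d)      # query scores each block
    chosen = np.argsort(block_scores)[-top_k:]     # keep only the top-k blocks

    K_sel = K_blocks[chosen].reshape(-1, d)        # keys of selected blocks
    V_sel = V_blocks[chosen].reshape(-1, d)
    scores = K_sel @ q / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V_sel                         # (d,) attention output

rng = np.random.default_rng(0)
n, d = 16, 8
out = landmark_block_attention(rng.normal(size=d),
                               rng.normal(size=(n, d)),
                               rng.normal(size=(n, d)))
print(out.shape)  # (8,)
```

Because only top_k blocks are materialized per query, the per-step attention cost stays roughly constant as the stored context grows, which is the property that lets the fine-tuned model reach much longer effective context lengths.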
-
"Today, the diff weights for LLaMA 7B were published which enable it to support context sizes of up to 32k"
Links: https://arxiv.org/abs/2305.16300 https://huggingface.co/epfml/landmark-attention-llama7b-wdiff https://github.com/epfml/landmark-attention
-
The weight diffs for 32K context length LLaMA 7B trained with landmark attention have been released
Paper: https://arxiv.org/abs/2305.16300
- [N] (Update: Code Released) Landmark Attention: Random-Access Infinite Context Length for Transformers
- (Code Released) Landmark Attention: Random-Access Infinite Context Length for Transformers
-
Landmark Attention: Random-Access Infinite Context Length for Transformers
The link to the repo (https://github.com/epfml/landmark-attention) leads to "we'll publish something later".
Stats
epfml/landmark-attention is an open source project licensed under Apache License 2.0, which is an OSI-approved license.
The primary programming language of landmark-attention is Python.