LLMs use a surprisingly simple mechanism to retrieve some stored knowledge

landmark-attention

Mentions: 13 | Stars: 389 | Activity: 5.4 | Language: Python

Landmark Attention: Random-Access Infinite Context Length for Transformers

It indeed is. An attention mechanism's key and value matrices grow linearly with context length. With PagedAttention[1], we could imagine an external service providing context. The hard part is the how, of course. We can't load our entire database into every conversation, and I suspect there are also training challenges (perhaps addressed via LandmarkAttention[2] and other mechanisms to efficiently retrieve additional key-value matrices).
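To make the linear growth concrete, here is a rough back-of-the-envelope sketch. The model shape (32 layers, 32 key-value heads, head dimension 128, 2-byte elements) is a hypothetical example chosen for illustration, not taken from the post or from any specific model.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate size of the key/value cache for one sequence.

    The leading factor of 2 counts one key tensor and one value tensor
    per layer. All shape parameters are illustrative assumptions.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape, for illustration only.
    shape = dict(n_layers=32, n_kv_heads=32, head_dim=128)
    for context_len in (1024, 2048, 4096):
        mib = kv_cache_bytes(context_len, **shape) / 2**20
        print(f"{context_len:>5} tokens -> {mib:.0f} MiB")
```

Doubling the context length doubles the cache, which is why loading an entire database into every conversation does not scale.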
To sustain 20-50 tokens/sec, any externally supplied context must arrive within 50-20 ms per token. Pausing the autoregressive transformer when it creates a Q vector stalls the batch, so we need a way to predict queries _ahead_ of where they'd be useful.
[1] https://arxiv.org/abs/2309.06180
[2] https://arxiv.org/abs/2305.16300
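The 50-20 ms window in the comment is simply the per-token time budget at 20-50 tokens/sec. A minimal sketch of that arithmetic (the function name is a hypothetical helper, not part of any library):

```python
def per_token_budget_ms(tokens_per_sec):
    """Time available to produce one token at a given decoding rate."""
    return 1000.0 / tokens_per_sec


if __name__ == "__main__":
    for rate in (20, 50):
        print(f"{rate} tokens/sec -> {per_token_budget_ms(rate):.0f} ms per token")
```

Any retrieval that takes longer than this budget and blocks decoding would lower throughput, which is the motivation for predicting queries ahead of time.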

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 28 Mar 2024
