[D] Theoretical papers on transformers? (or attention mechanism, or just seq2seq?)

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • x-transformers

    A simple but complete full-attention transformer with a set of promising experimental features from various papers

  • One thing I’ve looked at is that, as far as I can tell, nothing in the formulation of a transformer obviously requires W_K and W_Q to be distinct. However, if you build a transformer that merges the two matrices, it still learns, but not as well. The code and the training-loss curve are linked in the original comment; we aborted the run because of how poorly it was doing.

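One consequence of merging W_K and W_Q that is easy to check: the pre-softmax score matrix (X W)(X W)^T becomes symmetric, so token i scores token j exactly as j scores i. The sketch below is only an illustration of that property, not the commenter's code. It uses toy sizes and random weights, ignores masking, multiple heads, and positional terms, and every name in it (`attention_scores`, `random_matrix`, `is_symmetric`) is hypothetical. Whether this symmetry explains the weaker training reported above is not established here.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def attention_scores(x, w_q, w_k):
    """Pre-softmax scores (X W_Q)(X W_K)^T / sqrt(d_head)."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    scale = math.sqrt(len(w_q[0]))
    return [[v / scale for v in row] for row in matmul(q, transpose(k))]


def random_matrix(rows, cols, rng):
    """Matrix of independent standard-normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def is_symmetric(m, tol=1e-9):
    """True if m[i][j] matches m[j][i] within tol for every pair."""
    n = len(m)
    return all(abs(m[i][j] - m[j][i]) <= tol for i in range(n) for j in range(n))


rng = random.Random(0)
seq_len, d_model, d_head = 4, 6, 3  # toy sizes, chosen arbitrarily
x = random_matrix(seq_len, d_model, rng)
w_q = random_matrix(d_model, d_head, rng)
w_k = random_matrix(d_model, d_head, rng)

separate = attention_scores(x, w_q, w_k)  # standard: distinct W_Q and W_K
merged = attention_scores(x, w_q, w_q)    # merged: one matrix plays both roles

print("separate scores symmetric:", is_symmetric(separate))
print("merged scores symmetric:", is_symmetric(merged))
```

With distinct matrices the score matrix X W_Q W_K^T X^T can be an arbitrary bilinear form of rank at most d_head; with a single shared matrix it is restricted to a symmetric positive semidefinite one, so each token's score with itself is never negative.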


Related posts

  • x-transformers

    1 project | news.ycombinator.com | 31 Mar 2024
  • A single API call using almost the whole 32k context window costs around 2$.

    1 project | /r/OpenAI | 15 Mar 2023
  • GPT-4 architecture: what we can deduce from research literature

    1 project | news.ycombinator.com | 14 Mar 2023
  • You’ll be able to run chatgpt on your own device quite easily very soon

    2 projects | /r/OpenAI | 13 Mar 2023
  • The GPT Architecture, on a Napkin

    4 projects | news.ycombinator.com | 11 Dec 2022