[R] Mega: Moving Average Equipped Gated Attention. By using LSTM-style gates, Mega outperforms Transformer and S4 over Long Range Arena, NMT, ImageNet, Wikitext-103 and raw speech classification.

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • RWKV-LM

    RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

  • Thanks for the paper and the great experimental results! The findings line up with the use of "time delay" in https://github.com/BlinkDL/RWKV-LM from /u/bo_peng

  • fairseq-apollo

    FairSeq repo with Apollo optimizer

  • The code is at https://github.com/XuezheMax/fairseq-apollo and the checkpoints will be released soon.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
