Transformers from Scratch

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • scratch-www

    Standalone web client for Scratch

  • The capital "S" in "Scratch" in the title made me think that the article was about implementing transformers on https://scratch.mit.edu/ -- which would be amazing, but it's not the case.

  • workshop

  • There are a few common ways you might see this done, but they broadly work by assigning fixed or learned embeddings to each position in the input token sequence. These embeddings can be added to our matrix above so that the first row gets the embedding for the first position added to it, the second row gets the embedding for the second position, and so on. Now, if the tokens are reordered, the embedding matrix will not be the same. Alternatively, these embeddings can be concatenated horizontally to our matrix: this guarantees the positional information is kept entirely separate from the linguistic information (at the cost of a larger combined embedding that the block must support). A small sketch of both options follows at the end of this comment.

    I put together this repository at the end of last year to help visualize the internals of a transformer block applied to a toy problem: https://github.com/rstebbing/workshop/tree/main/experiments/.... It is not very long, and the point is to better distinguish the quantities you referred to by seeing them (which is possible when the embeddings are low-dimensional).

    I hope this helps!
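
    To make the add-versus-concatenate distinction concrete, here is a minimal NumPy sketch. The sizes and random embeddings are illustrative assumptions of mine, not code from the linked repository:

    ```python
    import numpy as np

    # Toy sizes (my own choice): 4 tokens, embedding dimension 8.
    seq_len, d_model = 4, 8
    rng = np.random.default_rng(0)

    tok_emb = rng.normal(size=(seq_len, d_model))  # one row per token
    pos_emb = rng.normal(size=(seq_len, d_model))  # fixed or learned, one row per position

    # Option 1: add the position embeddings row-wise. The shape is unchanged,
    # and positional information is mixed into the token embeddings.
    x_added = tok_emb + pos_emb

    # Option 2: concatenate horizontally. Positional information stays
    # entirely separate, but the block must now accept width 2 * d_model.
    x_concat = np.concatenate([tok_emb, pos_emb], axis=1)  # [seq_len, 2 * d_model]

    # Reordering the tokens now changes the representation: swapping the
    # first two tokens while keeping the position rows fixed does not give
    # back a row permutation of x_added.
    perm = np.array([1, 0, 2, 3])
    assert not np.allclose(tok_emb[perm] + pos_emb, x_added[perm])
    ```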

  • picoGPT

    An unnecessarily tiny implementation of GPT-2 in NumPy.

  • I wrote a minimal implementation in NumPy here (the forward pass code is only 40 lines): https://github.com/jaymody/picoGPT

    Note that this is for a decoder-only transformer (aka GPT), so it doesn't include the encoder part. A sketch of the core attention step follows below.
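
    For a sense of what that forward pass boils down to, here is a hedged NumPy sketch of its core piece, causal self-attention. The names, shapes, and random weights below are toy assumptions of mine, not the actual picoGPT code:

    ```python
    import numpy as np

    def softmax(x):
        # Numerically stable softmax over the last axis.
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def causal_self_attention(x, w_q, w_k, w_v):
        # x: [seq_len, d_model]; w_*: [d_model, d_head] projection matrices.
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        scores = q @ k.T / np.sqrt(k.shape[-1])           # [seq_len, seq_len]
        # Causal mask: position i may only attend to positions <= i.
        future = np.triu(np.ones_like(scores), k=1).astype(bool)
        scores = np.where(future, -1e10, scores)
        return softmax(scores) @ v                        # [seq_len, d_head]

    # Toy usage with random weights (shapes are illustrative, not GPT-2's).
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 16))
    w_q, w_k, w_v = (rng.normal(size=(16, 16)) for _ in range(3))
    out = causal_self_attention(x, w_q, w_k, w_v)
    assert out.shape == (5, 16)
    ```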

  • potatogpt

    Pure TypeScript, dependency-free, ridiculously slow implementation of GPT-2 for educational purposes

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • The Impact of API Response Time on Performance: What You Need to Know

    2 projects | dev.to | 16 May 2024
  • Ask HN: Running LLMs Locally

    2 projects | news.ycombinator.com | 15 May 2024
  • GPUsGoBurr: Get up to 2x higher performance by Tuning LLM Inference Deployment

    1 project | news.ycombinator.com | 15 May 2024
  • Show HN: Tarsier – vision for text-only LLM web agents that beats GPT-4o

    8 projects | news.ycombinator.com | 15 May 2024
  • PaliGemma: Open-Source Multimodal Model by Google

    5 projects | news.ycombinator.com | 15 May 2024