LLMs Learn to Be "Generative"

nanoGPT

69 31,914 5.4 Python

The simplest, fastest repository for training/finetuning medium-sized GPTs.

The question concerns the autoregressive factorization p(x1, x2, ..., xn) = p(x1) p(x2|x1) ... p(xn|x1, ..., x_{n-1}), where x1 denotes the 1st token, x2 the 2nd token, and so on.
I understand the conditional terms p(x_n|...), where we use cross-entropy to calculate the loss. However, I'm unsure about the probability of the very first token, p(x1). How is it calculated? Is it determined by the training configuration, the model architecture, or the loss function?
IMHO, if the model doesn't learn p(x1) properly, the factorization of the joint probability (the chain rule of probability, rather than Bayes' rule) cannot be completed, and we can't refer to LLMs as "truly generative". Am I missing something here?
I asked the same question on the nanoGPT repo: https://github.com/karpathy/nanoGPT/issues/432, but I haven't found the answer I'm looking for yet. Could someone please enlighten me?
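
One common way to make p(x1) well defined, offered here as an assumption rather than as a description of what nanoGPT does, is to prepend a start-of-sequence marker so that p(x1) becomes just another conditional, p(x1 | <bos>). The sketch below uses a toy bigram table (each token conditioned only on the previous one, a simplification of a full GPT context) with made-up probabilities; `BOS`, `COND`, `sequence_log_prob`, and `sequence_prob` are hypothetical names.

```python
import math

# Hypothetical start-of-sequence marker (an assumption, not from the post).
BOS = "<bos>"

# Toy conditional table p(next | previous) with made-up numbers.
# Each row sums to 1. The BOS row plays the role of p(x1).
COND = {
    BOS: {"a": 0.7, "b": 0.3},
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.5, "b": 0.5},
}


def sequence_log_prob(tokens):
    """Chain rule in log space: sum of log p(x_t | x_{t-1}), starting from BOS."""
    log_p = 0.0
    prev = BOS
    for tok in tokens:
        log_p += math.log(COND[prev][tok])
        prev = tok
    return log_p


def sequence_prob(tokens):
    """Joint probability of the whole sequence."""
    return math.exp(sequence_log_prob(tokens))


# Sum the joint probability over every possible length-2 sequence.
total = sum(sequence_prob([x1, x2]) for x1 in "ab" for x2 in "ab")
```

Because the `<bos>` row is itself a normalized distribution, the probabilities of all length-2 sequences sum to 1, so the factorization is complete once the first term has something to condition on.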

Related posts

Ask HN: Modern Day Equivalent to HyperCard?
5 projects | news.ycombinator.com | 1 May 2024

CommaAgents, LLM AutoGenish like system for building LLM systems
1 project | news.ycombinator.com | 1 May 2024

Monitor Postgres replication slot growth via Slack
1 project | news.ycombinator.com | 1 May 2024

Fourier Kolmogorov-Arnold Networks
1 project | news.ycombinator.com | 1 May 2024

Emulation of Nintendo Game Boy (DMG-01) (2016) [pdf]
1 project | news.ycombinator.com | 1 May 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com Post date: 4 Feb 2024
