How Chain-of-Thought Reasoning Helps Neural Networks Compute

llama

184 53,908 8.0 Python

Inference code for Llama models

This is kind of an epistemological debate at this level, and I make an effort to link to some source code [1] any time it seems contentious.
LLMs (of the decoder-only, generative-pretrained family everyone means) are next-token predictors in a literal implementation sense (there are some caveats around batching and whatnot, but none that really matter to the philosophy of the thing).
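To make "next-token predictor in a literal implementation sense" concrete, here is a minimal sketch of an autoregressive decoding loop. It is not the code behind [1]: `toy_logits`, `VOCAB`, and `generate` are hypothetical names, and the "model" is a hand-written stand-in rather than a transformer. The only point is the shape of the loop: score the sequence so far, pick one token, append it, repeat.

```python
import math
import random

# Hypothetical five-word vocabulary; id 0 plays the role of end-of-sequence.
VOCAB = ["<eos>", "the", "cat", "sat", "mat"]


def toy_logits(tokens):
    """Stand-in for a model forward pass (an assumption, not a real LLM).

    Returns one score per vocabulary entry, favoring the id that follows
    the last token and wrapping around to <eos>.
    """
    target = (tokens[-1] + 1) % len(VOCAB)
    return [4.0 if i == target else 0.0 for i in range(len(VOCAB))]


def softmax(logits, temperature=1.0):
    """Turn scores into a probability distribution over the vocabulary."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_new_tokens=8, temperature=1.0, rng=None):
    """Predict one token at a time, feeding each choice back in as input."""
    rng = rng or random.Random(0)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = toy_logits(tokens)
        if temperature == 0:
            # Greedy decoding: take the highest-scoring token.
            next_token = max(range(len(logits)), key=lambda i: logits[i])
        else:
            # Sampling: draw a token id according to the softmax probabilities.
            probs = softmax(logits, temperature)
            next_token = rng.choices(range(len(VOCAB)), weights=probs, k=1)[0]
        tokens.append(next_token)
        if next_token == 0:  # stop once <eos> is produced
            break
    return tokens


if __name__ == "__main__":
    ids = generate([1], temperature=0)
    print(" ".join(VOCAB[i] for i in ids))
```

Everything the loop "says" comes out of repeated single-token choices; whatever looks like a whole response is just the accumulated sequence.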
But they have some emergent behaviors that are a trickier beast. Probably the best way to think about a typical Instruct-inspired “chat bot” session is that the model samples, at response granularity, from a distribution that sits close (in a KL-divergence sense) to the training corpus, the same way a diffuser/U-net/de-noising model samples at the image-batch (NCHW/NHWC) level. (Sidebar: this is why shops that do and don’t train/tune on MMLU get ranked so differently there than on e.g. the arena rankings.)
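For anyone who hasn't met the KL term used above: a minimal sketch of KL divergence between two toy distributions over the same three tokens. The distributions and the names `corpus` and `model` are made up for illustration; nothing here measures a real model or a real corpus.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token distributions over a three-token vocabulary.
corpus = [0.7, 0.2, 0.1]  # what the training data would "say"
model = [0.6, 0.3, 0.1]   # what a tuned model assigns

if __name__ == "__main__":
    print(kl_divergence(corpus, model))   # small positive number: close, not identical
    print(kl_divergence(corpus, corpus))  # 0.0: identical distributions
```

A small value means sampling from `model` looks a lot like sampling from `corpus`, which is the loose sense of "adjacency" meant here. Note that KL is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.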
The corpus is stocked with everything from sci-fi novels with computers arguing their own sentience to tutorials on how to do a tricky anti-derivative step-by-step.
This mental model has adequate explanatory power for anything a public LLM has ever been shown to do, but that only heavily implies it’s what they’re doing; it doesn’t prove it.
There is active research into whether more is going on, and so far it is not conclusive enough to satisfy an unbiased consensus. I personally think that research will eventually show it’s just sampling, but that’s a prediction, not consensus science.
They might be doing more; there is some research that amounts to circumstantial evidence that they are.
[1] https://github.com/meta-llama/llama/blob/54c22c0d63a3f3c9e77...

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

OpenAI and Microsoft Azure to deprecate GPT-4 32K

1 project | news.ycombinator.com | 16 Jun 2024
Highly realistic talking head video generation

4 projects | news.ycombinator.com | 15 Jun 2024
Alga: CLI for remote controlling LG webOS TVs

1 project | news.ycombinator.com | 16 Jun 2024
NumPy 2.0.0

1 project | news.ycombinator.com | 16 Jun 2024
Super-charging Django: Tips & Tricks

1 project | dev.to | 16 Jun 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
hardware-buttons scrape-images linkedin-bot
Post date: 22 Mar 2024
