AI Language Models Are Struggling to “Get” Math

diffgeo

A formalization of synthetic differential geometry in Coq using infinitesimal analysis

> Of course computers can do arithmetic operations, but this is not the same as solving math problems, proving theorems, etc.
Computers can solve math problems and prove theorems; this remains a significant subfield of computer science with plenty of industrial use cases. However, purely machine-learning-based approaches to these problems remain subpar.
> Even mathematical objects are approximated up to an approximation error in a computer (like a differentiable manifold or a real number).
That is only because approximation caught on (and, for applications that are not computationally intensive, for purely historical reasons). For example, Mathematica has Reals, and even functionality for Reals that is literally impossible to implement for integers [1,2]. There are also precise characterizations of objects in differential geometry [3]. You could imagine applying LLMs to these kinds of programs à la Copilot, but when you do, you will find yourself agreeing with Paul Houle's observation that math is harder to fake than, e.g., art, language, or even glue code for web apps (assuming you're not on a grant that economically incentivizes you to draw the opposite conclusion, of course).
[1] https://reference.wolfram.com/language/ref/Reduce.html
[2] https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...
[3] https://github.com/bollu/diffgeo
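
To make the point about approximation concrete, here is a minimal Python sketch of the difference between a floating-point value and an exact representation. It is only an illustration, not how Mathematica or the diffgeo project work: the class QSqrt2 and the names SQRT2 and TWO are hypothetical helpers invented for this example, and they cover only numbers of the form a + b*sqrt(2) with rational a and b, not the reals in general.

```python
from dataclasses import dataclass
from fractions import Fraction
import math


@dataclass(frozen=True)
class QSqrt2:
    """Exact number a + b*sqrt(2) with rational a and b (hypothetical helper)."""
    a: Fraction
    b: Fraction

    def __mul__(self, other: "QSqrt2") -> "QSqrt2":
        # (a + b*sqrt2) * (c + d*sqrt2) = (a*c + 2*b*d) + (a*d + b*c)*sqrt2
        return QSqrt2(
            self.a * other.a + 2 * self.b * other.b,
            self.a * other.b + self.b * other.a,
        )


SQRT2 = QSqrt2(Fraction(0), Fraction(1))  # exactly sqrt(2)
TWO = QSqrt2(Fraction(2), Fraction(0))    # exactly 2

# Floating point stores sqrt(2) approximately, so squaring it misses 2.
float_square_is_two = math.sqrt(2) ** 2 == 2.0

# The symbolic representation carries no rounding error at all.
exact_square_is_two = SQRT2 * SQRT2 == TWO

print(float_square_is_two, exact_square_is_two)
```

The floating-point comparison fails while the exact one succeeds, which is the sense in which approximation is a choice of representation rather than a limit on what a computer can express.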

Bertie – A minimal, high-assurance implementation of TLS 1.3 written in hacspec
5 projects | news.ycombinator.com | 23 Mar 2024
Stalin Sort Algorithm
1 project | news.ycombinator.com | 3 Feb 2024
So you think you know C?
2 projects | news.ycombinator.com | 20 Jan 2024
Kami: A Platform for Hardware Specification and Verification
1 project | news.ycombinator.com | 28 Dec 2023
bfcoq: Brainfuck compiler in Coq
1 project | /r/programming | 4 Dec 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 12 Oct 2022

Related posts