Google Open-Sources Trillion-Parameter AI Language Model Switch Transformer

higgs-logistic-regression

2 1 3.6 Haskell

I beg to disagree.
[1] describes a whole-data-set training method (ADMM is one such method). Figure 2(b) on page 8 shows the accuracy reached after a given amount of training time. Note that ADMM starts where stochastic gradient descent stops.
[1] https://arxiv.org/pdf/1605.02026.pdf
At [2] I applied logistic regression, trained with the iteratively reweighted least squares (IRLS) algorithm, to the same Higgs boson data set. I got the same accuracy (64%) as reported in the ADMM paper with far fewer coefficients: basically just the size of the input vector + 1, instead of 300 such rows of coefficients followed by a 300x1 affine transformation. When I added the squares of the inputs (the simplest approximation of polynomial regression) and used the same IRLS algorithm, I got even better accuracy (66%) for double the number of coefficients.
[2] https://github.com/thesz/higgs-logistic-regression
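To make the idea concrete, here is a minimal pure-Python sketch of logistic regression fitted by IRLS (Newton's method on the log-likelihood), with and without squared inputs. It is not the code from [2], which is written in Haskell. The function names (`fit_irls`, `add_squares`, `accuracy`), the tiny `ridge` term added for numerical safety, and the toy data are all assumptions made for illustration, and the toy data has nothing to do with the Higgs data set.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def with_bias(row):
    """Prepend the constant 1 so the model has 'input size + 1' coefficients."""
    return [1.0] + list(row)

def add_squares(row):
    """Append the square of every input (simplest polynomial expansion)."""
    return list(row) + [v * v for v in row]

def fit_irls(xs, ys, iterations=25, ridge=1e-8):
    """Fit logistic regression on the whole data set with Newton/IRLS steps.

    ridge is an assumed tiny L2 penalty that keeps the system solvable.
    """
    rows = [with_bias(r) for r in xs]
    d = len(rows[0])
    w = [0.0] * d
    for _ in range(iterations):
        # h = X^T S X + ridge * I, g = X^T (y - p) - ridge * w
        h = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
        g = [-ridge * wi for wi in w]
        for r, y in zip(rows, ys):
            p = sigmoid(sum(wi * ri for wi, ri in zip(w, r)))
            s = p * (1.0 - p)  # the per-sample "reweighting"
            for i in range(d):
                g[i] += (y - p) * r[i]
                for j in range(d):
                    h[i][j] += s * r[i] * r[j]
        w = [wi + di for wi, di in zip(w, solve(h, g))]
    return w

def accuracy(w, xs, ys):
    """Fraction of samples whose predicted class matches the label."""
    hits = 0
    for r, y in zip(xs, ys):
        z = sum(wi * ri for wi, ri in zip(w, with_bias(r)))
        hits += int((z > 0) == (y == 1))
    return hits / len(ys)

# Hypothetical toy data: the label depends on |x|, with a few flipped labels
# so the classes are not perfectly separable.
xs = [[i * 0.25] for i in range(-12, 13)]
ys = [1 if abs(x[0]) > 1.5 else 0 for x in xs]
for i, x in enumerate(xs):
    if abs(x[0]) in (0.5, 2.0):
        ys[i] = 1 - ys[i]

w_lin = fit_irls(xs, ys)               # 1 input  -> 2 coefficients
xs_sq = [add_squares(x) for x in xs]
w_sq = fit_irls(xs_sq, ys)             # 2 inputs -> 3 coefficients
print(len(w_lin), accuracy(w_lin, xs, ys))
print(len(w_sq), accuracy(w_sq, xs_sq, ys))
```

Each iteration uses every sample, so there is no learning rate or minibatch schedule to tune. The cost is a dense solve in the number of coefficients, which is cheap for a model this small.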
There's a hypothesis [3] that SGD and Adam are considered the best optimizers because they are what everyone uses and reports on. Rarely, if ever, do you see anything different.
[3] https://parameterfree.com/2020/12/06/neural-network-maybe-ev...
So, to answer your question of "how do you know": researchers at Google cannot do IRLS (a search turns up IRLS in TensorFlow only for logistic regression), they cannot do Hessian-free optimization ([4], closed due to lack of activity; notice the "we can't support RNN due to the WHILE loop" bonanza), etc. All because they have to use TensorFlow, which simply does not support these things.
[4] https://github.com/tensorflow/tensorflow/issues/2682
I haven't seen anything about whole-data-set optimization from Google at all. That's why I (and only I, given the stance I take and the experiments I did) conclude that they do not care much about parameter efficiency.

Related posts

The Cult of the Haskell Programmer
1 project | news.ycombinator.com | 14 May 2024

Static Chess
2 projects | news.ycombinator.com | 13 May 2024

KMonad: An Advanced Keyboard Manager
1 project | news.ycombinator.com | 6 May 2024

How I switched from Stack to Cabal
2 projects | dev.to | 5 May 2024

IHP – The Haskell Framework for Non-Haskellers
1 project | news.ycombinator.com | 22 Apr 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 17 Feb 2021