Improving gradient descent convergence e.g. based on local trend of gradients?

This page summarizes the projects mentioned and recommended in the original post on /r/math

  • SGD-OGR-Hessian-estimator

    SGD (stochastic gradient descent) with OGR - online gradient regression Hessian estimator

  • It directly estimates learning rates from the local trend of gradients. For comparison, here is an analogous scenario for standard momentum methods with the maximal stable fixed learning rate: much smaller steps, and roughly 50x worse values after 30 steps: https://github.com/JarekDuda/SGD-OGR-Hessian-estimator/raw/main/momentum.png

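The core idea above can be sketched in a few lines. This is a hedged toy illustration, not the repository's actual OGR implementation: near a minimum the gradient is approximately linear in the parameters, g ≈ H (theta - theta*), so an online linear regression of each gradient coordinate on its parameter (over an exponential-moving window) estimates the curvature H, and 1/H gives a per-coordinate learning rate. All names and defaults below are hypothetical.

```python
import numpy as np

def sgd_ogr(grad, theta, steps=50, beta=0.9, eps=1e-8, lr_max=1.0):
    """Toy gradient descent with per-coordinate learning rates from an
    online linear regression of gradients against parameters.

    The regression slope of each gradient coordinate on its parameter
    estimates the local curvature H, so stepping by g/H approximates a
    diagonal Newton step.  Simplified sketch, not the repository's code.
    """
    m_t = theta.copy()           # exponential moving average of parameters
    m_g = grad(theta)            # exponential moving average of gradients
    c_tg = np.zeros_like(theta)  # EMA of centered theta-gradient products
    v_t = np.zeros_like(theta)   # EMA of centered parameter variance
    for _ in range(steps):
        g = grad(theta)
        dt, dg = theta - m_t, g - m_g
        m_t = m_t + (1 - beta) * dt
        m_g = m_g + (1 - beta) * dg
        c_tg = beta * c_tg + (1 - beta) * dt * dg
        v_t = beta * v_t + (1 - beta) * dt * dt
        h = c_tg / (v_t + eps)   # regression slope = curvature estimate
        # fall back to lr_max while the estimate is absent or unreliable
        lr = np.minimum(1.0 / np.maximum(h, eps), lr_max)
        theta = theta - lr * g
    return theta
```

On a quadratic objective with gradient `a * x` the regression recovers each curvature `a_i` exactly after a couple of steps, after which the Newton-like step `g / h` converges much faster than any single fixed learning rate that is stable for the largest curvature.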
