We implement four gradient-based methods that can escape from saddle points quickly: “Perturbed Gradient Descent”, “Perturbed Accelerated Gradient Descent”, “Faster Perturbed Gradient Descent”, and “Faster Perturbed Accelerated Gradient Descent”.
These methods are guaranteed to converge, with high probability, to an ε-second-order stationary point (a point whose gradient is small and whose Hessian has no significantly negative eigenvalue) within a number of iterations that depends on the problem dimension and the target accuracy ε.
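The shared idea behind these methods, shown here as a minimal illustrative sketch in the spirit of Jin et al. (2017) rather than the package's actual implementation (the function pgd_sketch and its defaults below are ours): whenever the gradient is small, add a small random perturbation so the iterate can slide off a saddle point, then continue with ordinary gradient steps.

pgd_sketch = function(x, grad, eta = 0.1, epsilon = 1e-2, radius = 1e-3, iters = 100) {
  for (k in 1:iters) {
    if (sqrt(sum(grad(x)^2)) <= epsilon) {
      ## near-stationary: perturb uniformly within a ball of the given radius
      d = length(x)
      dir = rnorm(d)
      x = x + radius * runif(1)^(1 / d) * dir / sqrt(sum(dir^2))
    }
    x = x - eta * grad(x)  ## ordinary gradient step
  }
  x
}

For example, starting pgd_sketch at the saddle point used in the example below moves the iterate away from the saddle and toward a nearby local minimum.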
If the devtools package is not yet installed, install it first:
install.packages('devtools')
Then run:
# install PertGD from GitHub:
devtools::install_github('HongfanChen/PertGD')
library(PertGD)
For the documentation pages:
?train_gd
?PertGD
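To browse the full help index of the package, R's built-in help() works as well:
help(package = 'PertGD')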
## Objective function: --------------------------------------------------------
obj_f = function(x){
y = 0.5 * x[1]^2 + 0.5 * sin(3 * x[1]) + x[2]^2
return(y)
}
## Gradient function: ---------------------------------------------------------
gd = function(x){
gd_vec = c(x[1] + 3 / 2 * cos(3 * x[1]), 2 * x[2])
return(gd_vec)
}
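## Optional sanity check (not part of the package; plain base R): the analytic
## gradient gd should match a central finite-difference approximation of obj_f.
x_test = c(0.3, -0.2)
h = 1e-6
fd = sapply(1:2, function(i){
  e = numeric(2); e[i] = h
  (obj_f(x_test + e) - obj_f(x_test - e)) / (2 * h)
})
max(abs(fd - gd(x_test)))  ## should be close to 0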
## start from c(0.6806, 0), approximately a strict saddle point of obj_f: -----
params1 = list(x = c(0.6806,0), eta = 0.1, theta = 0.5, epsilon = 1e-2,
radius = 1e-3, v = c(0,0), gamma = 1e-2, s = 0.1, count = 0,
t = 10, zeta = 0, z = c(1,1), x_0 = c(1,1), iter = 0, t_sub = 5,
eta_sub = 0.05)
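## Optional check of the claim above (not part of the package): at the start
## point the gradient is nearly zero and the analytic Hessian of obj_f is
## indefinite, i.e. the point is a strict saddle.
x_start = c(0.6806, 0)
sqrt(sum(gd(x_start)^2))                          ## close to 0
hess = diag(c(1 - 4.5 * sin(3 * x_start[1]), 2))  ## diagonal Hessian of obj_f
eigen(hess)$values                                ## one positive, one negative eigenvalue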
## get results: ---------------------------------------------------------------
res_gd = train_gd(vanilla_gd, params1, gd, obj_f, 20)
res_agd = train_gd(AGD, params1, gd, obj_f, 20)
res_pagd = train_gd(PAGD, params1, gd, obj_f, 20)
res_fpgd = train_gd(FPGD, params1, gd, obj_f, 20)
res_fpagd = train_gd(FPAGD, params1, gd, obj_f, 20)
- Ge R, Huang F, Jin C, et al. Escaping from saddle points—online stochastic gradient for tensor decomposition. Conference on Learning Theory (COLT), PMLR, 2015: 797-842.
- Lee J D, Simchowitz M, Jordan M I, et al. Gradient descent only converges to minimizers. Conference on Learning Theory (COLT), PMLR, 2016: 1246-1257.
- Jin C, Ge R, Netrapalli P, et al. How to escape saddle points efficiently. International Conference on Machine Learning (ICML), PMLR, 2017: 1724-1732.
- Jin C, Netrapalli P, Jordan M I. Accelerated gradient descent escapes saddle points faster than gradient descent. Conference on Learning Theory (COLT), PMLR, 2018: 1042-1085.
- Zhang C, Li T. Escape saddle points by a simple gradient-descent based algorithm. Advances in Neural Information Processing Systems (NeurIPS), 2021.
- Jin C, Netrapalli P, Ge R, et al. On nonconvex optimization for machine learning: gradients, stochasticity, and saddle points. Journal of the ACM, 2021, 68(2): 1-29.