SpiderSQN¶
Implements SpiderSQN, a stochastic quasi-Newton method that drives a damped L-BFGS direction with a SPIDER variance-reduced gradient.
SpiderSQN combines two ingredients. The gradient estimator \(v_k\) is built with the SPIDER recursion: every \(q\) steps it is refreshed with a full gradient, and in between it is updated by accumulating stochastic gradient differences, which keeps its variance small without recomputing the full gradient each step. That low-variance estimator is then turned into a search direction \(d_k = H_k v_k\), where \(H_k\) is the inverse-Hessian approximation produced by stochastic damped L-BFGS (SdLBFGS) via two-loop recursion over stored curvature pairs. The parameter step is a fixed-rate move along \(-d_k\).
where \(x\) are the parameters, \(\eta\) the step size, \(f\) the finite-sum objective with components \(f_i\), \(\xi_k\) a sampled index (or minibatch) so that \(\nabla f_{\xi_k}\) is a stochastic gradient, \(q\) the period between full-gradient refreshes, \(v_k\) the SPIDER gradient estimator, and \(H_k\) the SdLBFGS inverse-Hessian approximation applied to \(v_k\) through the L-BFGS two-loop recursion.
Reference: Qingsong Zhang, Feihu Huang, Cheng Deng, Heng Huang, "Faster Stochastic Quasi-Newton Methods", arXiv 2020. https://arxiv.org/abs/2004.06479