SRMM
Implements SRMM (Stochastic Regularized Majorization-Minimization), a majorization-minimization scheme for nonconvex stochastic optimization with weakly convex or block multi-convex surrogates.
At each step a new data point defines the instantaneous loss \(f_n(\theta)=\ell(x_n,\theta)\), for which a surrogate \(g_n\) majorizing \(f_n\) near \(\theta_{n-1}\) is chosen. SRMM maintains a running average \(\bar g_n\) of these surrogates and minimizes it under a regularizer that keeps the iterate close to the previous one. The regularizer either adds a proximal term \(\tfrac{\hat\rho}{2}\lVert\theta-\theta_{n-1}\rVert^2\) (to convexify weakly convex surrogates with \(\hat\rho>-\rho\)) or imposes a diminishing trust-region radius \(\lVert\theta-\theta_{n-1}\rVert\le c\,w_n\) for block multi-convex surrogates.
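Written out, one step of the scheme described above takes roughly the following form. This is a reconstruction from the surrounding description and symbol definitions, not a verbatim copy of the reference, so consult the paper for the exact statement and assumptions:

\[
\begin{aligned}
\bar g_n &= (1-w_n)\,\bar g_{n-1} + w_n\,g_n,\\
\theta_n &\in \operatorname*{arg\,min}_{\theta\in\Theta}\ \bigl[\,\bar g_n(\theta) + \Psi_n(\theta)\,\bigr],\\
\Psi_n(\theta) &=
\begin{cases}
\tfrac{\hat\rho}{2}\lVert\theta-\theta_{n-1}\rVert^2 & \text{(proximal regularization)},\\[4pt]
0 \ \text{if}\ \lVert\theta-\theta_{n-1}\rVert\le c\,w_n,\ +\infty \ \text{otherwise} & \text{(diminishing radius)}.
\end{cases}
\end{aligned}
\]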
Here \(g_n\) is a surrogate that majorizes the loss \(f_n\) at \(\theta_{n-1}\) up to a tolerance \(\varepsilon_n\); \(w_n\in(0,1]\) is a non-increasing averaging weight; \(\bar g_n\) is the weighted average of the surrogates \(g_1,\dots,g_n\); \(\hat\rho\) is the proximal regularization strength (\(\hat\rho>-\rho\) for weakly convex surrogates); \(\Psi_n\) is the regularizer (either the proximal term or the indicator of the diminishing-radius constraint); \(c>0\) scales the trust-region radius \(c\,w_n\); and \(\Theta\subseteq\mathbb{R}^p\) is the constraint set.
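To make the proximal variant concrete, here is a minimal pure-Python sketch on a toy streaming least-squares problem. It is not this package's API: the function name `srmm_prox_least_squares`, the quadratic (prox-linear) surrogate, the box constraint set, the weights \(w_n=n^{-0.75}\), and the value \(\hat\rho=0.5\) are all assumptions chosen for illustration. Because these surrogates are strongly convex, the lower bound on \(\hat\rho\) is not binding here.

```python
import math
import random


def srmm_prox_least_squares(samples, dim, rho_hat=0.5, L=1.0, radius=5.0, beta=0.75):
    """Toy SRMM loop with proximal regularization (illustrative, hypothetical helper).

    Loss: f_n(theta) = 0.5 * (<a_n, theta> - y_n)^2, assuming ||a_n||^2 <= L.
    Surrogate: g_n(theta) = f_n(prev) + <grad, theta - prev> + (L/2)||theta - prev||^2,
    which majorizes f_n everywhere and is tight at prev.
    Constraint set: the box [-radius, radius]^dim.
    """
    theta = [0.0] * dim
    # Every g_n equals (L/2)||theta||^2 + <u_n, theta> + const, so the averaged
    # surrogate is fully described by the averaged linear coefficient v.
    v = [0.0] * dim
    for n, (a, y) in enumerate(samples, start=1):
        w = n ** (-beta)  # w_1 = 1, non-increasing, stays in (0, 1]
        residual = sum(ai * ti for ai, ti in zip(a, theta)) - y
        # Linear coefficient of g_n: grad f_n(prev) - L * prev.
        u = [residual * ai - L * ti for ai, ti in zip(a, theta)]
        # bar g_n = (1 - w_n) * bar g_{n-1} + w_n * g_n
        v = [(1.0 - w) * vi + w * ui for vi, ui in zip(v, u)]
        # Minimize bar g_n(theta) + (rho_hat/2)||theta - prev||^2. The objective is
        # an isotropic quadratic, so the box-constrained minimizer is the clipped
        # unconstrained minimizer.
        unconstrained = [(rho_hat * ti - vi) / (L + rho_hat) for ti, vi in zip(theta, v)]
        theta = [min(radius, max(-radius, t)) for t in unconstrained]
    return theta


# Noiseless synthetic stream with unit-norm features, so L = 1 is a valid bound.
random.seed(0)
theta_true = [1.0, -2.0]
samples = []
for _ in range(5000):
    raw = [random.gauss(0.0, 1.0) for _ in theta_true]
    norm = math.sqrt(sum(r * r for r in raw))
    a = [r / norm for r in raw]
    samples.append((a, sum(ai * ti for ai, ti in zip(a, theta_true))))

theta_hat = srmm_prox_least_squares(samples, dim=2)
err = math.sqrt(sum((th - tt) ** 2 for th, tt in zip(theta_hat, theta_true)))
print("estimate:", theta_hat, "error:", err)
```

The closed-form inner step is specific to this surrogate family and constraint set. With genuinely weakly convex or block multi-convex surrogates, or a different \(\Theta\), the minimization of \(\bar g_n+\Psi_n\) would need an actual inner solver.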
Reference: Hanbaek Lyu, "Stochastic regularized majorization-minimization with weakly convex and multi-convex surrogates", arXiv 2022. https://arxiv.org/abs/2201.01652