RGrad-Avg
Implements RGrad-Avg, a gradient-averaging variant of Riemannian Gradient Descent (RGD).
RGrad-Avg lifts the Euclidean Grad-Avg scheme (a Heun-style predictor-corrector) to Riemannian submanifolds. Plain RGD takes one retracted step along the negative Riemannian gradient; RGrad-Avg instead computes a predicted point, evaluates the gradient there, and retracts along the average of the two gradients. Because the gradient at the predicted point lives in a different tangent space, it is parallel-transported back to the current point before averaging.
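For orientation, the Euclidean scheme being lifted can be sketched in a few lines. This is a minimal illustration, assuming Grad-Avg is the update recovered when the retraction is \(x + s\) and the transport is the identity (the reduction stated in this section). The names `grad_avg_step`, `Q`, and `grad_f` are hypothetical and not part of any library API.

```python
import numpy as np

def grad_avg_step(x, grad_f, gamma):
    """One Euclidean Grad-Avg (Heun-style) step; illustrative sketch only."""
    g = grad_f(x)                 # gradient at the current point
    x_pred = x - gamma * g        # predictor: plain gradient-descent step
    g_pred = grad_f(x_pred)       # gradient at the predicted point
    return x - 0.5 * gamma * (g + g_pred)  # corrector: step along the average

# Toy problem (an assumption for illustration): f(x) = 0.5 * x^T Q x
Q = np.diag([1.0, 4.0])
grad_f = lambda x: Q @ x

x_euc = np.array([1.0, 1.0])
for _ in range(100):
    x_euc = grad_avg_step(x_euc, grad_f, gamma=0.2)
```

On this quadratic the iterates contract toward the minimizer at the origin. The Riemannian variant replaces the two additions with retractions and inserts a transport before averaging.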
Let \(\mathcal{M}\) be a Riemannian submanifold, \(\mathrm{grad}\,f\) the Riemannian gradient (the Euclidean gradient orthogonally projected onto the tangent space), \(\mathrm{Re}_x\) the (exponential) retraction at \(x\), and \(P_v\) the parallel transport along the retraction curve generated by \(v\). One iteration is:
\[
\begin{aligned}
v &= -\gamma\,\mathrm{grad}\,f(x_k), \\
\bar{x}_{k+1} &= \mathrm{Re}_{x_k}(v), \\
x_{k+1} &= \mathrm{Re}_{x_k}\!\left(-\frac{\gamma}{2}\left[\mathrm{grad}\,f(x_k) + P_v^{-1}\,\mathrm{grad}\,f(\bar{x}_{k+1})\right]\right),
\end{aligned}
\]
where \(x_k \in \mathcal{M}\) is the current iterate, \(\gamma > 0\) is the step size, \(\bar{x}_{k+1}\) is the predicted point, \(\mathrm{grad}\,f(\bar{x}_{k+1})\) is the gradient at that prediction, and \(P_v^{-1}\) transports it from \(\bar{x}_{k+1}\) back to the tangent space at \(x_k\) for averaging. On \(\mathcal{M} = \mathbb{R}^n\) with \(\mathrm{Re}_x(s) = x + s\) and identity transport, this reduces to Euclidean Grad-Avg.
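The iteration can be made concrete on the unit sphere, where the exponential map and parallel transport along geodesics have closed forms. The following is a minimal NumPy sketch under stated assumptions, not the implementation documented here: the manifold (unit sphere), the helper names (`proj`, `exp_map`, `transport_back`, `rgrad_avg_step`), and the test objective \(f(x) = -x^\top A x\) are all chosen for illustration.

```python
import numpy as np

def proj(x, g):
    """Orthogonally project a Euclidean vector g onto the tangent space at x."""
    return g - np.dot(x, g) * x

def exp_map(x, v):
    """Exponential retraction on the unit sphere: follow the geodesic from x along v."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return x
    return np.cos(theta) * x + np.sin(theta) * (v / theta)

def transport_back(x, v, w):
    """Inverse parallel transport of w from T_{exp_x(v)} to T_x along the geodesic."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return proj(x, w)
    u = v / theta
    # Unit geodesic direction as seen at the predicted point.
    u_y = np.cos(theta) * u - np.sin(theta) * x
    # Rotate the component along u_y back to u; orthogonal components are unchanged.
    return w + np.dot(u_y, w) * (u - u_y)

def rgrad_avg_step(x, egrad, gamma):
    """One RGrad-Avg step on the sphere; egrad returns the Euclidean gradient."""
    g = proj(x, egrad(x))                    # Riemannian gradient at x_k
    v = -gamma * g
    x_pred = exp_map(x, v)                   # predicted point
    g_pred = proj(x_pred, egrad(x_pred))     # Riemannian gradient at the prediction
    g_back = transport_back(x, v, g_pred)    # bring it back to the tangent space at x_k
    x_new = exp_map(x, -0.5 * gamma * (g + g_back))
    return x_new / np.linalg.norm(x_new)     # renormalize to limit round-off drift

# Illustrative problem: minimize f(x) = -x^T A x, whose minimizers on the
# sphere are the unit eigenvectors for the largest eigenvalue of A.
A = np.diag([3.0, 2.0, 1.0])
egrad = lambda x: -2.0 * (A @ x)

x = np.ones(3) / np.sqrt(3.0)
for _ in range(200):
    x = rgrad_avg_step(x, egrad, gamma=0.1)
```

With this diagonal `A`, the iterates should approach the first coordinate axis, the eigenvector of the largest eigenvalue. Replacing `exp_map` with `x + v`, and `proj` and `transport_back` with identity maps, gives back the Euclidean update.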
Reference: Saugata Purkayastha, Sukannya Purkayastha, "On Riemannian Gradient Descent Algorithm using gradient averaging", OPT2025: 17th Annual Workshop on Optimization for Machine Learning, 2025. https://opt-ml.org/papers/2025/paper7.pdf