Normalized Fractional SGD (NFSGD)
Implements Normalized Fractional SGD (NFSGD), a matrix-factorization recommender solver that augments stochastic gradient descent with a fractional-order gradient and an adaptively normalized learning rate.
Matrix-factorization recommenders approximate the observed rating matrix with user and item latent feature vectors \(p\) and \(q\). The prediction error on each observed entry drives the update of both vectors. The fractional variant adds, alongside the ordinary first-order gradient, a fractional-order (\(\alpha\)) gradient term that gives the update a memory effect through the power-law factor \(p^{1-\alpha}\) (resp. \(q^{1-\alpha}\)) and the Gamma function \(\Gamma(2-\alpha)\). Setting \(\alpha=1\) makes both \(p^{1-\alpha}\) and \(\Gamma(2-\alpha)\) equal to 1, so the fractional term collapses to an ordinary first-order gradient term and the update reduces to standard SGD. The normalized paradigm replaces the fixed step size with a learning rate that is scaled adaptively by the magnitude of the latent vectors: the effective step shrinks as the feature norms grow, which stabilizes convergence across datasets and fractional orders.
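To make the role of the fractional order concrete, the sketch below evaluates the multiplicative factor \(|x|^{1-\alpha}/\Gamma(2-\alpha)\) that the fractional term attaches to each gradient component. The function name `fractional_factor` is hypothetical, and taking the absolute value before the fractional power is an assumption made here to keep the result real for negative entries; the description above does not say how negative components are handled.

```python
import math

import numpy as np


def fractional_factor(x, alpha):
    """Elementwise |x|**(1 - alpha) / Gamma(2 - alpha).

    Hypothetical helper for illustration only. The abs() is an assumption
    that keeps the fractional power real when x has negative entries.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha is expected to lie in (0, 1]")
    x = np.asarray(x, dtype=float)
    return np.abs(x) ** (1.0 - alpha) / math.gamma(2.0 - alpha)


x = np.array([0.5, 2.0, -3.0])
# At alpha = 1 the exponent is 0 and Gamma(1) = 1, so every factor is 1.
print(fractional_factor(x, 1.0))
# For alpha < 1 the factor depends on the magnitude of each component.
print(fractional_factor(x, 0.5))
```

With \(\alpha=1\) every factor equals 1, which is the numerical counterpart of the statement that the fractional term reduces to a plain gradient term.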
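For an observed rating \(c_{ui}\) with prediction error \(e_{ui}=c_{ui}-p_u^{\top}q_i\), and writing \(\odot\) for the elementwise product, the latent vectors update as
\[
p_u \leftarrow p_u + \eta\, e_{ui}\, q_i + \frac{\eta_{\mathrm{fr}}}{\Gamma(2-\alpha)}\, e_{ui}\, q_i \odot p_u^{\,1-\alpha},
\qquad
q_i \leftarrow q_i + \eta\, e_{ui}\, p_u + \frac{\eta_{\mathrm{fr}}}{\Gamma(2-\alpha)}\, e_{ui}\, p_u \odot q_i^{\,1-\alpha},
\]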
where \(p_u\) is the user latent vector, \(q_i\) the item latent vector, \(c_{ui}\) the observed rating, \(e_{ui}\) the prediction error, \(\eta\) the integer-order step size, \(\eta_{\mathrm{fr}}\) the fractional step size, \(\alpha\in(0,1]\) the fractional order, \(\Gamma(\cdot)\) the Gamma function, and the power \(p_u^{\,1-\alpha}\) is applied elementwise. In the normalized scheme the step sizes \(\eta,\eta_{\mathrm{fr}}\) are scaled by the inverse magnitude of the latent feature vectors so the learning rate adapts automatically during training.
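The per-rating update described in this section can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function name `nfsgd_step`, the default hyperparameters, and the small constant `eps` are hypothetical; the normalization divides each step size by the squared norm of the opposite latent vector (an NLMS-style choice assumed here, since the text only says the step is scaled by the inverse magnitude of the latent vectors); and the absolute value inside the fractional power is an assumption that keeps the result real.

```python
import math

import numpy as np


def nfsgd_step(p_u, q_i, c_ui, eta=0.1, eta_fr=0.1, alpha=0.8, eps=1e-8):
    """One illustrative NFSGD update for a single observed rating c_ui.

    Returns the updated (p_u, q_i) and the prediction error computed
    before the update. All defaults are hypothetical.
    """
    e_ui = c_ui - p_u @ q_i  # prediction error e_ui = c_ui - p_u^T q_i

    # Assumed normalization: divide the step by the squared norm of the
    # vector that acts as the "input" of each update (NLMS-style).
    scale_p = 1.0 / (eps + q_i @ q_i)
    scale_q = 1.0 / (eps + p_u @ p_u)

    # Fractional factors, applied elementwise. abs() is an assumption.
    gam = math.gamma(2.0 - alpha)
    frac_p = np.abs(p_u) ** (1.0 - alpha) / gam
    frac_q = np.abs(q_i) ** (1.0 - alpha) / gam

    # First-order term plus fractional-order term for each vector.
    # Both updates use the old p_u and q_i.
    grad_p = e_ui * q_i
    grad_q = e_ui * p_u
    p_new = p_u + scale_p * (eta * grad_p + eta_fr * grad_p * frac_p)
    q_new = q_i + scale_q * (eta * grad_q + eta_fr * grad_q * frac_q)
    return p_new, q_new, e_ui


# Usage: repeatedly fit one rating and watch the error shrink.
p = np.array([0.5, 0.2, 0.8])
q = np.array([0.3, 0.9, 0.4])
for step in range(5):
    p, q, err = nfsgd_step(p, q, c_ui=4.0)
    print(step, abs(err))
```

In a full solver this step would be applied over all observed \((u, i)\) pairs for several epochs. With `alpha=1.0` the fractional factors equal 1 and the step reduces to normalized SGD with combined step size `eta + eta_fr`.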
Reference: Zeshan Aslam Khan, Syed Zubair, Naveed Ishtiaq Chaudhary, Muhammad Asif Zahoor Raja, Farrukh Aslam Khan, Nadeem Iqbal, "Design of normalized fractional SGD computing paradigm for recommender systems", Neural Computing and Applications 2020. https://doi.org/10.1007/s00521-019-04562-6