The Fractional Steepest Descent Method (FSDM)

Implements the Fractional Steepest Descent Method (FSDM), a steepest-descent rule that replaces the integer-order gradient with a fractional-order derivative of the loss.

The method generalizes classic first-order steepest descent: each step searches incrementally along the negative direction of the \(v\)-order fractional derivative of a quadratic energy norm (the loss). Because a fractional derivative is nonlocal and carries long-term memory, the update accumulates a weighted history of the loss landscape rather than only its local slope. Setting \(v = 1\) recovers ordinary gradient descent, so the integer-order method is a special case. The order \(v\) controls how strongly past geometry influences each step. For suitable \(v \neq 1\), the search moves past the first-order optimal point and settles at a fractional-order extremum, a point where the \(v\)-order derivative of the loss vanishes.
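The memory effect can be made concrete with the Grünwald-Letnikov difference weights \(w_k = (-1)^k \binom{v}{k}\), which multiply the samples \(E(\theta - kh)\) in the finite-difference form of a fractional derivative. The following is a minimal Python sketch for illustration, not code from the referenced paper; the function name `gl_weights` is hypothetical.

```python
def gl_weights(v: float, n: int) -> list:
    """Return the first n (n >= 1) Grünwald-Letnikov weights w_k = (-1)^k * binom(v, k)."""
    weights = [1.0]  # w_0 = 1
    for k in range(1, n):
        # Recurrence that avoids evaluating the Gamma function directly:
        # w_k = w_{k-1} * (k - 1 - v) / k
        weights.append(weights[-1] * (k - 1 - v) / k)
    return weights


print(gl_weights(1.0, 5))  # integer order: every weight after k = 1 is zero
print(gl_weights(0.5, 5))  # fractional order: a nonzero tail that decays with k
```

At \(v = 1\) only \(w_0 = 1\) and \(w_1 = -1\) survive, which is an ordinary backward difference that sees one neighbouring sample. For a non-integer order such as \(v = 0.5\) every weight is nonzero, so all earlier samples contribute, with decreasing influence.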

\[ \theta_{t+1} = \theta_t - \eta \, D_{\theta}^{v} E \big|_{\theta = \theta_t} \]

where \(\theta\) is the parameter being optimized, \(E\) is the quadratic energy norm (squared-error loss), \(\eta\) is the learning rate (a small positive constant), \(v\) is the fractional order of differentiation, and \(D_{\theta}^{v} E\) is the \(v\)-order fractional derivative of \(E\) with respect to \(\theta\), evaluated at \(\theta_t\). The derivative is taken in the Grünwald-Letnikov or Caputo sense, both of which are defined using the Gamma function.
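The update rule can be tried on a one-parameter quadratic loss \(E = (\theta - a)^2\). The sketch below is illustrative only and is not the paper's implementation. It assumes the power rule \(D^{v}\theta^{k} = \frac{\Gamma(k+1)}{\Gamma(k+1-v)}\,\theta^{k-v}\) (the Riemann-Liouville form with lower terminal 0, which agrees with Grünwald-Letnikov for smooth functions) and \(\theta > 0\). The names `rgamma`, `frac_derivative_quadratic`, and `fsdm`, and the values of `eta` and `steps`, are hypothetical choices.

```python
import math


def rgamma(x: float) -> float:
    """Reciprocal Gamma function 1/Gamma(x), equal to zero at the poles 0, -1, -2, ..."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)


def frac_derivative_quadratic(theta: float, target: float, v: float) -> float:
    """v-order derivative of E = (theta - target)^2, valid for theta > 0.

    Assumption: E is expanded as theta^2 - 2*target*theta + target^2 and the
    power rule D^v theta^k = Gamma(k+1)/Gamma(k+1-v) * theta^(k-v) is applied
    to each term.
    """
    return (
        2.0 * rgamma(3.0 - v) * theta ** (2.0 - v)
        - 2.0 * target * rgamma(2.0 - v) * theta ** (1.0 - v)
        + target ** 2 * rgamma(1.0 - v) * theta ** (-v)
    )


def fsdm(theta0: float, target: float, v: float,
         eta: float = 0.1, steps: int = 500) -> float:
    """Iterate theta <- theta - eta * D^v E(theta) for a fixed number of steps."""
    theta = theta0
    for _ in range(steps):
        theta -= eta * frac_derivative_quadratic(theta, target, v)
    return theta


print(fsdm(5.0, 3.0, v=1.0))  # v = 1 is plain gradient descent toward target
print(fsdm(5.0, 3.0, v=0.5))  # v = 0.5 settles where D^v E = 0 instead
```

With \(v = 1\) the constant term drops out (\(1/\Gamma(0) = 0\)), the derivative reduces to \(2(\theta - a)\), and the iterate approaches \(a\). With \(v = 0.5\) under these assumptions the iterate settles at \(\theta = a\,\big[(2 - v) + \sqrt{v(2 - v)}\big]/2\), roughly \(3.55\) for \(a = 3\), which is a zero of the fractional derivative rather than the minimizer of \(E\).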

Reference: Yi-Fei Pu, Ji-Liu Zhou, Yi Zhang, Ni Zhang, Guo Huang, Patrick Siarry, "Fractional Extreme Value Adaptive Training Method: Fractional Steepest Descent Approach", IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 4, pp. 653-662, 2015. https://doi.org/10.1109/TNNLS.2013.2286175
