Fractional Order Gradient Method

Implements the Fractional Order Gradient Method, a variant of gradient descent whose search direction is a fractional-order derivative of the objective.

Replacing the integer-order gradient with a fractional-order one yields an iteration that, when it converges, settles at a fractional extreme point that generally differs from the true extreme point. The cause is that the fractional derivative is nonlocal: it depends on the lower integral terminal \(c\), the order \(\alpha\), and the starting point. To recover the true extreme point, the method applies the short-memory principle: it replaces the constant lower terminal \(c\) with the lagged iterate \(\theta_{t-K}\), so that the memory window has fixed length. Expanding the Caputo derivative as a Taylor series then gives the first update below.
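
For orientation, the expansion behind the series update can be written out as follows. This is a sketch under the usual assumptions (order \(0 < \alpha < 1\), \(\theta > c\), and \(f\) analytic so that the Taylor series converges); the operator notation \({}_{c}\mathcal{D}_{\theta}^{\alpha}\) is chosen here for illustration and may differ from the reference.

\[ {}_{c}\mathcal{D}_{\theta}^{\alpha} f(\theta) = \frac{1}{\Gamma(1-\alpha)} \int_{c}^{\theta} \frac{f^{(1)}(\tau)}{(\theta-\tau)^{\alpha}} \, d\tau = \sum_{i=1}^{+\infty} \binom{\alpha-1}{\,i-1\,} \frac{f^{(i)}(\theta)}{\Gamma(i+1-\alpha)} (\theta - c)^{\,i-\alpha}. \]

Setting \(\theta = \theta_t\) and \(c = \theta_{t-K}\) in the right-hand side gives the search direction of the series update. As a quick check, \(f(\theta) = \theta\) leaves only the \(i = 1\) term, \((\theta - c)^{1-\alpha}/\Gamma(2-\alpha)\), which is exactly the scaling factor of the leading-term rule.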

A practically computable companion rule truncates the series to its leading term, scaling the ordinary gradient by \(|\theta_t - c|^{1-\alpha}/\Gamma(2-\alpha)\), with a small \(\epsilon\) added to \(|\theta_t - c|\) so that the factor does not vanish at \(\theta_t = c\) (for \(\alpha \in (0,1)\) the exponent \(1-\alpha\) is positive). This is equivalent to a varying learning rate that tends to a constant as the iterate approaches the extreme point. As \(\alpha \to 1\), both rules reduce to ordinary gradient descent.
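
The leading-term rule can be sketched for a scalar parameter as below. This is a minimal illustration, not the implementation this page documents: the names `fogm_leading_term_step` and `minimize`, the default values, and the toy objective are all assumptions made for the example.

```python
import math


def fogm_leading_term_step(theta, grad, eta, alpha, c=0.0, eps=1e-8):
    """One leading-term update for a scalar parameter (hypothetical helper)."""
    # Scaling factor (|theta - c| + eps)^(1 - alpha) / Gamma(2 - alpha).
    scale = (abs(theta - c) + eps) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    # Same direction as gradient descent, with effective rate eta * scale.
    return theta - eta * scale * grad


def minimize(grad_fn, theta0, steps, **kwargs):
    """Iterate the update for a fixed number of steps (hypothetical helper)."""
    theta = theta0
    for _ in range(steps):
        theta = fogm_leading_term_step(theta, grad_fn(theta), **kwargs)
    return theta


# Toy objective (an assumption): f(theta) = (theta - 3)^2, gradient 2 * (theta - 3).
theta_hat = minimize(lambda th: 2.0 * (th - 3.0), theta0=0.5, steps=500,
                     eta=0.1, alpha=0.7, c=0.0)
```

Because the scaling factor multiplies the ordinary gradient, the fixed point is still the zero of the gradient. Near it, the effective learning rate approaches \(\eta\,|\theta^\ast - c|^{1-\alpha}/\Gamma(2-\alpha)\), where \(\theta^\ast\) denotes the extreme point.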

\[ \begin{aligned} \theta_{t+1} &= \theta_t - \eta \sum_{i=1}^{+\infty} \binom{\alpha-1}{\,i-1\,} \frac{f^{(i)}(\theta_t)}{\Gamma(i+1-\alpha)} (\theta_t - \theta_{t-K})^{\,i-\alpha}, \\ \theta_{t+1} &= \theta_t - \eta \, \frac{f^{(1)}(\theta_t)}{\Gamma(2-\alpha)} \left( |\theta_t - c| + \epsilon \right)^{1-\alpha}, \end{aligned} \]

where \(\theta\) are the parameters, \(\eta > 0\) the learning rate, \(\alpha \in (0,1)\) the fractional order, \(c\) the lower integral terminal, \(K \in \mathbb{Z}^+\) the fixed memory step (with lagged terminal \(\theta_{t-K}\)), \(\epsilon \ge 0\) a small constant that keeps the scaling factor nonzero at \(\theta_t = c\), \(f^{(i)}\) the \(i\)-th integer-order derivative of the objective, \(\Gamma(\cdot)\) the Gamma function, and \(\binom{\alpha-1}{\,i-1\,} = \Gamma(\alpha)/[\Gamma(i)\,\Gamma(\alpha-i+1)]\) the generalized binomial coefficient.
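
The series update can be sketched for a scalar parameter by truncating the sum to the derivatives that are supplied. Several things here are assumptions made for the example rather than part of the method as stated: the helper names `gen_binom`, `series_direction`, and `series_step`, the finite truncation, and the use of \(|\theta_t - \theta_{t-K}|\) in place of \(\theta_t - \theta_{t-K}\) so that the fractional power stays real when the iterate lies below the lagged terminal.

```python
import math


def gen_binom(a, k):
    """Generalized binomial coefficient C(a, k) for real a and integer k >= 0."""
    out = 1.0
    for j in range(1, k + 1):
        out *= (a - j + 1) / j
    return out


def series_direction(derivs, theta, theta_lag, alpha):
    """Truncated series direction; derivs[i - 1] holds f^(i)(theta).

    Assumption (not from the source): the absolute gap |theta - theta_lag|
    is used so that gap ** (i - alpha) is real for either sign of the gap.
    """
    gap = abs(theta - theta_lag)
    total = 0.0
    for i, d in enumerate(derivs, start=1):
        coeff = gen_binom(alpha - 1.0, i - 1) / math.gamma(i + 1 - alpha)
        total += coeff * d * gap ** (i - alpha)
    return total


def series_step(derivs, theta, theta_lag, eta, alpha):
    """One update theta_{t+1} = theta_t - eta * direction (hypothetical helper)."""
    return theta - eta * series_direction(derivs, theta, theta_lag, alpha)


# Example: f(theta) = theta^2 has f' = 2 * theta, f'' = 2, and zero higher
# derivatives, so two terms reproduce the full series.
theta, theta_lag, alpha = 2.0, 1.0, 0.5
direction = series_direction([2.0 * theta, 2.0], theta, theta_lag, alpha)
```

The product form of `gen_binom` avoids evaluating the Gamma function at the negative arguments that appear in \(\Gamma(\alpha - i + 1)\) for \(i \ge 2\). For a quadratic objective the truncation is exact; for general objectives, the number of derivative terms to keep is a design choice this sketch leaves to the caller.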

Reference: Yiheng Wei, Yu Kang, Weidi Yin, Yong Wang, "Generalization of the gradient method with fractional order gradient direction", Journal of the Franklin Institute 357(4), 2020. https://doi.org/10.1016/j.jfranklin.2020.01.008
