
Conformable Fractional Gradient Descent

Implements Conformable Fractional Gradient Descent, gradient descent built on the conformable fractional derivative of the loss.

The conformable fractional derivative of order \(\alpha\) is local: for \(t > 0\) it rescales the ordinary derivative by \(t^{1-\alpha}\), i.e. \(D_\alpha f(t) = t^{1-\alpha} f'(t)\). Applied to backpropagation, this replaces each classical gradient with its conformable fractional counterpart, so the partial derivative of the loss with respect to a weight is multiplied by that weight raised to the power \(1-\alpha\). Because the rescaling depends only on the current weight, it avoids the history-dependent cost of Caputo or Riemann-Liouville fractional gradients while reshaping the effective step per parameter. An optional \(L_2\) term contributes a \(\theta^{2-\alpha}\) decay under the same conformable rescaling.
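The identity \(D_\alpha f(t) = t^{1-\alpha} f'(t)\) can be checked numerically. The sketch below assumes the standard limit definition of the conformable derivative, \(D_\alpha f(t) = \lim_{\varepsilon \to 0} \bigl(f(t + \varepsilon t^{1-\alpha}) - f(t)\bigr)/\varepsilon\) for \(t > 0\), which is not spelled out on this page. The function names and the test function \(f(t) = t^2\) are hypothetical and chosen only for illustration.

```python
def conformable_derivative(f, t, alpha, eps=1e-6):
    """Finite-difference approximation of D_alpha f(t) for t > 0.

    Uses the limit definition with a small but finite eps, so the
    result is approximate.
    """
    return (f(t + eps * t ** (1.0 - alpha)) - f(t)) / eps


def rescaled_derivative(df, t, alpha):
    """Closed form t**(1 - alpha) * f'(t), given the ordinary derivative df."""
    return t ** (1.0 - alpha) * df(t)


def f(t):
    return t ** 2


def df(t):
    return 2.0 * t


t, alpha = 3.0, 0.5
limit_form = conformable_derivative(f, t, alpha)
closed_form = rescaled_derivative(df, t, alpha)

# The two values should agree up to the finite-difference error.
print(limit_form, closed_form)
```

Both expressions only need the value of \(f\) near \(t\), which is the locality the paragraph above refers to: no history of past values is stored or integrated.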

\[ \begin{aligned} g_t &= \frac{\partial E}{\partial \theta_t} \\ \theta_{t+1} &= \theta_t - \eta \, \theta_t^{\,1-\alpha} \, g_t \\ \theta_{t+1} &= \theta_t - \eta \left( \theta_t^{\,1-\alpha} \, g_t - \lambda \, \theta_t^{\,2-\alpha} \right) \quad (\text{with } L_2) \end{aligned} \]

where \(\theta\) is a weight, \(\eta\) the learning rate, \(g_t\) the ordinary gradient of the loss \(E\), \(\alpha \in (0,1]\) the conformable fractional order, and \(\lambda\) the \(L_2\) regularization coefficient. Setting \(\alpha = 1\) recovers standard gradient descent; \(\alpha = 0.5\) gives \(\theta_{t+1} = \theta_t - \eta \sqrt{\theta_t}\, g_t\), the paper's fractional Newton-type step.
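As an illustration of the unregularized update \(\theta_{t+1} = \theta_t - \eta \, \theta_t^{\,1-\alpha} \, g_t\), here is a minimal pure-Python sketch. It is not this library's implementation: `cfgd_step`, `loss_grad`, and the quadratic toy loss are hypothetical names introduced for the example. It also assumes every weight stays positive, since \(\theta^{1-\alpha}\) is not real-valued for negative \(\theta\) and non-integer exponents.

```python
def cfgd_step(theta, grad, lr, alpha):
    """One conformable fractional gradient step, applied per weight.

    Computes w - lr * w**(1 - alpha) * g for each weight w and its
    ordinary gradient g. Assumes w > 0 so the fractional power is real.
    """
    return [w - lr * (w ** (1.0 - alpha)) * g for w, g in zip(theta, grad)]


def loss_grad(theta, target):
    """Ordinary gradient of the toy loss E = 0.5 * sum((w - c)**2)."""
    return [w - c for w, c in zip(theta, target)]


target = [1.0, 2.0]   # minimizer of the toy loss
theta = [3.0, 0.5]    # positive initial weights (assumption of this sketch)

for _ in range(200):
    theta = cfgd_step(theta, loss_grad(theta, target), lr=0.1, alpha=0.5)

print(theta)  # approaches the target values

# With alpha = 1 the rescaling factor is w**0 = 1, i.e. plain gradient descent.
plain = cfgd_step([3.0], [2.0], lr=0.1, alpha=1.0)
```

In this toy run the effective step for each weight is scaled by \(\sqrt{\theta}\), so larger weights move faster than smaller ones for the same gradient. Handling zero or negative weights requires a separate convention that this sketch does not cover.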

Reference: Mohammad Rushdi Saleh, Basem Ajarmah, "Fractional Gradient Descent Learning of Backpropagation Artificial Neural Networks with Conformable Fractional Calculus", Frontiers in Artificial Intelligence and Applications, Vol. 358 (FSDM 2022). https://doi.org/10.3233/FAIA220372

