Caputo Fractional-Order Gradient Descent

Implements Caputo Fractional-Order Gradient Descent, a memory-weighted gradient descent for first-order Takagi–Sugeno neuro-fuzzy models.

Conventional integer-order gradient descent updates each fuzzy-rule parameter and network weight using only the instantaneous gradient. This method instead replaces the integer-order derivative of the error with a Caputo fractional derivative of order \(\alpha \in (0,1)\), whose non-local kernel lets past iterates influence the current step. The fractional order acts as a tunable memory: as \(\alpha \to 1\) the method approaches ordinary gradient descent, while smaller \(\alpha\) gives past weight changes more influence on the step. The authors report that this accelerates convergence and improves classification accuracy over the integer-order baseline.

To avoid the singularity that the Caputo derivative develops at a fixed lower terminal, the lower terminal is taken at the previous iterate \(\theta_{t-1}\). The resulting update for each parameter is

\[ \begin{aligned} \theta_{t+1} &= \theta_t - \frac{\eta}{\Gamma(2-\alpha)}\,\bigl(\lvert \theta_t - \theta_{t-1}\rvert + \delta\bigr)^{1-\alpha}\, g_t, \\ g_t &= \frac{\partial E}{\partial \theta}\Big|_{\theta=\theta_t}, \end{aligned} \]

where \(\theta\) is a fuzzy-rule parameter or network weight, \(\eta\) is the learning rate, \(E\) is the error (loss) function, \(g_t\) is its integer-order gradient, \(\alpha \in (0,1)\) is the fractional order, \(\Gamma\) is the gamma function, and \(\delta > 0\) is a small constant that regularizes the memory term \(\lvert \theta_t - \theta_{t-1}\rvert^{1-\alpha}\) when consecutive iterates coincide.
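The update above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the names `caputo_fgd_step` and `minimize` are hypothetical, the rule is applied elementwise to an array of parameters, and the first iteration is bootstrapped with a plain gradient step because the source does not say how \(\theta_{t-1}\) is initialized.

```python
import math

import numpy as np


def caputo_fgd_step(theta, theta_prev, grad, lr=0.1, alpha=0.9, delta=1e-8):
    """Apply one Caputo fractional-order update, elementwise.

    theta, theta_prev: current and previous iterates (same shape).
    grad: integer-order gradient dE/dtheta evaluated at theta.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    theta = np.asarray(theta, dtype=float)
    theta_prev = np.asarray(theta_prev, dtype=float)
    grad = np.asarray(grad, dtype=float)
    # Memory term (|theta_t - theta_{t-1}| + delta)^(1 - alpha).
    memory = (np.abs(theta - theta_prev) + delta) ** (1.0 - alpha)
    return theta - (lr / math.gamma(2.0 - alpha)) * memory * grad


def minimize(grad_fn, theta0, steps=200, lr=0.1, alpha=0.9, delta=1e-8):
    """Run the update repeatedly from theta0 using a gradient callable."""
    theta_prev = np.asarray(theta0, dtype=float)
    # Assumption: bootstrap with one ordinary gradient step so that two
    # distinct iterates exist before the fractional rule is applied.
    theta = theta_prev - lr * grad_fn(theta_prev)
    for _ in range(steps):
        theta_next = caputo_fgd_step(
            theta, theta_prev, grad_fn(theta), lr=lr, alpha=alpha, delta=delta
        )
        theta_prev, theta = theta, theta_next
    return theta


if __name__ == "__main__":
    # Toy loss E(theta) = 0.5 * ||theta - target||^2, so grad = theta - target.
    target = np.array([1.0, 1.0])
    result = minimize(lambda th: th - target, np.array([3.0, -2.0]))
    print("distance to target:", np.linalg.norm(result - target))
```

Note that the effective step size shrinks as consecutive iterates approach each other, because the memory term tends to \(\delta^{1-\alpha}\). In this sketch \(\delta\) therefore sets a floor on the step scale as well as guarding against a zero step.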

Reference: Junling Liu et al., "A Novel Neuro-fuzzy Learning Algorithm for First-Order Takagi–Sugeno Fuzzy Model: Caputo Fractional-Order Gradient Descent Method", International Journal of Fuzzy Systems 2024. https://doi.org/10.1007/s40815-024-01750-y
