Adaptive Terminal Caputo Fractional Gradient Descent (AT-CFGD)
Implements Adaptive Terminal Caputo Fractional Gradient Descent (AT-CFGD), a variant of gradient descent that replaces the integer-order derivative with a Caputo fractional derivative whose terminal point is recomputed at every step.
Classical fractional gradient descent with a fixed terminal point converges to a biased point rather than the true minimizer, because the fractional derivative generally does not vanish at the minimizer. AT-CFGD fixes this by tying the terminal point \(c_t\) to the current iterate through the gradient, so that the fractional operator stays consistent with the local descent direction. A \(\beta\)-weighted term of order \(1+\alpha\) adds a higher-order correction. In the multidimensional case, the univariate operator is applied coordinatewise.
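The bias of a fixed terminal point can be seen on a one-dimensional quadratic. The following is a minimal sketch, not this optimizer's implementation: it assumes the standard left-sided Caputo derivative (valid for \(x > c\)) and a hypothetical objective \(f(x) = \tfrac{1}{2}(x - a)^2\), for which the derivative has the closed form \((x-c)^{1-\alpha}\left[\frac{c-a}{\Gamma(2-\alpha)} + \frac{x-c}{\Gamma(3-\alpha)}\right]\). The function names `caputo_quadratic` and `fixed_terminal_descent` are illustrative only.

```python
import math


def caputo_quadratic(x, c, a, alpha):
    """Left-sided Caputo derivative of f(x) = 0.5 * (x - a)**2 with terminal c.

    Closed form obtained from the power rule; only valid for x > c.
    """
    u = x - c
    if u <= 0:
        raise ValueError("this sketch only covers x > c")
    return u ** (1 - alpha) * (
        (c - a) / math.gamma(2 - alpha) + u / math.gamma(3 - alpha)
    )


def fixed_terminal_descent(x0, c, a, alpha, lr=0.1, steps=500):
    """Plain descent along the Caputo derivative with a FIXED terminal c."""
    x = x0
    for _ in range(steps):
        x -= lr * caputo_quadratic(x, c, a, alpha)
    return x


# Hypothetical problem: minimizer a = 1.0, fixed terminal c = 0.0, order 0.5.
a, c, alpha = 1.0, 0.0, 0.5
x_final = fixed_terminal_descent(0.5, c, a, alpha)

# The closed form vanishes at c + (2 - alpha) * (a - c), which is 1.5 here,
# not at the true minimizer a = 1.0.
biased_point = c + (2 - alpha) * (a - c)
print(x_final, biased_point, a)
```

In this setting the iterates settle at the zero of the fractional derivative, \(c + (2-\alpha)(a-c)\), which coincides with the minimizer \(a\) only when \(\alpha = 1\) or \(c = a\). This is the gap that an adaptive terminal point is meant to close.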
Notation: \(\theta \equiv x\) denotes the parameters, \(\eta_t\) is the learning rate, \(D^{\alpha}_{c}\) is the Caputo fractional derivative of order \(\alpha \in (0,1]\) with terminal point \(c\), \(\Gamma\) is the gamma function, \(\beta\) weights the order-\((1+\alpha)\) correction, and \(\lambda_t > 0\) sets the adaptive terminal point relative to the gradient.
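For orientation, the standard left-sided Caputo derivative of order \(0 < \alpha < 1\) for \(x > c\) is recalled below. This is the textbook definition, stated here as an assumption; the referenced paper may use a variant that also covers \(x < c\).

\[
D^{\alpha}_{c} f(x) = \frac{1}{\Gamma(1-\alpha)} \int_{c}^{x} \frac{f'(\tau)}{(x-\tau)^{\alpha}} \, d\tau,
\qquad
D^{\alpha}_{c} (x-c)^{p} = \frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)} \, (x-c)^{p-\alpha} \quad (p > 0).
\]

The derivative of a constant is zero, and for sufficiently smooth \(f\) the limit \(\alpha \to 1\) recovers the ordinary derivative \(f'(x)\), which is why \(\alpha = 1\) corresponds to standard gradient descent.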
Reference: Ashwani Aggarwal, "Convergence Analysis of Fractional Gradient Descent", arXiv 2023. https://arxiv.org/abs/2311.18426