Frac-Adam

Implements Frac-Adam, a Caputo-fractional variant of Adam that injects long-term memory into the gradient signal.

The method replaces the integer-order gradient in Adam with a fractional-order derivative \(D^\alpha\) of the loss, so each moment estimate aggregates gradient history through a power-law memory kernel instead of relying on the instantaneous gradient alone. The fractional order \(\alpha \in (0,1]\) controls how strongly past gradients persist, which is intended to capture memory effects such as volatility clustering in financial time series. In practice the continuous Caputo derivative is approximated by a Grünwald–Letnikov sum truncated to a short memory window of length \(M\). The same fractional substitution defines a family of optimizers (Frac-RMSprop, Frac-SGD, Frac-Adagrad, and others); the Adam form is given below.

\[ \begin{aligned} g_t^{(\alpha)} &= \frac{1}{h^\alpha}\sum_{k=0}^{M}\omega_k(\alpha)\, g_{t-k}, \qquad \omega_k(\alpha) = (-1)^k \binom{\alpha}{k} \\ m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t^{(\alpha)} \\ v_t &= \beta_2 v_{t-1} + (1-\beta_2)\,\bigl(g_t^{(\alpha)}\bigr)^2 \\ \hat{m}_t &= \frac{m_t}{1-\beta_1^{\,t}}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^{\,t}} \\ \theta_t &= \theta_{t-1} - \eta\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon} \end{aligned} \]

where \(g_t^{(\alpha)}\) is the truncated Grünwald–Letnikov approximation of the Caputo fractional derivative \(D^\alpha\) of the gradient \(g_t = \nabla_\theta J(\theta)\), \(\alpha \in (0,1]\) is the fractional order, \(\omega_k(\alpha)\) are the fractional binomial weights, \(h\) is the discretization step size of the Grünwald–Letnikov sum (distinct from the learning rate), \(M\) is the memory-window length, \(\eta\) is the learning rate, \(\beta_1,\beta_2\) are the moment decay rates, and \(\epsilon\) is a stability constant. The Caputo derivative of order \(\alpha\) is defined as \(D_C^\alpha f(t) = \frac{1}{\Gamma(n-\alpha)}\int_0^t \frac{f^{(n)}(\tau)}{(t-\tau)^{\alpha-n+1}}\,d\tau\) for \(n-1 < \alpha < n\), so \(n = 1\) when \(0 < \alpha < 1\).
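The update above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference implementation from the paper: the names `gl_weights` and `FracAdam` are hypothetical, the default hyperparameters are placeholders, and gradients from before the first step are treated as zero. The weights are computed with the standard recurrence \(\omega_0 = 1\), \(\omega_k = \omega_{k-1}\bigl(1 - \tfrac{\alpha+1}{k}\bigr)\), which is equivalent to \((-1)^k\binom{\alpha}{k}\).

```python
import numpy as np
from collections import deque


def gl_weights(alpha, M):
    """Return w_k = (-1)^k * binom(alpha, k) for k = 0..M via the standard recurrence."""
    w = np.empty(M + 1)
    w[0] = 1.0
    for k in range(1, M + 1):
        w[k] = w[k - 1] * (1.0 - (alpha + 1.0) / k)
    return w


class FracAdam:
    """Hypothetical minimal Frac-Adam for a single NumPy parameter array."""

    def __init__(self, alpha=0.5, M=5, h=1.0, lr=1e-3,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        # All defaults are illustrative assumptions, not values from the paper.
        self.w = gl_weights(alpha, M)
        self.h_alpha = h ** alpha
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.history = deque(maxlen=M + 1)  # most recent gradient first
        self.m = None
        self.v = None
        self.t = 0

    def step(self, theta, grad):
        grad = np.asarray(grad, dtype=float)
        if self.m is None:
            self.m = np.zeros_like(grad)
            self.v = np.zeros_like(grad)
        self.history.appendleft(grad)
        # Truncated Grünwald–Letnikov sum over the stored window.
        # Assumption: gradients before the first step count as zero,
        # so early steps simply use fewer terms.
        g_frac = sum(wk * gk for wk, gk in zip(self.w, self.history)) / self.h_alpha
        self.t += 1
        # Standard Adam moments, driven by the fractional gradient.
        self.m = self.beta1 * self.m + (1 - self.beta1) * g_frac
        self.v = self.beta2 * self.v + (1 - self.beta2) * g_frac ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return theta - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

A usage loop would call `theta = opt.step(theta, grad)` once per iteration with the current gradient. For \(\alpha = 0.5\) the first weights are \(1, -0.5, -0.125, -0.0625\), which illustrates how the contribution of older gradients decays slowly rather than being cut off after one step.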

Reference: Mustapha Ez-zaiym, Yassine Senhaji, Meriem Rachid, Karim El Moutaouakil, Vasile Palade, "Fractional Optimizers for LSTM Networks in Financial Time Series Forecasting", Mathematics 2025. https://doi.org/10.3390/math13132068
