LT-ADMM
Implements LT-ADMM, a communication-efficient distributed ADMM that lets each agent run several local training steps between communication rounds.
Each agent \(i\) minimizes its local loss \(f_i\) subject to consensus with its neighbors \(\mathcal{N}_i\). Instead of solving the ADMM proximal step exactly, every round \(k\) runs \(\tau\) inner gradient steps on a local variable \(\phi_{i,k}^t\) that approximates the proximal update; the result becomes \(x_{i,k+1}\). Agents then exchange the auxiliary variables \(z_{ij}\) and update them, so communication happens only once per \(\tau\) local steps. The gradient \(g_i\) can be a plain mini-batch estimate or a SAGA-style variance-reduced estimate using a per-sample memory table \(r_{i,h,k}^t\).
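The round structure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: it uses exact local gradients, applies a single step size \(\gamma\) to both the gradient and the consensus correction \(\rho |\mathcal{N}_i| x_{i,k} - \sum_{j \in \mathcal{N}_i} z_{ij,k}\), keeps that correction frozen during the inner loop, and updates the edge variables as \(z_{ij,k+1} = \tfrac{1}{2}(z_{ij,k} - z_{ji,k} + 2\rho x_{j,k+1})\). The names `lt_admm`, `grads`, and `neighbors` are hypothetical; the exact update form should be checked against the reference.

```python
import numpy as np

def lt_admm(grads, neighbors, dim, gamma=0.1, rho=1.0, tau=5, rounds=200):
    """Sketch of LT-ADMM with exact local gradients (assumed update form).

    grads[i]     : callable returning the gradient of f_i at a point
    neighbors[i] : list of the neighbors of agent i (undirected graph)
    """
    n = len(grads)
    x = [np.zeros(dim) for _ in range(n)]
    # z[(i, j)] is the edge variable held by agent i for neighbor j.
    z = {(i, j): np.zeros(dim) for i in range(n) for j in neighbors[i]}
    for _ in range(rounds):
        # Local training: tau gradient steps per agent, no communication.
        for i in range(n):
            # Consensus correction, computed once per round from x_{i,k}.
            corr = rho * len(neighbors[i]) * x[i] - sum(z[(i, j)] for j in neighbors[i])
            phi = x[i].copy()
            for _ in range(tau):
                phi = phi - gamma * (grads[i](phi) + corr)
            x[i] = phi  # x_{i,k+1} is the last inner iterate
        # Communication: one exchange per round, then the edge update.
        z = {(i, j): 0.5 * (z[(i, j)] - z[(j, i)] + 2.0 * rho * x[j]) for (i, j) in z}
    return x

# Toy problem: f_i(x) = 0.5 * ||x - a_i||^2 on a 4-agent ring.
# The consensus minimizer is the mean of the targets a_i, here [2, 1].
targets = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, -2.0], [-1.0, 4.0]])
grads = [lambda v, a=a: v - a for a in targets]
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
x_final = lt_admm(grads, ring, dim=2)
```

In this sketch each round costs \(\tau\) gradient evaluations per agent but only one message per edge direction, which is the trade-off the method is built around.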
Notation: \(x_{i,k}\) is agent \(i\)'s model at round \(k\), \(\phi_{i,k}^t\) the local iterate at inner step \(t = 0, \dots, \tau\), \(\gamma\) the step size, \(\rho\) the ADMM penalty, \(\mathcal{N}_i\) the neighbor set with degree \(|\mathcal{N}_i|\), \(z_{ij}\) the auxiliary (edge) variables exchanged between neighbors, \(\mathcal{B}_i\) a mini-batch sampled from the \(m_i\) local samples, and \(r_{i,h,k}^t\) the SAGA memory, which holds the iterate at which \(\nabla f_{i,h}\) was last evaluated.
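The SAGA-style estimator can be illustrated as follows. This is a sketch of the standard SAGA construction, assuming the estimate is the mini-batch average of \(\nabla f_{i,h}(\phi) - \nabla f_{i,h}(r_{i,h})\) plus the average of the stored gradients over all \(m_i\) samples, and assuming the batch is sampled without replacement. Instead of storing the iterates \(r_{i,h}\) themselves, it caches the gradients evaluated at them, which is equivalent and avoids recomputation. The class name `SagaEstimator` and the callable `sample_grad` are hypothetical.

```python
import numpy as np

class SagaEstimator:
    """Variance-reduced gradient estimate with a per-sample memory table."""

    def __init__(self, sample_grad, m, x0):
        # sample_grad(h, x): gradient of the h-th local loss f_{i,h} at x
        # (hypothetical callable supplied by the user).
        self.sample_grad = sample_grad
        self.m = m
        # table[h] caches grad f_{i,h}(r_h); every r_h starts at x0.
        self.table = np.stack([sample_grad(h, x0) for h in range(m)])
        self.mean = self.table.mean(axis=0)

    def __call__(self, phi, batch):
        # batch: indices in range(m), assumed to contain no duplicates.
        fresh = np.stack([self.sample_grad(h, phi) for h in batch])
        old = self.table[batch]
        # Batch average of the correction plus the full-table average.
        g = (fresh - old).mean(axis=0) + self.mean
        # Refresh the memory for the sampled indices and keep the
        # running average of the table consistent with it.
        self.mean = self.mean + (fresh - old).sum(axis=0) / self.m
        self.table[batch] = fresh
        return g

# Toy usage: least-squares samples f_h(x) = 0.5 * (A[h] @ x - b[h])**2.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)
sample_grad = lambda h, x: A[h] * (A[h] @ x - b[h])
est = SagaEstimator(sample_grad, m=6, x0=np.zeros(3))
phi = rng.normal(size=3)
g = est(phi, [1, 4])  # estimate from a batch of two samples
```

When the batch covers all \(m_i\) samples the correction terms cancel and the estimate reduces to the exact local gradient.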
Reference: Xiaoxing Ren, Nicola Bastianello, Karl H. Johansson, Thomas Parisini, "Communication-Efficient Stochastic Distributed Learning", arXiv 2025. https://arxiv.org/abs/2501.13516