FedCET
Implements FedCET, a communication-efficient federated learning method that achieves linear convergence on heterogeneous data.
FedCET runs an adapted-gradient-tracking recursion that uses the learning rate itself as the weighting mechanism to cancel client drift, so each communication round transmits only a single model-sized vector in each direction. Between communication rounds clients perform a momentum-style local recursion driven purely by successive gradient differences; at a communication round the server averages the clients' transmitted perturbations and mixes them back in.
Writing the per-client iterate as \(\theta_i^{t}\), the local update (when \((t+1)\bmod\tau \neq 0\)) and the equivalent compact recursion with correction variable \(d^{t}\) are
where \(\theta^{t}=[\theta_1^{t},\dots,\theta_N^{t}]\) stacks the \(N\) client iterates, \(g_i^{t}=\nabla f_i(\theta_i^{t})\) with \(g^{t}\) the stacked gradients, \(\eta\) is the learning rate, \(c\) is the weighting (mixing) parameter, \(\tau\) is the local-training period, \(d^{t}\) is the drift-correction variable, and \(W^{t+1}=\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}\) at communication rounds (\(t+1=\tau k\)) and \(W^{t+1}=I\) otherwise.
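The schedule for the mixing matrix \(W^{t+1}\) can be illustrated with a minimal NumPy sketch. It covers only the choice between averaging and identity described above, not the FedCET update itself. The helper name `mixing_matrix` and the row-per-client layout of the stacked iterates are assumptions made for illustration; they are not part of this module's API.

```python
import numpy as np


def mixing_matrix(t: int, tau: int, num_clients: int) -> np.ndarray:
    """Return W^{t+1} (hypothetical helper, for illustration only).

    Uniform averaging (1/N) * 1 1^T when t + 1 is a multiple of tau,
    the identity otherwise.
    """
    if (t + 1) % tau == 0:
        return np.full((num_clients, num_clients), 1.0 / num_clients)
    return np.eye(num_clients)


# Assumed layout: one row per client, one column per model coordinate.
theta = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])
tau = 4

# t + 1 = 1 is not a multiple of tau: a local step, rows are left unchanged.
local = mixing_matrix(t=0, tau=tau, num_clients=3) @ theta

# t + 1 = 4 is a multiple of tau: a communication round, every row becomes the client mean.
mixed = mixing_matrix(t=3, tau=tau, num_clients=3) @ theta
```

With \(W^{t+1}=I\) the clients evolve independently, so no communication is required; with \(W^{t+1}=\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}\) every client ends up with the same averaged row, which corresponds to the server averaging step.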
Reference: Jie Liu, Yongqiang Wang, "Communication Efficient Federated Learning with Linear Convergence on Heterogeneous Data", arXiv 2025. https://arxiv.org/abs/2503.15804