Interleaved-ShuffleG¶
Implements Interleaved-ShuffleG, a differentially private shuffled gradient method that interleaves public and private samples within each epoch.
Each epoch fixes a without-replacement permutation \(\pi^{(s)}\) over the data and takes one noisy gradient step per sample. Privacy comes from adding Gaussian noise to every private-sample gradient; convergence is improved by mixing in public samples, whose gradients carry no noise. Of the \(n\) steps in an epoch, \(n_d\) draw from the private set and the rest from the public set, all under a single interleaved permutation. At each epoch boundary a proximal step applies the regularizer \(\psi\).
where \(\theta\) are the parameters, \(\eta\) the learning rate, \(s\) the epoch index, \(\pi^{(s)}\) the per-epoch permutation, \(d_{\pi_i^{(s)}}\) the \(i\)-th shuffled sample, \(\nabla f\) the per-sample gradient, \(\rho_i^{(s)} \sim \mathcal{N}(0, (\sigma^{(s)})^2 I_d)\) the privacy noise (zero for public samples), and \(\psi\) the regularizer applied via the proximal step.
Reference: Shuli Jiang, Pranay Sharma, Zhiwei Steven Wu, Gauri Joshi, "Improving the Convergence of Private Shuffled Gradient Methods with Public Data", arXiv 2025. https://arxiv.org/abs/2502.03652