PRISM¶
Implements PRISM, a gauge-invariant differentially private optimizer for LoRA fine-tuning.
A LoRA update \(Z = AB^\top\) is non-identifiable: many factor pairs \((A,B)\) encode the same \(Z\), so applying DP-SGD directly to the factors injects gauge-dependent noise that can be unboundedly amplified. PRISM instead clips and perturbs the intrinsic tangent update on the fixed-rank manifold \(\mathcal{M}_r\), so the released step depends only on \(Z\) and not on the chosen factorization. Per-example gradients are lifted into the tangent space, clipped by a global intrinsic norm, averaged, and perturbed with an isotropic tangent-space Gaussian; the result is passed through a gauge-invariant adaptive preconditioner whose ridge floors are scaled to the known DP noise level, then retracted back onto \(\mathcal{M}_r\).
For LoRA module \(\ell\) with frozen weight \(W_0\) and factors \((A_\ell, B_\ell)\), one PRISM step over a minibatch of \(b\) samples is
where \(P_{A_\ell,B_\ell}\) projects a full gradient \(G_{i,\ell}\) onto the rank-\(r\) tangent space \(T_{Z_\ell}\mathcal{M}_r\) (gauge invariant), \(C\) is the intrinsic clip threshold, \(\sigma\) the noise multiplier, \(\eta\) the learning rate, \(\beta_1,\beta_2\) the moment decays, \(m_\ell\) the first moment, \(V_\ell \in \mathbb{R}^{r\times r}\) the right-invariant rank-space second moment, \(d_\ell \in \{m_\ell, n_\ell\}\) the module dimension, \(\lambda_\ell\) the DP-aware ridge floor (scaled by the known noise level and module geometry, e.g. \(\lambda_{A,\ell} \asymp \tfrac{\sigma C^2}{b}\,\mathrm{tr}(N_\ell^{-1})/r\)), and \(\mathrm{Retr}_r\) the best rank-\(r\) approximation (truncated SVD) onto \(\mathcal{M}_r\). The moments and direction for \(A_\ell\) and \(B_\ell\) are computed analogously, giving \((U_{A,\ell}, U_{B,\ell})\).
Reference: Shihao Wang, Xueru Zhang, "PRISM: Gauge-Invariant Tangent-Space Differentially Private LoRA", ICML 2026. https://arxiv.org/abs/2606.00944