Newton's Method¶
Implements Newton's Method, a second-order step that minimizes the local quadratic model of the objective.
Newton's method models the objective around the current iterate with a second-order Taylor expansion and steps to the minimizer of that quadratic. Setting the gradient of the model to zero yields a step given by the inverse Hessian applied to the gradient. A step size \(\eta\) (chosen by line search, or \(\eta = 1\) for the pure method) damps the step to ensure progress when the quadratic model is inaccurate. The method exhibits local quadratic convergence near a minimizer where the Hessian is positive definite.
where \(\theta\) are the parameters, \(g_t\) is the gradient, \(H_t\) is the Hessian of the objective \(f\), \(H_t^{-1} g_t\) is the Newton direction, and \(\eta\) is the step size (equal to \(1\) for the undamped method).
Reference: J. J. Moré and D. C. Sorensen, "Newton's Method", Argonne National Laboratory technical report ANL-82-8, 1982. https://www.osti.gov/biblio/5326201