Inverse Preconditioning

1 sources - 5 claims

LLQR learns its inverse preconditioner directly under the LQR objective rather than inverting a pre-structured curvature matrix. In the practical relaxation, the exact layer updates are replaced by a preconditioned gradient computed with a block-diagonal inverse preconditioner. Because the learned inverse preconditioner is constrained by the directions spanned by the current gradients, the relaxed update can differ from the exact LQR update. LLQR refits the preconditioner periodically and passes the preconditioned gradient to SGDM or AdamW, so its overhead comes mainly from the periodic refitting rather than from applying the preconditioner.
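
The control flow described above (apply a block-diagonal inverse preconditioner per layer, refit it only every few steps, hand the result to a base optimizer) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LLQR implementation: the names apply_block_diagonal, sgdm_step, train, refit_fn and refit_every are hypothetical, each layer is treated as a flat vector with one dense block, and the fit under the LQR objective is not specified in this summary, so it is left as a caller-supplied placeholder.

```python
import numpy as np


def apply_block_diagonal(blocks, grads):
    """Apply a block-diagonal matrix to per-layer gradients.

    blocks[i] is the square block for layer i and grads[i] is that layer's
    flattened gradient, so the product never mixes different layers.
    """
    return [P @ g for P, g in zip(blocks, grads)]


def sgdm_step(params, velocity, precond_grads, lr=0.1, momentum=0.9):
    """One plain SGD-with-momentum step on already-preconditioned gradients."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, precond_grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def train(params, grad_fn, refit_fn, steps, refit_every):
    """Run `steps` updates, refitting the preconditioner every `refit_every` steps.

    refit_fn(blocks, grads) is a HYPOTHETICAL placeholder for the periodic
    refit; the actual LQR-objective fit is not reproduced here.
    """
    blocks = [np.eye(p.size) for p in params]      # assumption: start from identity
    velocity = [np.zeros_like(p) for p in params]
    refits = 0
    for t in range(steps):
        grads = grad_fn(params)
        if t % refit_every == 0:                   # the comparatively costly part, done rarely
            blocks = refit_fn(blocks, grads)
            refits += 1
        precond = apply_block_diagonal(blocks, grads)  # one matrix-vector product per layer
        params, velocity = sgdm_step(params, velocity, precond)
    return params, refits
```

In this sketch the per-step cost is one matrix-vector product per layer, while whatever refit_fn does runs only once every refit_every steps, which mirrors the claim that the overhead sits mainly in refitting. Swapping sgdm_step for an AdamW step would leave the rest of the loop unchanged.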