Thursday, May 19, 2011

L1 Regularization

Someone on MetaOptimize asked an interesting question: How to use L1 regularization with L-BFGS. One might rephrase this question to ask whether L1 regularization can be achieved with generally available optimization software. My impression of the discussion is that the best answer can be found in a paper by Mark Schmidt, Glenn Fung, and Romer Rosales.

One standard implementation of L-BFGS is SciPy.optimize.lbfgsb which allows for relatively simple inequality constraints. One of the approaches Schmidt, et al. suggest is the classic formulation of L1 regularization which imposes an L1 constraint on the parameter vector (Equation 6 in the ECML paper). Unfortunately, this cannot be used with SciPy.optimize.lbfgsb. However, by splitting each variable into positive and negative components and using the Lagrangian form, one arrives at a formulation for which the SciPy.optimize.lbfgsb implementation is well-suited. See Equation 7 of the ECML paper.