Efficiency of serial and parallel block-coordinate descent methods for minimizing composite functions

13 October 2011
Dr Peter Richt√°rik
We develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\epsilon$-accurate solution with probability at least $1-\rho$ in at most $O[(2n/\epsilon)\log(1/\rho)]$ iterations, where $n$ is the dimension of the problem. For strongly convex functions the method converges linearly. This extends recent results of Nesterov [Efficiency of coordinate descent methods on huge-scale optimization problems, CORE Discussion Paper \#2010/2], which cover the smooth case, to composite minimization, while at the same time improving the complexity by the factor of 4 and removing $\epsilon$ from the logarithm. More importantly, in contrast with the aforementioned work in which the authors achieve the results by applying their method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving true iteration complexity bounds. In the smooth case we also allow for arbitrary probability vectors and non-Euclidean norms. Time permitting, we will also mention new iteration complexity results for a parallel version of the method. \\ \\ In the second part of the talk we demonstrate numerically that the algorithm is able to solve huge-scale $\ell_1$-regularized support vector machine and least squares problems with billion variables. Finally, we present preliminary computational results with a GPU-accelerated parallel version of the method, on truss topology design instances, achieving speedups of up to two orders of magnitude when compared to a single-core implementation in C. \\ \\ References: \\ P. Richtarik and M. Takac, Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, 2011; http://arxiv.org/abs/1107.2848 \\ P. Richtarik and M. Takac, Efficient serial and parallel coordinate descent methods for huge-scale truss topology design, 2011; http://www.optimization-online.org/DB_HTML/2011/08/3118.html \\ P. Richtarik and M. Takac, Efficiency of randomized coordinate descent methods on minimization problems with a composite objective function, Proceedings of SPARS11
  • Computational Mathematics and Applications Seminar