Efficiency of serial and parallel block-coordinate descent methods for minimizing composite functions
Abstract
We develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\epsilon$-accurate solution with probability at least $1-\rho$ in at most $O[(2n/\epsilon)\log(1/\rho)]$ iterations, where $n$ is the dimension of the problem. For strongly convex functions the method converges linearly. This extends recent results of Nesterov [Efficiency of coordinate descent methods on huge-scale optimization problems, CORE Discussion Paper \#2010/2], which cover the smooth case, to composite minimization, while at the same time improving the complexity by a factor of 4 and removing $\epsilon$ from the logarithm. More importantly, in contrast with the aforementioned work, in which the authors achieve the results by applying their method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving true iteration complexity bounds. In the smooth case we also allow for arbitrary probability vectors and non-Euclidean norms. Time permitting, we will also mention new iteration complexity results for a parallel version of the method.
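For a concrete, if toy, instance of the kind of method analyzed, the following minimal sketch applies randomized coordinate descent to the $\ell_1$-regularized least squares (Lasso) problem $\min_x \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$. This is an illustration of the technique only, not the authors' code; all names below are ours.

import numpy as np

def soft_threshold(z, t):
    # Proximal step for t*|.|: the exact per-coordinate subproblem for the l1 term.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def rcdm_lasso(A, b, lam, iters=100_000, seed=0):
    # min_x 0.5*||Ax - b||^2 + lam*||x||_1, one random coordinate per iteration.
    # Assumes A has no zero columns (so every L[i] > 0).
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    r = -b.astype(float)              # r = A @ x - b, maintained incrementally
    L = (A ** 2).sum(axis=0)          # coordinate-wise Lipschitz constants ||A[:, i]||^2
    for _ in range(iters):
        i = rng.integers(n)           # uniformly random coordinate
        g = A[:, i] @ r               # partial derivative of the smooth part
        xi = soft_threshold(x[i] - g / L[i], lam / L[i])
        r += (xi - x[i]) * A[:, i]    # only column i is touched per iteration
        x[i] = xi
    return x

Each iteration accesses a single column of $A$, which is what makes coordinate descent attractive at huge scale.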
\\
\\
In the second part of the talk we demonstrate numerically that the algorithm is able to solve huge-scale $\ell_1$-regularized support vector machine and least squares problems with a billion variables. Finally, we present preliminary computational results with a GPU-accelerated parallel version of the method on truss topology design instances, achieving speedups of up to two orders of magnitude compared to a single-core implementation in C.
\\
\\
References:
\\
P. Richtarik and M. Takac, Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, 2011; http://arxiv.org/abs/1107.2848
\\
P. Richtarik and M. Takac, Efficient serial and parallel coordinate descent methods for huge-scale truss topology design, 2011; http://www.optimization-online.org/DB_HTML/2011/08/3118.html
\\
P. Richtarik and M. Takac, Efficiency of randomized coordinate descent methods on minimization problems with a composite objective function, Proceedings of SPARS11
From sparsity to block-sparsity: direct solution of linear systems of dimension 10^9
Abstract
We discuss a method for solving very large structured symmetric indefinite equation systems arising in optimization with interior point methods.
Many real-life economic models involve system dynamics, spatial distribution or uncertainty and lead to large-scale optimization problems. Such problems usually have a hidden structure: they are constructed by replication of some small generic block. The linear algebra subproblems which arise in optimization algorithms for such problems involve matrices which are not only sparse, but additionally display a block structure, with many smaller blocks sparsely distributed in the large matrix.
We have developed a structure-exploiting parallel interior point solver, OOPS (Object-Orientated Parallel Solver: http://www.maths.ed.ac.uk/~gondzio/parallel/solver.html), whose design uses object-orientated programming techniques. The solver runs, and achieves scalability, on a number of different computing platforms. We illustrate its performance on a collection of problems with sizes reaching 10^9 variables arising from asset liability management and portfolio optimization.
This is a joint work with Andreas Grothey.
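To make the structure exploitation concrete, the toy sketch below (an illustration only, not the OOPS implementation; all sizes and names are ours) solves a bordered block-diagonal ("arrowhead") system by block elimination through the Schur complement. The per-block solves are independent, which is the source of parallelism.

import numpy as np

rng = np.random.default_rng(1)
k, m, s = 4, 3, 2                      # k diagonal blocks of size m, border of size s

# Diagonal blocks D_i (symmetric positive definite here, for simplicity),
# border blocks B_i and corner block C of the system
#   [ D_i          B_i^T ] [x_i]   [f_i]
#   [ B_1 ... B_k  C     ] [ y ] = [ g ]
D = [M @ M.T + np.eye(m) for M in (rng.standard_normal((m, m)) for _ in range(k))]
B = [0.1 * rng.standard_normal((s, m)) for _ in range(k)]
C = 2.0 * np.eye(s)
f = [rng.standard_normal(m) for _ in range(k)]
g = rng.standard_normal(s)

# Independent block solves (parallelizable), then the small Schur complement
# S = C - sum_i B_i D_i^{-1} B_i^T couples everything through the border.
DinvBt = [np.linalg.solve(Di, Bi.T) for Di, Bi in zip(D, B)]
Dinvf = [np.linalg.solve(Di, fi) for Di, fi in zip(D, f)]
S = C - sum(Bi @ Z for Bi, Z in zip(B, DinvBt))
y = np.linalg.solve(S, g - sum(Bi @ z for Bi, z in zip(B, Dinvf)))
x = [z - Z @ y for z, Z in zip(Dinvf, DinvBt)]

# Sanity check against the assembled dense matrix.
top = [[D[i] if j == i else np.zeros((m, m)) for j in range(k)] + [B[i].T] for i in range(k)]
K = np.block(top + [B + [C]])
assert np.allclose(K @ np.concatenate(x + [y]), np.concatenate(f + [g]))

Only the small s-by-s Schur complement is ever treated as a coupled system; everything else factorizes block by block.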
15:45
"The Second Law of Probability: Entropy growth in the central limit process."
Abstract
The talk will explain how a geometric principle gave rise to a new variational description of information-theoretic entropy, and how this led to the solution of a problem dating back to the 1950s: whether the central limit theorem is driven by an analogue of the second law of thermodynamics.
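Assuming the problem referred to is Shannon's question on the monotonicity of entropy along the central limit theorem (resolved by Artstein, Ball, Barthe and Naor in 2004), the statement is that for i.i.d. square-integrable random variables $X_1, X_2, \ldots$ with a density,
\[
\mathrm{Ent}\!\left(\frac{X_1+\cdots+X_{n+1}}{\sqrt{n+1}}\right) \;\geq\; \mathrm{Ent}\!\left(\frac{X_1+\cdots+X_n}{\sqrt{n}}\right),
\qquad \mathrm{Ent}(X) = -\int f\log f,
\]
where $f$ is the density of $X$: entropy never decreases along the normalized sums, and the Gaussian limit maximizes it, in direct analogy with the second law.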
A high performance dual revised simplex solver
Abstract
Implementations of the revised simplex method for solving large-scale sparse linear programming (LP) problems are highly efficient on single-core architectures. This talk will discuss the limitations of the underlying techniques in the context of modern multi-core architectures, in particular with respect to memory access. Novel techniques for implementing the dual revised simplex method will be introduced, and their use in developing a dual revised simplex solver for multi-core architectures will be described.
How sharp is the restricted isometry property? An investigation into sparse approximation techniques
Abstract
15:45
The story of three polytopes and what they tell us about information acquisition
Abstract
We will examine the typical structure of random polytopes obtained by projecting the three fundamental regular polytopes: the simplex, cross-polytope, and hypercube. Along the way we will explore the implications of their structure for information acquisition and optimization. Examples of these implications include: that an N-vector with k non-zeros can be recovered, computationally efficiently, from only n random projections with n = 2e k log(N/n); or that for a surprisingly large set of optimization problems the feasible set is actually a point. These implications are driving a new signal processing paradigm, Compressed Sensing, which has already led to substantive improvements in various imaging modalities. This work is joint with David L. Donoho.
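As a quick numerical illustration of the quoted rule n = 2e k log(N/n): the formula is implicit in n, so the short sketch below (names of our choosing) solves it by fixed-point iteration.

import math

def projections_needed(k, N, iters=100):
    # Solve n = 2*e*k*log(N/n) for n by fixed-point iteration.
    n = 2 * math.e * k * math.log(N)          # initial guess
    for _ in range(iters):
        n = 2 * math.e * k * math.log(N / n)
    return math.ceil(n)

# A vector of length N = 10^6 with k = 100 non-zeros:
print(projections_needed(100, 10**6))         # -> 3135, i.e. about 3.1e3 projections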
12:00
Elliptic equations in the plane satisfying a Carleson measure condition
Abstract
We study the Neumann and regularity boundary value problems for a divergence form elliptic equation in the plane. We assume the gradient of the coefficient matrix satisfies a Carleson measure condition and consider data in L^p, 1 < p < ∞.
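For context, one standard formulation of such a condition (a Kenig-Pipher-type assumption; the precise hypothesis of the talk may differ) is that |∇A|² δ dx is a Carleson measure:
\[
\sup_{\Delta \subseteq \partial\Omega} \frac{1}{\sigma(\Delta)} \int_{T(\Delta)} |\nabla A(x)|^{2}\,\delta(x)\,dx \;<\; \infty,
\]
where A is the coefficient matrix, δ(x) is the distance from x to the boundary, Δ ranges over boundary balls, and T(Δ) is the Carleson region above Δ.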
15:45
Some results concerning the q-optimal martingale measure
Abstract
An important and challenging problem in mathematical finance is how to choose a pricing measure in an incomplete market, i.e. how to find a probability measure under which expected payoffs are calculated and fair option prices are derived under some notion of optimality.
The notion of q-optimality is linked to the unique equivalent martingale measure (EMM) with minimal q-moment (if q > 1) or minimal relative entropy (if q = 1). Hobson's (2004) approach to identifying the q-optimal measure (through a so-called fundamental equation) suggests a relaxation of an essential condition appearing in Delbaen & Schachermayer (1996). This condition states that, for the case q = 2, the Radon-Nikodym process, whose last element is the density of the candidate measure, is a uniformly integrable martingale with respect to any EMM with a bounded second moment. Hobson (2004) claims that it suffices to show that the above holds only with respect to the candidate measure itself, and extrapolates to the case q > 1. However, Cerny & Kallsen (2008) presented a counterexample (for q = 2) which demonstrates that this relaxation does not hold in general.
The speaker will present the general form of the q-optimal measure following the approach of Delbaen & Schachermayer (1994) and prove its existence under mild conditions. Moreover, in the light of the counterexample in Cerny & Kallsen (2008) concerning Hobson's (2004) approach, necessary and sufficient conditions will be presented in order to determine when a candidate measure is the q-optimal measure.
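In symbols, writing $\mathcal{M}_e$ for the set of equivalent martingale measures (notation ours), the q-optimal measure described above is
\[
Q^{(q)} = \operatorname*{arg\,min}_{Q\in\mathcal{M}_e} \mathbb{E}_P\!\left[\Big(\frac{dQ}{dP}\Big)^{q}\right] \quad (q > 1),
\qquad
Q^{(1)} = \operatorname*{arg\,min}_{Q\in\mathcal{M}_e} \mathbb{E}_P\!\left[\frac{dQ}{dP}\log\frac{dQ}{dP}\right],
\]
the q = 1 case being the minimal entropy martingale measure.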