# Past Computational Mathematics and Applications Seminar

A popular approach within the signal processing and machine learning communities consists in modelling signals as sparse linear combinations of atoms selected from a learned dictionary. While this paradigm has led to numerous empirical successes in various fields ranging from image to audio processing, there have only been a few theoretical arguments supporting these evidences. In particular, sparse coding, or sparse dictionary learning, relies on a non-convex procedure whose local minima have not been fully analyzed yet. Considering a probabilistic model of sparse signals, we show that, with high probability, sparse coding admits a local minimum around the reference dictionary generating the signals. Our study takes into account the case of over-complete dictionaries and noisy signals, thus extending previous work limited to noiseless settings and/or under-complete dictionaries. The analysis we conduct is non-asymptotic and makes it possible to understand how the key quantities of the problem, such as the coherence or the level of noise, can scale with respect to the dimension of the signals, the number of atoms, the sparsity and the number of observations.

This is joint work with Rodolphe Jenatton & Francis Bach.

The context of data assimilation in oceanography will be described as well as the computational challenges associated with it. A class of numerical linear algebra methods is described whose purpose is to exploit the problem structure in order to reduce the computational burden and provide provable convergence results for what remains a (very large) nonlinear problem. This class belongs to the Krylov-space family of methods and the special structure used is the imbalance between the dimensions of the state space and the observation space. It is also shown how inexact matrix-vector products can be exploited. Finally, preconditioning issues and resulting adaptations of the trust-region methodology for nonlinear minimization will also be outlined.

By Serge Gratton, Selime Gurol, Philippe Toint, Jean Tshimanga and Anthony Weaver.

Sparse matrix factorization involves a mix of regular and irregular computation, which is a particular challenge when trying to obtain high-performance on the highly parallel general-purpose computing cores available on graphics processing units (GPUs). We present a sparse multifrontal QR factorization method that meets this challenge, and is up to ten times faster than a highly optimized method on a multicore CPU. Our method is unique compared with prior methods, since it factorizes many frontal matrices in parallel, and keeps all the data transmitted between frontal matrices on the GPU. A novel bucket scheduler algorithm extends the communication-avoiding QR factorization for dense matrices, by exploiting more parallelism and by exploiting the staircase form present in the frontal matrices of a sparse multifrontal method.

This is joint work with Nuri Yeralan and Sanjay Ranka.

Convex regularization has become a popular approach to solve large scale inverse or data separation problems. A prominent example is the problem of identifying a sparse signal from linear samples my minimizing the l_1 norm under linear constraints. Recent empirical research indicates that many convex regularization problems on random data exhibit a phase transition phenomenon: the probability of successfully recovering a signal changes abruptly from zero to one as the number of constraints increases past a certain threshold. We present a rigorous analysis that explains why phase transitions are ubiquitous in convex optimization. It also describes tools for making reliable predictions about the quantitative aspects of the transition, including the location and the width of the transition region. These techniques apply to regularized linear inverse problems, to demixing problems, and to cone programs with random affine constraints. These applications depend on a new summary parameter, the statistical dimension of cones, that canonically extends the dimension of a linear subspace to the class of convex cones.

Joint work with Dennis Amelunxen, Mike McCoy and Joel Tropp.

In this talk, we investigate in a unified way the structural properties of a large class of convex regularizers for linear inverse problems. We consider regularizations with convex positively 1-homogenous functionals (so-called gauges) which are piecewise smooth. Singularies of such functionals are crucial to force the solution to the regularization to belong to an union of linear space of low dimension. These spaces (the so-called "models") allows one to encode many priors on the data to be recovered, conforming to some notion of simplicity/low complexity. This family of priors encompasses many special instances routinely used in regularized inverse problems such as L^1, L^1-L^2 (group sparsity), nuclear norm, or the L^infty norm. The piecewise-regular requirement is flexible enough to cope with analysis-type priors that include a pre-composition with a linear operator, such as for instance the total variation and polyhedral gauges. This notion is also stable under summation of regularizers, thus enabling to handle mixed regularizations.

The main set of contributions of this talk is dedicated to assessing the theoretical recovery performance of this class of regularizers. We provide sufficient conditions that allow to provably controlling the deviation of the recovered solution from the true underlying object, as a function of the noise level. More precisely we establish two main results. The first one ensures that the solution to the inverse problem is unique and lives on the same low dimensional sub-space as the true vector to recover, with the proviso that the minimal signal to noise ratio is large enough. This extends previous results well-known for the L^1 norm [1], analysis L^1 semi-norm [2], and the nuclear norm [3] to the general class of piecewise smooth gauges. In the second result, we establish L^2 stability by showing that the L^2 distance between the recovered and true vectors is within a factor of the noise level, thus extending results that hold for coercive convex positively 1-homogenous functionals [4].

This is a joint work with S. Vaiter, C. Deledalle, M. Golbabaee and J. Fadili. For more details, see [5].

Bibliography:

[1] J.J. Fuchs, On sparse representations in arbitrary
redundant bases. IEEE Transactions on Information Theory, 50(6):1341-1344,
2004.

[2] S. Vaiter, G. Peyré, C. Dossal, J. Fadili, Robust
Sparse Analysis Regularization, to appear in IEEE Transactions on Information
Theory, 2013.

[3] F. Bach, Consistency of trace norm minimization,
Journal of Machine Learning Research, 9, 1019-1048, 2008.

[4] M. Grasmair, Linear convergence rates for Tikhonov
regularization with positively homogeneous functionals. Inverse Problems,
27(7):075014, 2011.

[5] S. Vaiter, M. Golbabaee, J. Fadili, G. Peyré, Model
Selection with Piecewise Regular Gauges, Preprint hal-00842603, 2013