The restricted isometry property is arguably the most prominent tool in the theory of compressive sensing. In its classical version, it features l_2 norms as inner and outer norms. The modified version considered in this talk features the l_1 norm as the inner norm, while the outer norm depends a priori on the distribution of the random entries populating the measurement matrix. The modified version holds for a wider class of random matrices and still accounts for the success of sparse recovery via basis pursuit and via iterative hard thresholding. In the special case of Gaussian matrices, the outer norm actually reduces to an l_2 norm. This fact allows one to retrieve results from the theory of one-bit compressive sensing in a very simple way. Extensions to one-bit matrix recovery are then straightforward.

# Past Numerical Analysis Group Internal Seminar

Forecasting a diagnosis of Alzheimer’s disease is a promising means of selection for clinical trials of Alzheimer’s disease therapies. A positive PET scan is commonly used as part of the inclusion criteria for clinical trials, but PET imaging is expensive, so when a positive scan is one of the trial inclusion criteria it is desirable to avoid screening failures. In this talk I will describe a scheme for pre-selecting participants using statistical learning methods, and investigate how brain regions change as the disease progresses. As a means of generating features I apply the Chen path signature. This is a systematic way of providing feature sets for multimodal data that can probe the nonlinear interactions in the data as an extension of the usual linear features. While it can easily perform a traditional analysis, it can also probe second and higher order events for their predictive value. Combined with Lasso regularisation one can auto detect situations where the observed data has nonlinear information.

The latest CPUs by Intel and ARM support vectorised operations, where a single set of instructions (e.g. add, multiple, bit shift, XOR, etc.) are performed in parallel for small batches of data. This can provide great performance improvements if each parallel instruction performs the same operation, but carries the risk of performance loss if each needs to perform different tasks (e.g. if else conditions). I will present the work I have done so far looking into how to recover the full performance of the hardware, and some of the challenges faced when trading off between ever larger parallel tasks, risks of tasks diverging, and how certain coding styles might be modified for memory bandwidth limited applications. Examples will be taken from finance and Monte Carlo applications, inspecting some standard maths library functions and possibly random number generation.

Low-rank plus sparse matrices arise in many data-oriented applications, most notably in a foreground-background separation from a moving camera. It is known that low-rank matrix recovery from a few entries (low-rank matrix completion) requires low coherence (Candes et al 2009) as in the extreme cases when the low-rank matrix is also sparse, where matrix completion can miss information and be unrecoverable. However, the requirement of low coherence does not suffice in the low-rank plus sparse model, as the set of low-rank plus sparse matrices is not closed. We will discuss the relation of non-closedness of the low-rank plus sparse model to the notion of matrix rigidity function in complexity theory.

The object of this talk is a class of generalised Newtonian fluids with implicit constitutive law.

Both in the steady and the unsteady case, existence of weak solutions was proven by Bul\'\i{}\v{c}ek et al. (2009, 2012) and the main challenge is the small growth exponent qq and the implicit law.

I will discuss the application of a splitting and regularising strategy to show convergence of FEM approximations to weak solutions of the flow.

In the steady case this allows to cover the full range of growth exponents and thus generalises existing work of Diening et al. (2013). If time permits, I will also address the unsteady case.

This is joint work with Endre Suli.

We consider the problem of localising non-negative point sources, namely finding their locations and amplitudes from noisy samples which consist of the convolution of the input signal with a known kernel (e.g. Gaussian). In contrast to the existing literature, which focuses on TV-norm minimisation, we analyse the feasibility problem. In the presence of noise, we show that the localised error is proportional to the level of noise and depends on the distance between each source and the closest samples. This is achieved using duality and considering the spectrum of the associated sampling matrix.

A simple experiment in the field of electrochemistry involves controlling the applied potential in an electrochemical cell. This causes electron transfer to take place at the electrode surface and in turn this causes a current to flow. The current depends on parameters in the system and the inverse problem requires us to estimate these parameters given an experimental trace of the current. We briefly describe recent work in this area from simple least squares approximation of the parameters, through bootstrapping to estimate the distributions of the parameters, to MCMC methods which allow us to see correlations between parameters.

Within the wide field of sparse approximation, convolutional sparse coding (CSC) has gained considerable attention in the computer vision and machine learning communities. While several works have been devoted to the practical aspects of this model, a systematic theoretical understanding of CSC seems to have been left aside. In this talk, I will present a novel analysis of the CSC problem based on the observation that, while being global, this model can be characterized and analyzed locally. By imposing only local sparsity conditions, we show that uniqueness of solutions, stability to noise contamination and success of pursuit algorithms are globally guaranteed. I will then present a Multi-Layer extension of this model and show its close relation to Convolutional Neural Networks (CNNs). This connection brings a fresh view to CNNs, as one can attribute to this architecture theoretical claims under local sparse assumptions, which shed light on ways of improving the design and implementation of these networks. Last, but not least, we will derive a learning algorithm for this model and demonstrate its applicability in unsupervised settings.

In this talk, we revisit the cubic regularization (CR) method for solving smooth non-convex optimization problems and study its local convergence behaviour. In their seminal paper, Nesterov and Polyak showed that the sequence of iterates of the CR method converges quadratically a local minimum under a non-degeneracy assumption, which implies that the local minimum is isolated. However, many optimization problems from applications such as phase retrieval and low-rank matrix recovery have non-isolated local minima. In the absence of the non-degeneracy assumption, the result was downgraded to the superlinear convergence of function values. In particular, they showed that the sequence of function values enjoys a superlinear convergence of order 4/3 (resp. 3/2) if the function is gradient dominated (resp. star-convex and globally non-degenerate). To remedy the situation, we propose a unified local error bound (EB) condition and show that the sequence of iterates of the CR method converges quadratically a local minimum under the EB condition. Furthermore, we prove that the EB condition holds if the function is gradient dominated or if it is star-convex and globally non-degenerate, thus improving the results of Nesterov and Polyak in three aspects: weaker assumption, faster rate and iterate instead of function value convergence. Finally, we apply our results to two concrete non-convex optimization problems that arise from phase retrieval and low-rank matrix recovery. For both problems, we prove that with overwhelming probability, the local EB condition is satisfied and the CR method converges quadratically to a global optimizer. We also present some numerical results on these two problems.

The question of what happens to the eigenvalues of a matrix after an additive perturbation has a long history, with notable contributions from Wilkinson, Sorensen, Golub, H\"ormander, Ipsen and Mehrmann, among many others. If the perturbed matrix $C \in \mathbb{C}^{n \times n}$ is given by $C = A + B$, these theorems typically consider the case where $A$ and/or $B$ are symmetric and $B$ has rank one. In this talk we will prove a theorem that bounds the number of distinct eigenvalues of $C$ in terms of the number of distinct eigenvalues of $A$, the diagonalisability of $A$, and the rank of $B$. This new theorem is more general in that it applies to arbitrary matrices $A$ perturbed by matrices of arbitrary rank $B$. We will also discuss various refinements of my bound recently developed by other authors.