Thu, 19 Jun 2014

14:00
Rutherford Appleton Laboratory, nr Didcot

Preconditioning and deflation techniques for interior point methods

Dr Rachael Tappenden
(Edinburgh University)
Abstract

The accurate and efficient solution of linear systems Ax = b is very important in many engineering and technological applications, and systems of this form also arise as subproblems within other algorithms. In particular, this is true for interior point methods (IPMs), where the Newton system must be solved to find the search direction at each iteration. Solving this system is a computational bottleneck of an IPM, and in this talk I will explain how preconditioning and deflation techniques can be used to lessen this computational burden.
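
For reference, in the standard primal-dual IPM for linear programming ($\min\, c^T x$ subject to $Ax = b$, $x \ge 0$) the Newton system mentioned above takes the textbook form

    \[
    \begin{pmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{pmatrix}
    \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta s \end{pmatrix}
    =
    \begin{pmatrix} b - Ax \\ c - A^T y - s \\ \sigma\mu e - XSe \end{pmatrix},
    \qquad X = \mathrm{diag}(x),\ S = \mathrm{diag}(s),\ \mu = x^T s / n,\ \sigma \in (0,1),
    \]

which is usually reduced to an augmented system or to the normal equations $A(S^{-1}X)A^T \Delta y = r$. These systems become increasingly ill-conditioned as the iterates approach optimality, which is what makes preconditioning (and, in this talk, deflation) attractive. This is given here only for orientation; the talk may address a more general setting.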

This is joint work with Jacek Gondzio.

Thu, 27 Feb 2014

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

Alternating minimal energy methods for linear systems in higher dimensions

Dr Dmitry Savostyanov
(University of Southampton)
Abstract

For high-dimensional problems, few algorithms can break the curse of dimensionality and solve them efficiently and reliably. Among those, tensor product algorithms, which implement the idea of separation of variables for multi-index arrays (tensors), seem to be the most general and also very promising. They originated in quantum physics and chemistry and descend broadly from the density matrix renormalization group (DMRG) and matrix product states (MPS) formalisms. The same tensor formats were recently re-discovered in the numerical linear algebra (NLA) community as the tensor train (TT) format.
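
As a minimal illustration of the separation-of-variables idea behind the TT/MPS representation (a NumPy sketch under my own simplifications, not the algorithms of the talk), a full array can be compressed into TT cores by successive truncated SVDs:

    import numpy as np

    def tt_svd(tensor, tol=1e-10):
        """Compress a full d-dimensional array into tensor-train (TT/MPS) cores
        by successive truncated SVDs (the classical TT-SVD construction)."""
        shape, d = tensor.shape, tensor.ndim
        cores, r_prev, C = [], 1, tensor
        for k in range(d - 1):
            C = C.reshape(r_prev * shape[k], -1)
            U, s, Vt = np.linalg.svd(C, full_matrices=False)
            r = max(1, int(np.sum(s > tol * s[0])))      # truncated TT rank
            cores.append(U[:, :r].reshape(r_prev, shape[k], r))
            C = s[:r, None] * Vt[:r]                     # carry the remainder to the next mode
            r_prev = r
        cores.append(C.reshape(r_prev, shape[-1], 1))
        return cores

    def tt_to_full(cores):
        """Contract the TT cores back into a full array (for checking)."""
        full = cores[0]
        for G in cores[1:]:
            full = np.tensordot(full, G, axes=1)
        return full.reshape([G.shape[1] for G in cores])

    # a rank-1 4-dimensional tensor is recovered exactly with all TT ranks equal to 1
    T = np.einsum('i,j,k,l->ijkl', *[np.random.rand(m) for m in (4, 5, 6, 3)])
    cores = tt_svd(T)
    print([G.shape for G in cores], np.abs(tt_to_full(cores) - T).max())

The methods discussed below never form the full tensor, of course; they work directly on the cores.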

Algorithms developed in the quantum physics community are based on optimisation in tensor formats, performed successively over all components of the format (i.e. all sites or modes).
The DMRG/MPS schemes are very efficient but very difficult to analyse, and at the moment only local convergence results for the simplest algorithm are available. In the NLA community, a common approach is to use a classical iterative scheme (e.g. GMRES) and enforce compression to a tensor format at every step. The formal analysis is quite straightforward, but the tensor ranks of the vectors which span the Krylov subspace grow rapidly with the iterations, and the methods struggle in practice.

The first attempt to merge classical iterative algorithms and DMRG/MPS methods was made by White (2005), where the second Krylov vector is used to expand the search space at the optimisation step.
The idea proved to be useful, but the implementation was based on a fair amount of physical intuition, and the algorithm was not completely justified.

We have recently proposed the AMEn algorithm for linear systems, which also injects the gradient direction at the optimisation step, but in a way that allows us to prove the global convergence of the resulting scheme. The scheme can easily be applied to the computation of the ground state --- the differences from the algorithm of S. White are emphasised in Dolgov and Savostyanov (2013).
The AMEn scheme has already been acknowledged in the NLA community --- for example, it was recently applied to the computation of extreme eigenstates by Kressner, Steinlechner and Uschmajew (2013), using the block-TT format proposed by Dolgov, Khoromskij, Oseledets and Savostyanov (2014).

To date, the AMEn algorithm has been applied
 - to simulate the NMR spectra of large molecules (such as ubiquitin),
 - to solve the Fokker-Planck equation for non-Newtonian polymeric flows,
 - to the chemical master equation describing the mesoscopic model of gene regulatory networks,
 - to solve the Heisenberg model problem for a periodic spin chain.
We aim to extend this framework and the analysis to other problems of NLA: eigenproblems, time-dependent problems, high-dimensional interpolation, and matrix functions, as well as to a wider range of high-dimensional problems.

This is joint work with Sergey Dolgov from the Max-Planck Institute for Mathematics in the Sciences, Leipzig, Germany.

Thu, 28 Nov 2013

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

Block LU factorization with panel Rank Revealing Pivoting and its Communication Avoiding version

Dr Amal Khabou
(University of Manchester)
Abstract

We present a block LU factorization with panel rank revealing pivoting (block LU_PRRP), an algorithm based on strong rank revealing QR for the panel factorization. Block LU_PRRP is more stable than Gaussian elimination with partial pivoting (GEPP), with a theoretical upper bound on the growth factor of $(1+ \tau b)^{(n/b)-1}$, where $b$ is the size of the panel used during the block factorization, $\tau$ is a parameter of the strong rank revealing QR factorization, and $n$ is the number of columns of the matrix. For example, if the size of the panel is $b = 64$ and $\tau = 2$, then $(1+2b)^{(n/b)-1} = (1.079)^{n-64} \ll 2^{n-1}$, where $2^{n-1}$ is the upper bound on the growth factor of GEPP. Our extensive numerical experiments show that the new factorization scheme is as numerically stable as GEPP in practice, but it is more resistant to some pathological cases where GEPP fails. We note that the block LU_PRRP factorization performs only $O(n^2 b)$ additional floating point operations compared to GEPP.
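
A few lines of NumPy (an illustrative check, not part of the abstract) make the gap between the two bounds concrete:

    import numpy as np

    b, tau = 64, 2.0                                   # panel size and strong RRQR parameter from the example
    print((1 + tau * b) ** (1 / b))                    # ~1.079, the per-column factor quoted above

    def log2_lu_prrp_bound(n):
        return ((n / b) - 1) * np.log2(1 + tau * b)    # log2 of (1 + tau*b)^((n/b) - 1)

    def log2_gepp_bound(n):
        return n - 1                                   # log2 of 2^(n - 1)

    for n in (256, 1024, 4096):
        print(n, round(log2_lu_prrp_bound(n), 1), log2_gepp_bound(n))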

Thu, 07 Nov 2013

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

Sparse multifrontal QR factorization on the GPU

Professor Tim Davis
(University of Florida)
Abstract

Sparse matrix factorization involves a mix of regular and irregular computation, which is a particular challenge when trying to obtain high performance on the highly parallel general-purpose computing cores available on graphics processing units (GPUs). We present a sparse multifrontal QR factorization method that meets this challenge, and is up to ten times faster than a highly optimized method on a multicore CPU. Our method is unique compared with prior methods, since it factorizes many frontal matrices in parallel and keeps all the data transmitted between frontal matrices on the GPU. A novel bucket scheduler algorithm extends the communication-avoiding QR factorization for dense matrices by exploiting more parallelism and by exploiting the staircase form present in the frontal matrices of a sparse multifrontal method.

This is joint work with Nuri Yeralan and Sanjay Ranka.

Thu, 30 May 2013

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

The FEAST eigenvalue algorithm and solver with new perspectives for first-principle electronic structure calculations

Professor Eric Polizzi
(University of Massachusetts)
Abstract

FEAST is a new general purpose eigenvalue algorithm that takes its inspiration from the density-matrix representation and contour integration technique in quantum mechanics [Phys. Rev. B 79, 115112, (2009)], and it can be understood as a subspace iteration algorithm using approximate spectral projection [http://arxiv.org/abs/1302.0432 (2013)]. The algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and multi-level parallelism on modern computing platforms. FEAST is also the name of a comprehensive numerical library package which currently (v2.1) focuses on solving symmetric eigenvalue problems on both shared-memory architectures (i.e. the FEAST-SMP version, also integrated into Intel MKL since Feb 2013) and distributed architectures (i.e. the FEAST-MPI version), including three levels of parallelism (MPI-MPI-OpenMP).
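
To make the contour-integration idea concrete, here is a deliberately small dense-algebra sketch of a FEAST-style iteration (my own simplification, with illustrative parameters m0, n_quad and n_iter; it is not the FEAST library API):

    import numpy as np
    from scipy.linalg import solve, qr, eigh

    def feast_sketch(A, emin, emax, m0=12, n_quad=8, n_iter=4, seed=0):
        """Toy FEAST-style solver for a real symmetric A: subspace iteration with an
        approximate spectral projector built by quadrature along a circular contour
        enclosing [emin, emax]. Dense algebra only; not the FEAST library."""
        n = A.shape[0]
        X = np.random.default_rng(seed).standard_normal((n, m0))   # m0 >= expected eigenvalue count
        c, r = 0.5 * (emin + emax), 0.5 * (emax - emin)
        theta = np.pi * (np.arange(n_quad) + 0.5) / n_quad          # nodes on the upper half-circle
        I = np.eye(n)
        for _ in range(n_iter):
            Q = np.zeros((n, m0))
            for t in theta:
                z = c + r * np.exp(1j * t)                          # quadrature point on the contour
                Y = solve(z * I - A, X)                             # one independent linear solve per node
                Q += (r / n_quad) * (np.exp(1j * t) * Y).real       # accumulate the filtered basis
            Q, _ = qr(Q, mode='economic')                           # orthonormalise, then Rayleigh-Ritz
            lam, W = eigh(Q.T @ A @ Q)
            X = Q @ W
        keep = (lam >= emin) & (lam <= emax)
        return lam[keep], X[:, keep]

    # example: the nine smallest eigenvalues of the 1D discrete Laplacian lie in [0, 0.02]
    n = 200
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    vals, vecs = feast_sketch(A, 0.0, 0.02)
    print(np.sort(vals))
    print(2 - 2 * np.cos(np.pi * np.arange(1, 10) / (n + 1)))       # exact values for comparison

Each quadrature node requires an independent linear solve with a shifted matrix, which is one source of the multi-level parallelism mentioned above.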

In this presentation, we aim at expanding the current capabilities of the FEAST eigenvalue algorithm and developing a unified numerical approach for solving linear, non-linear, symmetric and non-symmetric eigenvalue problems. The resulting algorithms retain many of the properties of the symmetric FEAST, including the multi-level parallelism. It will also be outlined that the development roadmap for FEAST is closely tied to the need to address the variety of eigenvalue problems arising in computational nanosciences. Consequently, FEAST will also be presented beyond the "black-box" solver as a fundamental modeling framework for electronic structure calculations.

Three problems will be presented and discussed: (i) a highly efficient and robust FEAST-based alternative to the traditional self-consistent field (SCF) procedure for solving the non-linear eigenvector problem [J. Chem. Phys. 138, p194101 (2013)]; (ii) a fundamental and practical solution of the exact muffin-tin problem for performing both accurate and scalable all-electron electronic structure calculations using FEAST on parallel architectures [Comp. Phys. Comm. 183, p2370 (2012)]; (iii) a FEAST-spectral-based time-domain propagation technique for performing real-time TDDFT simulations. In order to illustrate the efficiency of the FEAST framework, numerical examples are provided for various molecules and carbon-based materials using our in-house all-electron real-space FEM implementation and both the DFT/Kohn-Sham/LDA and TDDFT/ALDA approaches.

Thu, 25 Apr 2013

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

Scalable Data Analytics

Dr Tobias Berka
(University of Cambridge)
Abstract

Very-large-scale data analytics are an alleged golden goose for efforts in parallel and distributed computing, and yet contemporary statistics remain somewhat of a dark art for the uninitiated. In this presentation, we are going to take a mathematical and algorithmic look beyond the veil of Big Data by studying the structure of the algorithms and data, and by analyzing the fit to existing and proposed computer systems and programming models. Towards highly scalable kernels, we will also discuss some of the promises and challenges of approximation algorithms using randomization, sampling, and decoupled processing, touching on some contemporary topics in parallel numerics.
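
As one concrete example of the randomization-and-sampling theme (my illustration, not necessarily the speaker's material), a randomized SVD trades a small approximation error for work that is dominated by matrix products and therefore parallelizes well:

    import numpy as np

    def randomized_svd(A, k, oversample=10, seed=0):
        """Approximate rank-k SVD via a random range sketch."""
        rng = np.random.default_rng(seed)
        Omega = rng.standard_normal((A.shape[1], k + oversample))   # random test matrix
        Q, _ = np.linalg.qr(A @ Omega)       # orthonormal basis for the sampled range of A
        B = Q.T @ A                          # small (k + oversample) x n projected problem
        Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
        return Q @ Ub[:, :k], s[:k], Vt[:k]

    # example: recover the dominant factors of a noisy low-rank matrix
    m, n, r = 2000, 500, 5
    A = np.random.randn(m, r) @ np.random.randn(r, n) + 1e-3 * np.random.randn(m, n)
    U, s, Vt = randomized_svd(A, k=r)
    print(s)                                 # approximately the r dominant singular values of A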

Thu, 07 Mar 2013

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

The How and Why of Balancing

Dr Philip Knight
(University of Strathclyde)
Abstract

We consider the problem of taking a matrix A and finding diagonal matrices D and E such that the rows and columns of B = DAE satisfy some specific constraints. Examples of constraints are that

* the row and column sums of B should all equal one;
* the norms of the rows and columns of B should all be equal;
* the row and column sums of B should take values specified by vectors p and q.

Simple iterative algorithms for solving these problems have been known for nearly a century. We provide a simple framework for describing these algorithms that allows us to develop robust convergence results and to describe a straightforward approach to accelerating the rate of convergence.
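
A classical instance of such an iteration is Sinkhorn-Knopp scaling for the first constraint in the list above (all row and column sums of B equal to one); the sketch below is a generic illustration, assuming a square nonnegative matrix with total support, and is not the framework of the talk:

    import numpy as np

    def sinkhorn_knopp(A, tol=1e-10, max_iter=10000):
        """Find diagonal D, E so that B = D A E is doubly stochastic.
        Assumes a square nonnegative matrix with total support."""
        d, e = np.ones(A.shape[0]), np.ones(A.shape[1])
        for _ in range(max_iter):
            d = 1.0 / (A @ e)                  # rescale to fix the row sums
            e = 1.0 / (A.T @ d)                # rescale to fix the column sums
            B = d[:, None] * A * e[None, :]
            if max(abs(B.sum(0) - 1).max(), abs(B.sum(1) - 1).max()) < tol:
                break
        return np.diag(d), np.diag(e), B

    D, E, B = sinkhorn_knopp(np.random.rand(5, 5))
    print(B.sum(axis=0), B.sum(axis=1))        # both close to a vector of ones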

We describe some of the diverse applications of balancing with examples from preconditioning, clustering, network analysis and psephology.

This is joint work with Kerem Akartunali (Strathclyde), Daniel Ruiz (ENSEEIHT, Toulouse) and Bora Ucar (ENS, Lyon).

Thu, 22 Nov 2012

14:00 - 15:00
Rutherford Appleton Laboratory, nr Didcot

Domain decomposition for total variation regularisation and applications

Dr Carola-Bibiane Schönlieb
(DAMTP, University of Cambridge)
Abstract

Domain decomposition methods were introduced as techniques for solving partial differential equations based on a decomposition of the spatial domain of the problem into several subdomains. The initial equation restricted to the subdomains defines a sequence of new local problems. The main goal is to solve the initial equation via the solution of the local problems. This procedure induces a dimension reduction which is largely responsible for the success of such methods. Indeed, one of the principal motivations is the formulation of solvers which can be easily parallelized.
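
As a generic illustration of this decompose-and-solve-locally idea (a toy 1D Poisson problem treated with the classical alternating Schwarz method; the grid sizes and subdomain indices are illustrative and unrelated to the total variation setting of the talk):

    import numpy as np

    def poisson_solve(f, left, right, h):
        """Solve -u'' = f on one subdomain's interior grid points with Dirichlet
        boundary values `left` and `right` (second-order finite differences)."""
        m = f.size
        T = (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
        rhs = f.copy()
        rhs[0] += left / h**2
        rhs[-1] += right / h**2
        return np.linalg.solve(T, rhs)

    # alternating Schwarz on [0, 1] with two overlapping subdomains
    n = 99                              # interior grid points of the global problem
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    f = np.ones(n)                      # -u'' = 1,  u(0) = u(1) = 0
    u = np.zeros(n)
    i1, i2 = 60, 40                     # subdomain 1: indices 0..i1-1, subdomain 2: i2..n-1

    for _ in range(20):
        # local solve on subdomain 1, taking its right boundary value from the current iterate
        u[:i1] = poisson_solve(f[:i1], 0.0, u[i1], h)
        # local solve on subdomain 2, taking its left boundary value from the updated iterate
        u[i2:] = poisson_solve(f[i2:], u[i2 - 1], 0.0, h)

    print("max error:", np.abs(u - 0.5 * x * (1 - x)).max())   # exact solution is x(1-x)/2

Each sweep only ever solves the smaller subdomain problems; an additive variant solves them independently and is the one that parallelizes.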

In this presentation we shall develop a domain decomposition algorithm for the minimization of functionals with total variation constraints. In this case the interesting solutions may be discontinuous, e.g., along curves in 2D. These discontinuities may cross the interfaces of the domain decomposition patches. Hence, the crucial difficulty is the correct treatment of the interfaces: crossing discontinuities must be preserved, and the solution must be matched correctly where it is continuous. I will present our domain decomposition strategy, including convergence results for the algorithm and numerical examples of its application in image inpainting and magnetic resonance imaging.
