Thu, 14 Oct 2021
14:00
Virtual

What is the role of a neuron?

David Bau
(MIT)
Abstract

 

One of the great challenges of neural networks is to understand how they work.  For example: does a neuron encode a meaningful signal on its own?  Or is a neuron simply an undistinguished and arbitrary component of a feature vector space?  The tension between the neuron doctrine and the population coding hypothesis is one of the classical debates in neuroscience. It is a difficult debate to settle without the ability to monitor every individual neuron in the brain.

 

Within artificial neural networks we can examine every neuron. Beginning with the simple proposal that an individual neuron might represent one internal concept, we conduct studies relating deep network neurons to human-understandable concepts in a concrete, quantitative way: Which neurons? Which concepts? Are neurons more meaningful than an arbitrary feature basis? Do neurons play a causal role? We examine both simplified settings and state-of-the-art networks in which neurons learn how to represent meaningful objects within the data without explicit supervision.

 

Following this inquiry in computer vision leads us to insights about the computational structure of practical deep networks that enable several new applications, including semantic manipulation of objects in an image; understanding of the sparse logic of a classifier; and quick, selective editing of generalizable rules within a fully trained generative network.  It also presents an unanswered mathematical question: why is such disentanglement so pervasive?

 

In the talk, we challenge the notion that the internal calculations of a neural network must be hopelessly opaque. Instead, we propose to tear back the curtain and chart a path through the detailed structure of a deep network by which we can begin to understand its logic.

--

A link for this talk will be sent to our mailing list a day or two in advance.  If you are not on the list and wish to be sent a link, please contact @email.

Thu, 06 May 2021
14:00
Virtual

A proximal quasi-Newton trust-region method for nonsmooth regularized optimization

Dominique Orban
(École Polytechnique Montréal)
Abstract

We develop a trust-region method for minimizing the sum of a smooth term f and a nonsmooth term h, both of which can be nonconvex. Each iteration of our method minimizes a possibly nonconvex model of f+h in a trust region. The model coincides with f+h in value and subdifferential at the center. We establish global convergence to a first-order stationary point when f satisfies a smoothness condition that holds, in particular, when it has a Lipschitz-continuous gradient, and h is proper and lower semi-continuous. The model of h is required to be proper, lower semi-continuous, and prox-bounded. Under these weak assumptions, we establish a worst-case O(1/ε^2) iteration complexity bound that matches the best known complexity bound of standard trust-region methods for smooth optimization. We detail a special instance in which we use a limited-memory quasi-Newton model of f and compute a step with the proximal gradient method, resulting in a practical proximal quasi-Newton method. We describe our Julia implementations and report numerical results on inverse problems from sparse optimization and signal processing. Our trust-region algorithm exhibits promising performance and compares favorably with line-search proximal quasi-Newton methods based on convex models.
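As background for the step computation mentioned above, here is a minimal proximal gradient sketch. The talk's implementations are in Julia; this Python instance (LASSO data, step size, and all names) is an illustrative assumption, not the authors' code.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_gradient(grad_f, prox_h, x0, step, n_iter=500):
    # Minimize f(x) + h(x): a gradient step on the smooth f
    # followed by a proximal step on the nonsmooth h.
    x = x0.copy()
    for _ in range(n_iter):
        x = prox_h(x - step * grad_f(x), step)
    return x

# Illustrative instance: f(x) = 0.5*||Ax - b||^2, h(x) = lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = A @ np.concatenate(([1.0, -2.0], np.zeros(8)))
lam = 0.5
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of grad f
x = proximal_gradient(lambda v: A.T @ (A @ v - b),
                      lambda z, t: soft_threshold(z, lam * t),
                      np.zeros(10), step)
```

With the step size set to the reciprocal Lipschitz constant, each iteration is guaranteed not to increase the objective f+h.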

This is joint work with Aleksandr Aravkin and Robert Baraldi.


Thu, 17 Jun 2021
14:00 - 15:00
Virtual

Primal-dual Newton methods, with application to viscous fluid dynamics

Georg Stadler
(New York University)
Abstract

I will discuss modified Newton methods for solving nonlinear systems of PDEs. These methods introduce additional variables before deriving the Newton linearization. These variables can then often be eliminated analytically before solving the Newton system, so that existing solvers can be adapted easily and the computational cost does not increase compared to a standard Newton method. The resulting algorithms yield favorable convergence properties. After illustrating the ideas on a simple example, I will show their application to the solution of incompressible Stokes flow problems with a viscoplastic constitutive relation, where the additionally introduced variable is the stress tensor. Such constitutive relations are commonly used in earth science models. This is joint work with Johann Rudi (Argonne) and Melody Shih (NYU).
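A toy scalar sketch of the introduce-then-eliminate mechanics described above; the equation and all names here are illustrative, not the talk's Stokes application.

```python
# Solve u**3 = b by introducing the auxiliary variable w = u**2 and
# writing the coupled system
#   u * w - b = 0,    w - u**2 = 0.
# Newton-linearize both equations and eliminate the increment dw
# analytically before solving, so only a scalar equation for du remains.
b = 8.0
u, w = 1.0, 1.0                      # initial guesses
for _ in range(50):
    # From the second linearized equation: dw = 2*u*du - (w - u**2).
    # Substituting into  w*du + u*dw = b - u*w  gives:
    du = (b - u * w + u * (w - u**2)) / (w + 2 * u**2)
    dw = 2 * u * du - (w - u**2)
    u, w = u + du, w + dw
print(u)
```

Once w tracks u**2, the eliminated update reduces to the standard Newton step (3u^2) du = b - u^3, which is the sense in which existing solvers need not change.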

 


Thu, 29 Apr 2021
14:00

Regularity, stability and passivity distances for dissipative Hamiltonian systems

Volker Mehrmann
(TU Berlin)
Abstract

Dissipative Hamiltonian systems are an important class of dynamical systems that arise in all areas of science and engineering. They are a special case of port-Hamiltonian control systems. When the system is linearized around a stationary solution, one gets a linear dissipative Hamiltonian system, typically differential-algebraic. Despite the fact that the system looks unstructured at first sight, it has remarkable properties. Stability and passivity are automatic; spectral structures for purely imaginary eigenvalues, eigenvalues at infinity, and even singular blocks in the Kronecker canonical form are very restricted; and furthermore, the structure leads to fast and efficient iterative solution methods for associated linear systems. When port-Hamiltonian systems are subject to (structured) perturbations, it is important to determine the minimal perturbations under which these properties are lost. The computation of these structured distances to instability, non-passivity, or non-regularity is typically a very hard non-convex optimization problem. However, in the context of dissipative Hamiltonian systems, the computation becomes much easier and can even be implemented efficiently for large-scale problems in combination with model reduction techniques. We will discuss these distances and the computational methods, and illustrate the results via an industrial problem.
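The automatic-stability property can be checked numerically. A minimal sketch with random data (the structural form is standard; the sizes and generator choices are arbitrary):

```python
import numpy as np

# Linearizing a dissipative Hamiltonian system about a stationary solution
# gives  dx/dt = (J - R) Q x  with J skew-symmetric, R symmetric positive
# semidefinite (dissipation), and Q symmetric positive definite (energy).
# Stability is automatic: every eigenvalue of (J - R) Q has nonpositive
# real part, whatever the random data below.
rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
J = M - M.T                                  # skew-symmetric
S = rng.standard_normal((n, n))
R = S @ S.T                                  # symmetric positive semidefinite
P = rng.standard_normal((n, n))
Q = P @ P.T + n * np.eye(n)                  # symmetric positive definite
A = (J - R) @ Q
max_real_part = np.max(np.linalg.eigvals(A).real)
print(max_real_part)
```

The short proof: if (J - R)Qv = λv with w = Qv, then Re(λ) w*Q⁻¹w = -w*Rw ≤ 0, so Re(λ) ≤ 0.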

 


Thu, 27 May 2021
14:00
Virtual

Algebraic multigrid methods for GPUs

Ulrike Meier Yang
(Lawrence Livermore National Laboratory)
Abstract

Computational science is facing several major challenges with rapidly changing, highly complex, heterogeneous computer architectures. To meet these challenges and yield fast and efficient performance, solvers need to be easily portable. Algebraic multigrid (AMG) methods have great potential to achieve good performance, since they have shown excellent numerical scalability for a variety of problems. However, their implementation on emerging computer architectures, which favor structure, presents new challenges. To face these difficulties, we have considered modularization of AMG, that is, breaking AMG components into smaller kernels to improve portability, as well as the development of new algorithms to replace components that are not suitable for GPUs. Another way to achieve performance on accelerators is to increase structure in algorithms. This talk will discuss new algorithmic developments, including a new class of interpolation operators that consists of simple matrix operations for unstructured AMG, and efforts to develop a semi-structured AMG method.
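For readers less familiar with the components AMG assembles algebraically (smoother, interpolation, Galerkin coarse operator), here is a geometric two-grid cycle on a 1D Poisson problem. This is a didactic sketch only; the grid size, damping parameter, and operator choices are illustrative assumptions, not hypre/LLNL code.

```python
import numpy as np

def poisson1d(n):
    # 1D Poisson matrix (Dirichlet boundary conditions, n interior points)
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def interpolation(n):
    # Linear interpolation from the nc = (n - 1) // 2 coarse points (n odd)
    nc = (n - 1) // 2
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1                 # fine-grid index of coarse point j
        P[i, j] = 1.0
        P[i - 1, j] += 0.5
        P[i + 1, j] += 0.5
    return P

def two_grid(A, b, x, P, omega=0.6):
    # One cycle: pre-smooth, coarse-grid correction, post-smooth
    D = np.diag(A)
    for _ in range(2):                # weighted-Jacobi pre-smoothing
        x = x + omega * (b - A @ x) / D
    Ac = P.T @ A @ P                  # Galerkin coarse operator
    ec = np.linalg.solve(Ac, P.T @ (b - A @ x))
    x = x + P @ ec                    # prolongated coarse correction
    for _ in range(2):                # post-smoothing
        x = x + omega * (b - A @ x) / D
    return x

n = 63
A, b, P = poisson1d(n), np.ones(n), interpolation(n)
x = np.zeros(n)
for _ in range(10):
    x = two_grid(A, b, x, P)
```

AMG's distinguishing feature is that P and the coarse grid are built from the matrix entries alone; here they are given geometrically for clarity.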

 


Thu, 20 May 2021
14:00
Virtual

The bubble transform and the de Rham complex

Ragnar Winther
(University of Oslo)
Abstract

The bubble transform is a concept introduced by Richard Falk and me in a paper published in Foundations of Computational Mathematics in 2016. From a simplicial mesh of a bounded domain in $R^n$, we constructed a map which decomposes scalar-valued functions into a sum of local bubbles supported on appropriate macroelements. The construction is done without reference to any finite element space, but has the property that the standard continuous piecewise polynomial spaces are invariant. Furthermore, the transform is bounded in $L^2$ and $H^1$, and as a consequence we obtained a new tool for the understanding of finite element spaces of arbitrary polynomial order. The purpose of this talk is to review the previous results and to discuss how to generalize the construction to differential forms such that the corresponding properties hold. In particular, the generalized transform will be defined such that it commutes with the exterior derivative.
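In symbols (the notation here is schematic and not necessarily that of the paper): writing $B^k$ for the transform acting on $k$-forms and $d$ for the exterior derivative, the commuting property stated above reads

```latex
d \circ B^k = B^{k+1} \circ d, \qquad 0 \le k < n,
```

so the maps $B^k$ constitute a cochain map on the de Rham complex.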

 


Thu, 10 Jun 2021
14:00
Virtual

53 Matrix Factorizations, generalized Cartan, and Random Matrix Theory

Alan Edelman
(MIT)
Further Information

Joint seminar with the Random Matrix Theory group

Abstract

An insightful exercise might be to ask what is the most important idea in linear algebra. Our first answer would not be eigenvalues or linearity; it would be “matrix factorizations.” We will discuss a blueprint to generate 53 inter-related matrix factorizations (times 2), most of which appear to be new. The underlying mathematics may be traced back to Cartan (1927), Harish-Chandra (1956), and Flensted-Jensen (1978). We will discuss the interesting history. One anecdote is that Eugene Wigner (1968) discovered factorizations such as the SVD in passing, in a way that was buried, and only eight authors have referenced that work. Ironically, Wigner referenced Sigurður Helgason (1962), but Wigner did not recognize his results in Helgason's book. This work also extends and completes open problems posed by Mackey² & Tisseur (2003/2005).

Classical results of Random Matrix Theory concern exact formulas from the Hermite, Laguerre, Jacobi, and Circular distributions. Following an insight from Freeman Dyson (1970), Zirnbauer (1996) and Duenez (2004/5) linked some of these classical ensembles to Cartan's theory of Symmetric Spaces. One troubling fact is that symmetric spaces alone do not cover all of the Jacobi ensembles. We present a completed theory based on the generalized Cartan distribution. Furthermore, we show how the matrix factorization obtained by the generalized Cartan decomposition G=K₁AK₂ plays a crucial role in sampling algorithms and the derivation of the joint probability density of A.
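By way of a familiar instance: the SVD realizes a G = K₁AK₂ factorization numerically, with orthogonal "K" factors and a nonnegative diagonal "A" factor. This is a standard fact used as orientation here, not the generalized Cartan decomposition of the talk.

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((4, 4))
K1, a, K2 = np.linalg.svd(G)          # G = K1 @ diag(a) @ K2
A = np.diag(a)                        # the diagonal ("abelian") factor

# K1 and K2 are orthogonal, and the product reconstructs G exactly
reconstruction_error = np.linalg.norm(G - K1 @ A @ K2)
orthogonality_error = max(np.linalg.norm(K1.T @ K1 - np.eye(4)),
                          np.linalg.norm(K2 @ K2.T - np.eye(4)))
```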

Joint work with Sungwoo Jeong

 


Thu, 03 Jun 2021
14:00
Virtual

Distributing points by minimizing energy for constructing approximation formulas with variable transformation

Ken'ichiro Tanaka
(University of Tokyo)
Abstract


In this talk, we present some effective methods for distributing points for approximating analytic functions with prescribed decay on a strip region including the real axis. Such functions appear when we use numerical methods with variable transformations. Typical examples of such methods are provided by single-exponential (SE) or double-exponential (DE) sinc formulas, in which variable transformations yield single- or double-exponential decay of functions on the real axis. It has been known that the formulas are nearly optimal on a Hardy space with a single- or double-exponential weight on the strip region, which is regarded as a space of transformed functions by the variable transformations.

Recently, we have proposed new approximation formulas that outperform the sinc formulas. For constructing them, we use an expression of the error norm (a.k.a. worst-case error) of an n-point interpolation operator in the weighted Hardy space. The expression is closely related to potential theory, and optimal points for interpolation correspond to an equilibrium measure of an energy functional with an external field. Since a discrete version of the energy becomes convex in the points under a mild condition about the weight, we can obtain good points with a standard optimization technique. Furthermore, with the aid of the formulation with the energy, we can find approximate distributions of the points theoretically.
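A small illustration of the energy-minimization viewpoint, using a classical special case rather than the weighted-Hardy-space setting of the talk: for logarithmic repulsion with a quadratic external field, Stieltjes' electrostatic theorem says the minimizing points are the zeros of a Hermite polynomial, and plain gradient descent on the convex (ordered-cone) energy recovers them.

```python
import numpy as np

def grad(x):
    # Gradient of the discrete energy
    #   E(x) = sum_i x_i**2 / 2 - sum_{i<j} log|x_i - x_j|
    # (logarithmic repulsion between points plus a quadratic external field)
    d = x[:, None] - x[None, :]
    np.fill_diagonal(d, np.inf)       # remove self-interaction
    return x - np.sum(1.0 / d, axis=1)

# Plain gradient descent from well-separated starting points
x = np.linspace(-2.0, 2.0, 5)
for _ in range(2000):
    x -= 0.05 * grad(x)

# Stieltjes: the minimizers are the zeros of the Hermite polynomial H_5,
# i.e. the Gauss-Hermite quadrature nodes
nodes, _ = np.polynomial.hermite.hermgauss(5)
```

The stationarity condition x_i = Σ_{j≠i} 1/(x_i - x_j) is exactly the identity satisfied by Hermite zeros, mirroring the equilibrium-measure characterization used in the talk.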

[References]
- K. Tanaka, T. Okayama, M. Sugihara: Potential theoretic approach to design of accurate formulas for function approximation in symmetric weighted Hardy spaces, IMA Journal of Numerical Analysis Vol. 37 (2017), pp. 861-904.

- K. Tanaka, M. Sugihara: Design of accurate formulas for approximating functions in weighted Hardy spaces by discrete energy minimization, IMA Journal of Numerical Analysis Vol. 39 (2019), pp. 1957-1984.

- S. Hayakawa, K. Tanaka: Convergence analysis of approximation formulas for analytic functions via duality for potential energy minimization, arXiv:1906.03133.


Thu, 11 Mar 2021
14:00
Virtual

Structured matrix approximations via tensor decompositions

Arvind Saibaba
(North Carolina State University)
Abstract

We provide a computational framework for approximating a class of structured matrices (e.g., block Toeplitz, block banded). Our approach has three steps: map the structured matrix to tensors, use tensor compression algorithms, and map the compressed tensors back to obtain two different matrix representations, a sum of Kronecker products and a block low-rank format. The use of tensor decompositions enables us to uncover latent structure in the matrices and leads to computationally efficient algorithms. The resulting matrix approximations are memory efficient, easy to compute with, and preserve the error due to the tensor compression in the Frobenius norm. While our framework is quite general, we illustrate the potential of our method on structured matrices from three applications: system identification, space-time covariance matrices, and image deblurring.
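A sum-of-Kronecker-products representation can be computed with the classical Van Loan-Pitsianis rearrangement; this is a standard technique related to, but not identical with, the tensor framework of the talk, and the sizes below are illustrative.

```python
import numpy as np

def kron_approx(M, m1, n1, m2, n2, terms=1):
    # Sum-of-Kronecker-products approximation M ~ sum_k kron(B_k, C_k)
    # via an SVD of the Van Loan-Pitsianis block rearrangement of M:
    # row (i, j) of R holds the flattened (m2 x n2) block (i, j) of M.
    R = np.empty((m1 * n1, m2 * n2))
    for i in range(m1):
        for j in range(n1):
            R[i * n1 + j] = M[i*m2:(i+1)*m2, j*n2:(j+1)*n2].ravel()
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return [(np.sqrt(s[k]) * U[:, k].reshape(m1, n1),
             np.sqrt(s[k]) * Vt[k].reshape(m2, n2)) for k in range(terms)]

# A matrix with exact Kronecker structure is recovered by a single term
rng = np.random.default_rng(3)
B0 = rng.standard_normal((3, 3))
C0 = rng.standard_normal((4, 4))
M = np.kron(B0, C0)
(B, C), = kron_approx(M, 3, 3, 4, 4, terms=1)
```

Truncating the SVD at k terms gives the best k-term Kronecker-sum approximation in the Frobenius norm, consistent with the error-preservation property stated in the abstract.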

Joint work with Misha Kilmer (Tufts University)

 


Subscribe to Computational Mathematics and Applications Seminar