Thu, 14 Nov 2024

14:00 - 15:00
Lecture Room 3

Group discussion on the use of AI tools in research

Mike Giles
(Oxford University)
Abstract

AI tools such as ChatGPT, Microsoft Copilot, GitHub Copilot and Claude, as well as older AI-enabled tools like Grammarly and MS Word, are becoming an everyday part of our research environment. This seminar slot opened up at the last minute due to the unfortunate illness of our intended speaker (who will hopefully re-schedule for next term), which gives us an opportunity to discuss what these tools mean for us as researchers: what are good, helpful uses of AI, and are there uses of AI which we might view as inappropriate? Please come ready to participate, with examples of things you have done yourselves with AI tools.

Thu, 05 Dec 2024

14:00 - 15:00
Lecture Room 3

Solving (algebraic problems from) PDEs; a personal perspective

Andy Wathen
(Oxford University)
Abstract

We are now able to solve many partial differential equation problems that were well beyond reach when I started in academia. Some of this success is due to computer hardware but much is due to algorithmic advances. 

I will give a personal perspective of the development of computational methodology in this area over my career thus far. 

Thu, 28 Nov 2024

14:00 - 15:00
Lecture Room 3

Unleashing the Power of Deeper Layers in LLMs

Shiwei Liu
(Oxford University)
Abstract

Large Language Models (LLMs) have achieved impressive results. However, recent research has shown that their deeper layers often contribute minimally, with effectiveness diminishing as layer depth increases. This pattern presents significant opportunities for model compression.

In the first part of this seminar, we will explore how this phenomenon can be harnessed to improve the efficiency of LLM compression and parameter-efficient fine-tuning. Despite these opportunities, the underutilization of deeper layers leads to inefficiencies, wasting resources that could be better used to enhance model performance. 

The second part of the talk will address the root cause of this ineffectiveness in deeper layers and propose a solution. We identify the issue as stemming from the prevalent use of Pre-Layer Normalization (Pre-LN) and introduce Mix-Layer Normalization (Mix-LN), which combines Pre-LN and Post-LN, as a new approach to mitigate this training deficiency.
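
The abstract does not spell out how the two normalizations are combined. As a rough illustration only, the PyTorch sketch below contrasts Pre-LN and Post-LN residual blocks and stacks them in one plausible arrangement (Post-LN early, Pre-LN deeper); the depth split, the toy feed-forward sublayer, and all sizes are assumptions for illustration, not the speaker's design.

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d, mode):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
        self.mode = mode                        # "pre" or "post"

    def forward(self, x):
        if self.mode == "pre":                  # Pre-LN: normalize before the sublayer
            return x + self.ff(self.norm(x))
        return self.norm(x + self.ff(x))        # Post-LN: normalize after the residual add

class MixLNStack(nn.Module):
    def __init__(self, d, depth, post_frac=0.25):
        super().__init__()
        n_post = int(post_frac * depth)         # assumed split: Post-LN early, Pre-LN deeper
        self.blocks = nn.ModuleList(
            [Block(d, "post") for _ in range(n_post)]
            + [Block(d, "pre") for _ in range(depth - n_post)]
        )

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

x = torch.randn(2, 16, 64)                      # (batch, tokens, width)
print(MixLNStack(d=64, depth=8)(x).shape)       # torch.Size([2, 16, 64])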

Thu, 21 Nov 2024

14:00 - 15:00
Lecture Room 3

Tackling complexity in multiscale kinetic and mean-field equations

Lorenzo Pareschi
(Heriot-Watt University)
Abstract

Kinetic and mean-field equations are central to understanding complex systems across fields such as classical physics, engineering, and the socio-economic sciences. Efficiently solving these equations remains a significant challenge due to their high dimensionality and the need to preserve key structural properties of the models. 

In this talk, we will focus on recent advancements in deterministic numerical methods, which provide an alternative to particle-based approaches (such as Monte Carlo or particle-in-cell methods) by avoiding stochastic fluctuations and offering higher accuracy. We will discuss strategies for designing these methods to reduce computational complexity while preserving fundamental physical properties and maintaining efficiency in stiff regimes. 
Special attention will be given to the role of these methods in addressing multi-scale problems in rarefied gas dynamics and plasma physics. Time permitting, we will also touch on emerging techniques for uncertainty quantification in these systems.
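
As a toy illustration of the stiff-regime issue mentioned above (my own example, not taken from the talk): stiff BGK-type relaxation terms are often treated implicitly, so the scheme stays stable and relaxes towards local equilibrium even when the time step far exceeds the relaxation time. In the zero-dimensional Python sketch below, the equilibrium value M is held fixed purely for simplicity.

import numpy as np

def relax_implicit(f0, M, eps, dt, steps):
    # Implicit Euler on df/dt = (M - f)/eps:
    # f_new = f + dt*(M - f_new)/eps, solved in closed form below.
    f = f0
    for _ in range(steps):
        f = (f + dt * M / eps) / (1.0 + dt / eps)
    return f

# The time step dt is 10,000x the relaxation time eps, yet the update stays
# stable and lands on the equilibrium, where explicit Euler would blow up.
print(relax_implicit(f0=1.0, M=0.2, eps=1e-6, dt=1e-2, steps=5))  # ~0.2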

Thu, 31 Oct 2024

14:00 - 15:00
Lecture Room 3

Theory to Enable Practical Quantum Advantage

Balint Koczor
(Oxford University)
Abstract

Quantum computers are becoming a reality, and current generations of machines are already well beyond the 50-qubit frontier. However, hardware imperfections still overwhelm these devices, and it is generally believed that fault-tolerant, error-corrected systems will not be within reach in the near term: a single logical qubit needs to be encoded into potentially thousands of physical qubits, which is prohibitive.

Due to these limited resources, hybrid quantum-classical protocols are the most promising near-term candidates for achieving early quantum advantage, and they need to resort to quantum error mitigation techniques. I will explain the basic concepts and introduce these hybrid protocols. They have the potential to solve real-world problems, including optimisation and ground-state search, but they suffer from the large number of circuit repetitions required to extract information from the quantum state. I will finally identify the most likely areas where quantum computers may deliver a true advantage in the near term.
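
As one concrete example of error mitigation (zero-noise extrapolation, a standard technique that the talk may or may not cover): expectation values are measured at artificially stretched noise levels and extrapolated back to the zero-noise limit. The Python sketch below substitutes a made-up depolarizing toy model for real hardware data.

import numpy as np

def noisy_expectation(scale, exact=1.0, p=0.05):
    # Toy model: depolarizing-style noise damps the ideal expectation value
    # exponentially in the artificially stretched noise level `scale`.
    return exact * np.exp(-p * scale)

scales = np.array([1.0, 2.0, 3.0])       # run the circuit at stretched noise levels
values = noisy_expectation(scales)
coeffs = np.polyfit(scales, values, 2)   # quadratic fit in the noise level
print(np.polyval(coeffs, 0.0))           # extrapolate to zero noise: ~0.9999 vs ideal 1.0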


Bálint Koczor, Associate Professor in Quantum Information Theory, Mathematical Institute, University of Oxford

Thu, 24 Oct 2024

14:00 - 15:00
(This talk is hosted by Rutherford Appleton Laboratory)

Machine learning in the solution of inverse problems: a subjective perspective

Marta Betcke
(University College London)
Abstract

Following the 2012 breakthrough in deep learning for classification and vision problems, the last decade has seen a tremendous rise of interest in machine learning across the wider mathematical research community, from foundational research through field-specific analysis to applications.

As data is at the core of any inverse problem, it was a natural direction for the field to investigate how machine learning could aid various aspects of inversion, yielding numerous approaches, from the somewhat ad hoc but very effective, such as learned unrolled methods, to provably convergent learned regularisers, with everything in between. In this talk I will review some of these developments through the lens of our group's research.
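
As a rough sketch of the learned unrolled idea mentioned above, the PyTorch fragment below alternates gradient steps on a least-squares data term with small learned networks standing in for proximal/regularization steps. The architecture, sizes, and forward map are illustrative assumptions, not the group's methods.

import torch
import torch.nn as nn

class UnrolledNet(nn.Module):
    def __init__(self, A, n_iters=5):
        super().__init__()
        self.A = A                                    # fixed forward operator
        self.step = nn.Parameter(torch.tensor(0.1))   # learned step size
        self.prox = nn.ModuleList(
            [nn.Sequential(nn.Linear(A.shape[1], 64), nn.ReLU(),
                           nn.Linear(64, A.shape[1]))
             for _ in range(n_iters)]                 # one learned "prox" per iteration
        )

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.A.shape[1])
        for prox in self.prox:
            grad = (x @ self.A.T - y) @ self.A        # gradient of 0.5 * ||A x - y||^2
            x = prox(x - self.step * grad)            # learned regularization step
        return x

A = torch.randn(20, 50)                               # toy underdetermined forward map
print(UnrolledNet(A)(torch.randn(4, 20)).shape)       # torch.Size([4, 50])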


Thu, 13 Feb 2025

14:00 - 15:00
Lecture Room 3

Global Optimization with Hamilton-Jacobi PDEs

Dante Kalise
(Imperial College London)
Abstract

We introduce a novel approach to global optimization via continuous-time dynamic programming and Hamilton-Jacobi-Bellman (HJB) PDEs. For non-convex, non-smooth objective functions, we reformulate global optimization as an infinite-horizon optimal asymptotic stabilization control problem. The solution to the associated HJB PDE provides a value function which corresponds to a (quasi)convexification of the original objective. Using the gradient of the value function, we obtain a feedback law driving any initial guess towards the global optimizer without requiring derivatives of the original objective. We then demonstrate that this HJB control law can be integrated into other global optimization frameworks to improve their performance and robustness.
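
As a hedged sketch (the talk's exact dynamics, running cost, and discounting may differ), one standard way to write such a reformulation for an objective f is, in LaTeX:

\begin{align*}
  V(x) &= \inf_{u(\cdot)} \int_0^\infty e^{-\lambda t}
          \Big( f(x(t)) + \tfrac{1}{2}\|u(t)\|^2 \Big)\, dt,
          \qquad \dot{x}(t) = u(t),\quad x(0) = x, \\
  \lambda V(x) &= f(x) - \tfrac{1}{2}\|\nabla V(x)\|^2
          \qquad \text{(the associated HJB PDE)}, \\
  u^*(x) &= -\nabla V(x)
          \qquad \text{(feedback law; it uses $\nabla V$, not $\nabla f$)}.
\end{align*}

Solving the HJB PDE once yields V; its gradient then steers any initial guess, consistent with the derivative-free property claimed above.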

Thu, 17 Oct 2024

14:00 - 15:00
Lecture Room 3

On the loss of orthogonality in low-synchronization variants of reorthogonalized block classical Gram-Schmidt

Kathryn Lund
(STFC Rutherford Appleton Laboratory)
Abstract
Interest in communication-avoiding orthogonalization schemes for high-performance computing has been growing recently. We address open questions about the numerical stability of various block classical Gram-Schmidt variants that have been proposed in the past few years. An abstract framework is employed, the flexibility of which allows for new rigorous bounds on the loss of orthogonality in these variants. We first analyse a generalization of (reorthogonalized) block classical Gram-Schmidt and show that a "strong" intrablock orthogonalization routine is only needed for the very first block in order to maintain orthogonality on the level of the unit roundoff.
Using this variant, which has four synchronization points per block column, we remove the synchronization points one at a time and analyse how each alteration affects the stability of the resulting method. Our analysis shows that the variant requiring only one synchronization per block column cannot be guaranteed to be stable in practice, as stability begins to degrade with the first reduction of synchronization points.
Our analysis of block methods also provides new theoretical results for the single-column case. In particular, it is proven that DCGS2 from [Bielich, D. et al., Par. Comput. 112 (2022)] and CGS-2 from [Swirydowicz, K. et al., Num. Lin. Alg. Appl. 28 (2021)] are as stable as Householder QR.
Numerical examples from the BlockStab toolbox are included throughout, to help compare variants and illustrate the effects of different choices of intraorthogonalization subroutines.
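
For readers unfamiliar with the family of methods above, the NumPy sketch below shows the single-column CGS2 idea ("project twice"), with comments marking where global reductions, i.e. synchronization points, occur on a distributed machine. It is a plain textbook variant, not the block algorithms analysed in the talk.

import numpy as np

def cgs2(X):
    # Classical Gram-Schmidt with reorthogonalization ("twice is enough").
    m, n = X.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for j in range(n):
        v = X[:, j].copy()
        s = np.zeros(j)
        for _ in range(2):
            c = Q[:, :j].T @ v       # inner products: one global reduction
            v -= Q[:, :j] @ c        # (i.e. one synchronization point) per pass
            s += c
        R[:j, j] = s
        R[j, j] = np.linalg.norm(v)  # the norm is another synchronization point
        Q[:, j] = v / R[j, j]
    return Q, R

Q, R = cgs2(np.random.rand(100, 10))
print(np.linalg.norm(Q.T @ Q - np.eye(10)))  # loss of orthogonality near unit roundoff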



Mon, 08 Apr 2024

11:00 - 12:00
Lecture Room 3

Heavy-Tailed Large Deviations and Sharp Characterization of Global Dynamics of SGDs in Deep Learning

Chang-Han Rhee
(Northwestern University, USA)
Abstract

While the typical behaviors of stochastic systems are often deceptively oblivious to the tail distributions of the underlying uncertainties, the ways rare events arise are vastly different depending on whether the underlying tail distributions are light-tailed or heavy-tailed. Roughly speaking, in light-tailed settings, a system-wide rare event arises because everything goes wrong a little bit, as if the entire system had conspired to provoke the rare event (conspiracy principle), whereas, in heavy-tailed settings, a system-wide rare event arises because a small number of components fail catastrophically (catastrophe principle). In the first part of this talk, I will introduce recent developments in the theory of large deviations for heavy-tailed stochastic processes at the sample path level and rigorously characterize the catastrophe principle for such processes.
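
A quick numerical illustration of the catastrophe principle (a toy experiment for this listing, not taken from the talk): conditioned on an unusually large sum, heavy-tailed samples owe almost everything to their single largest term, while light-tailed samples do not.

import numpy as np

rng = np.random.default_rng(1)

def max_share_given_large_sum(sampler, n=30, quantile=0.999, trials=100_000):
    x = sampler((trials, n))
    s = x.sum(axis=1)
    big = s > np.quantile(s, quantile)           # condition on a rare, large sum
    return (x[big].max(axis=1) / s[big]).mean()  # average share of the largest term

print(max_share_given_large_sum(lambda size: rng.pareto(1.5, size)))       # close to 1
print(max_share_given_large_sum(lambda size: rng.exponential(1.0, size)))  # much smaller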

The empirical success of deep learning is often attributed to the mysterious ability of stochastic gradient descents (SGDs) to avoid sharp local minima in the loss landscape, as sharp minima are believed to lead to poor generalization. To unravel this mystery and potentially further enhance this capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding of SGDs' global dynamics within complex non-convex loss landscapes. In the second part of this talk, I will characterize the global dynamics of SGDs, building on the heavy-tailed large deviations and local stability framework developed in the first part. This leads to heavy-tailed counterparts of the classical Freidlin-Wentzell and Eyring-Kramers theories. Moreover, we reveal a fascinating phenomenon in deep learning: by injecting and then truncating heavy-tailed noises during the training phase, SGD can almost completely avoid sharp minima and hence achieve better generalization performance on test data.
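
The injection-and-truncation mechanism described above can be sketched in a few lines; in the Python toy below, the loss, the noise law, and the truncation level are all illustrative choices rather than the paper's settings.

import numpy as np

rng = np.random.default_rng(0)

def grad(x):                   # gradient of the toy double-well loss (x**2 - 1)**2
    return 4 * x * (x**2 - 1)

x, lr, trunc = 1.0, 0.05, 0.5  # start in one well; cap each update at length trunc
for _ in range(1000):
    noise = rng.pareto(1.5) * rng.choice([-1.0, 1.0])  # symmetrized heavy-tailed kick
    step = -lr * (grad(x) + noise)
    x += np.clip(step, -trunc, trunc)                  # truncation of large jumps
print(x)  # heavy kicks let the iterate hop between the wells at +/-1; clipping bounds each hop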


This talk is based on joint work with Mihail Bazhba, Jose Blanchet, Bohan Chen, Sewoong Oh, Zhe Su, Xingyu Wang, and Bert Zwart.

Thu, 13 Jun 2024

14:00 - 15:00
Lecture Room 3

A New Two-Dimensional Model-Based Subspace Method for Large-Scale Unconstrained Derivative-Free Optimization: 2D-MoSub

Pengcheng Xie
(Chinese Academy of Sciences)
Abstract

This seminar will introduce 2D-MoSub, a derivative-free optimization method based on the subspace method and quadratic models, specifically targeting large-scale problems. 2D-MoSub combines 2-dimensional quadratic interpolation models and trust-region techniques to update the points and explore the 2-dimensional subspace iteratively. Its framework includes constructing the interpolation set, building the quadratic interpolation model, performing trust-region trial steps, and updating the trust-region radius and subspace. Computational details and theoretical properties will be discussed. Numerical results demonstrate the advantage of 2D-MoSub.
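
As a rough sketch of the core ingredient only (not the actual 2D-MoSub algorithm), the Python fragment below interpolates the objective on a two-dimensional subspace with a full quadratic model (six coefficients) and picks a trial step by crudely minimizing that model over a trust region; the sample set, the subspace directions, and the grid search are stand-ins for the method's real update rules.

import numpy as np

def quad_features(a, b):
    # Basis for a full quadratic model in the 2D subspace coordinates (a, b)
    return np.array([1.0, a, b, a * a, a * b, b * b])

def trial_step(f, x, d1, d2, radius):
    # Interpolation set: six points in subspace coordinates, enough to
    # determine the six model coefficients
    pts = radius * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1], [1, 1]])
    Phi = np.array([quad_features(a, b) for a, b in pts])
    rhs = np.array([f(x + a * d1 + b * d2) for a, b in pts])
    coef = np.linalg.solve(Phi, rhs)
    # Crude trust-region subproblem: grid search over the disc of radius `radius`
    th, r = np.meshgrid(np.linspace(0, 2 * np.pi, 64), np.linspace(0, radius, 32))
    a, b = (r * np.cos(th)).ravel(), (r * np.sin(th)).ravel()
    vals = np.array([quad_features(ai, bi) @ coef for ai, bi in zip(a, b)])
    k = vals.argmin()
    return x + a[k] * d1 + b[k] * d2

f = lambda z: np.sum((z - 1.0) ** 2)   # toy smooth objective on R^5
x0 = np.zeros(5)
d1, d2 = np.eye(5)[0], np.eye(5)[1]    # illustrative subspace directions
print(f(x0), f(trial_step(f, x0, d1, d2, radius=1.0)))  # the model step reduces f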


Short Bio:
Pengcheng Xie, PhD (Chinese Academy of Sciences), is joining Lawrence Berkeley National Laboratory as a postdoctoral scholar specializing in mathematical optimization and numerical analysis. He has developed optimization methods, including 2D-MoSub and SUSD-TR. Pengcheng has published in major journals and presented at ISMP 2024 (upcoming), ICIAM 2023, and CSIAM 2022. He received the Hua Loo-keng scholarship in 2019 and the CAS-AMSS Presidential scholarship in 2023.
