Fri, 19 Jan 2024

14:00 - 15:00
Lecture Room 3

Modelling cells in one dimension: diverse migration modes, emergent oscillations on junctions and multicellular "trains"

Professor Nir Gov
(Department of Chemical and Biological Physics, Weizmann Institute of Science)
Abstract

Motile cells inside living tissues often encounter junctions, where their path branches into several alternative directions of migration. We present a theoretical model of cellular polarization for cells migrating along one-dimensional lines, exhibiting diverse migration modes. When such a cell arrives at a symmetric Y-junction, it extends protrusions along the different paths that emanate from the junction. The model predicts the spontaneous emergence of deterministic oscillations between these competing protrusions, whereby cellular polarization and protrusion growth alternate between them. These predicted oscillations are found experimentally for two different cell types, noncancerous endothelial cells and cancerous glioma cells, migrating on a patterned network of thin adhesive lanes with junctions. Finally, we present an analysis of the migration modes of multicellular "trains" along one-dimensional tracks.
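For readers who want to experiment, the alternation between competing protrusions can be mimicked with a generic mutual-inhibition model with slow adaptation. The Python/NumPy sketch below is an illustration only, not the model from the talk; all parameter values (I, beta, g, tau_a and the sigmoid shape) are assumptions intended to lie in the regime where, as is well known for such systems, sufficiently strong cross-inhibition plus slow adaptation produces deterministic anti-phase switching.

# Toy sketch (generic illustration, not the speaker's model): two protrusions at a
# Y-junction compete through mutual inhibition for cell polarity, while a slow
# adaptation variable "fatigues" whichever protrusion currently dominates.
import numpy as np

def f(x, theta=0.2, k=0.05):
    """Steep sigmoidal activation."""
    return 1.0 / (1.0 + np.exp(-(x - theta) / k))

def simulate(T=2000.0, dt=0.05, I=1.0, beta=1.2, g=1.0, tau=1.0, tau_a=100.0):
    n = int(T / dt)
    u = np.array([0.6, 0.4])      # protrusion activities (small initial asymmetry)
    a = np.zeros(2)               # slow adaptation ("fatigue") of each protrusion
    trace = np.empty((n, 2))
    for t in range(n):
        inhib = beta * u[::-1]    # each protrusion inhibits the other
        du = (-u + f(I - inhib - g * a)) / tau
        da = (-a + u) / tau_a
        u, a = u + dt * du, a + dt * da
        trace[t] = u
    return trace

trace = simulate()
# With these illustrative parameters the dominant protrusion (argmax) should
# switch back and forth over time rather than settling on one branch.
switches = np.sum(np.diff(np.argmax(trace, axis=1)) != 0)
print("dominance switches:", switches)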

Mon, 26 Feb 2024

14:00 - 15:00
Lecture Room 3

Fantastic Sparse Neural Networks and Where to Find Them

Dr Shiwei Liu
(Mathematical Institute, University of Oxford)
Abstract

Sparse neural networks, in which a substantial portion of the components are eliminated, have widely shown their versatility in model compression, robustness improvement, and overfitting mitigation. However, traditional methods for obtaining such sparse networks usually start from a fully pre-trained, dense model. As foundation models become prevalent, the cost of this pre-training step can be prohibitive. On the other hand, training intrinsically sparse neural networks from scratch usually leads to inferior performance compared to their dense counterparts.
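For contrast with the from-scratch methods presented in the talk, the conventional pipeline sketched below prunes an existing dense model by zeroing its smallest-magnitude weights (one-shot magnitude pruning). This is a generic illustration, assuming PyTorch; the 90% sparsity level and the tiny stand-in model are arbitrary choices.

# Minimal sketch of conventional one-shot magnitude pruning (illustrative only):
# start from a dense, (pre-)trained model and zero the smallest-magnitude
# weights until a target sparsity is reached.
import torch
import torch.nn as nn

@torch.no_grad()
def magnitude_prune(model: nn.Module, sparsity: float = 0.9):
    """Zero the `sparsity` fraction of smallest-magnitude weights, layer by layer."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                    # skip biases / norm parameters
            continue
        k = int(sparsity * p.numel())
        threshold = p.abs().flatten().kthvalue(k).values
        mask = (p.abs() > threshold).float()
        p.data.mul_(mask)                  # apply the mask in place
        masks[name] = mask                 # keep masks to re-apply during any fine-tuning
    return masks

# Untrained toy model standing in for a pre-trained dense network.
dense = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = magnitude_prune(dense, sparsity=0.9)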

In this talk, I will present a series of approaches for obtaining such fantastic sparse neural networks by training from scratch, without the need for any dense pre-training step: dynamic sparse training, static sparse training with random pruning, and mask learning without weight training. First, I will introduce the concept of in-time over-parameterization (ITOP) (ICML 2021), which enables sparse neural networks trained from scratch (commonly known as sparse training) to attain the full accuracy of dense models. By dynamically exploring new sparse topologies during training, we avoid the costly necessity of pre-training and re-training, requiring only a single training run to obtain strong sparse neural networks. Second, since ITOP incurs additional overhead due to the frequent changes in sparse topology, our follow-up work (ICLR 2022) demonstrates that even a naïve, static sparse network produced by random pruning can be trained to match dense-model performance, provided the model is made relatively larger. Finally, I will discuss how we can push training efficiency further by learning only masks at initialization, without any weight updates, addressing the over-smoothing challenge in building deep graph neural networks (LoG 2022).
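As a rough illustration of the dynamic sparse training idea, the sketch below keeps a fixed number of active weights and periodically drops the smallest-magnitude active connections while regrowing the same number of inactive ones at random. It is a simplified, assumed recipe (drop fraction, update schedule, random growth and zero-initialised regrown weights are illustrative choices), not the exact ITOP procedure from the ICML 2021 paper.

# Simplified sketch of dynamic sparse training: train a masked (sparse) network
# from scratch and periodically update the topology.
import torch
import torch.nn as nn

@torch.no_grad()
def random_masks(model, sparsity=0.9):
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() >= 2:
            masks[name] = (torch.rand_like(p) > sparsity).float()
            p.data.mul_(masks[name])
    return masks

@torch.no_grad()
def prune_and_regrow(model, masks, drop_fraction=0.3):
    for name, p in model.named_parameters():
        if name not in masks:
            continue
        mask = masks[name]
        n_drop = int(drop_fraction * mask.sum().item())
        if n_drop == 0:
            continue
        # Drop: deactivate the n_drop smallest-magnitude currently active weights.
        scores = torch.where(mask.bool(), p.abs(), torch.full_like(p, float("inf")))
        drop_idx = torch.topk(scores.flatten(), n_drop, largest=False).indices
        mask.view(-1)[drop_idx] = 0.0
        # Regrow: activate the same number of inactive connections at random
        # (for simplicity, a just-dropped connection may be regrown immediately).
        inactive = (mask.view(-1) == 0).nonzero(as_tuple=True)[0]
        grow_idx = inactive[torch.randperm(inactive.numel())[:n_drop]]
        mask.view(-1)[grow_idx] = 1.0
        # Dropped weights are zeroed; regrown weights start from zero.
        p.data.mul_(mask)
    return masks

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = random_masks(model, sparsity=0.9)
# In a training loop one would (i) re-apply the masks to the weights after every
# optimizer step and (ii) call prune_and_regrow every few hundred steps, so that
# many sparse topologies are explored over a single training run.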

Mon, 12 Feb 2024

14:00 - 15:00
Lecture Room 3

Do Stochastic, Feel Noiseless: Stable Optimization via a Double Momentum Mechanism

Kfir Levy
(Technion – Israel Institute of Technology)
Abstract

The tremendous success of the machine learning paradigm relies heavily on the development of powerful optimization methods, and the canonical algorithm for training learning models is SGD (Stochastic Gradient Descent). Nevertheless, SGD is quite different from Gradient Descent (GD), its noiseless counterpart. Concretely, SGD requires a careful choice of the learning rate, which depends on the properties of the noise as well as on the quality of the initialization.

It further requires the use of a test set to estimate the generalization error throughout its run. In this talk, we will present a new SGD variant that obtains the same optimal rates as SGD while using noiseless machinery as in GD. Concretely, it enables the use of the same fixed learning rate as GD and does not require a test/validation set. Curiously, our results rely on a novel gradient estimate that combines two recent mechanisms related to the notion of momentum.

Finally, time permitting, I will discuss several applications to which our method can be extended.
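To make the "double momentum" idea concrete, here is a rough sketch of one way to combine two momentum-like mechanisms: a weighted running average of the iterates serving as the query point, and a STORM-style corrected (recursive momentum) gradient estimate computed with the same minibatch at the current and previous query points. This is an assumed illustration on a toy least-squares problem, with made-up step size and momentum parameters; it should not be read as the speaker's exact algorithm or its guarantees.

# Illustrative sketch only: combining two momentum mechanisms on a toy problem.
import numpy as np

rng = np.random.default_rng(0)
n, dim = 1000, 20
A = rng.normal(size=(n, dim))
b = A @ rng.normal(size=dim) + 0.1 * rng.normal(size=n)

def stochastic_grad(w, idx):
    """Minibatch gradient of the objective 0.5/n * ||Aw - b||^2."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / len(idx)

def double_momentum_sgd(steps=2000, batch=32, eta=0.05, beta=0.1):
    w = np.zeros(dim)          # iterate
    x = w.copy()               # query point: weighted average of iterates
    x_prev = x.copy()
    d = np.zeros(dim)          # corrected momentum gradient estimate
    weight_sum = 0.0
    for t in range(1, steps + 1):
        idx = rng.integers(0, n, size=batch)
        # Mechanism 1: STORM-style corrected gradient estimate, using the SAME
        # minibatch at the current and previous query points.
        g_now = stochastic_grad(x, idx)
        if t == 1:
            d = g_now
        else:
            d = g_now + (1.0 - beta) * (d - stochastic_grad(x_prev, idx))
        w = w - eta * d
        # Mechanism 2: the next query point is a weighted running average of the
        # iterates, with heavier weight on recent iterates.
        x_prev = x
        weight_sum += t
        x = x + (t / weight_sum) * (w - x)
    return x

x_hat = double_momentum_sgd()
print("final objective:", 0.5 * np.mean((A @ x_hat - b) ** 2))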
