We begin with a short overview of Random Matrix Theory (RMT), focusing on the Marchenko-Pastur (MP) spectral approach.
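As a quick numerical illustration of the MP law (sizes and tolerances below are illustrative choices, not taken from the talk): the eigenvalues of the sample covariance $\frac{1}{n}X^\top X$ of an $n \times p$ matrix $X$ with i.i.d. standard Gaussian entries concentrate on the interval $[(1-\sqrt{c})^2,\,(1+\sqrt{c})^2]$ with $c = p/n$.

```python
import numpy as np

# Empirical check of the Marchenko-Pastur law: eigenvalues of the sample
# covariance (1/n) X^T X of an n x p Gaussian matrix concentrate on
# [(1 - sqrt(c))^2, (1 + sqrt(c))^2], where c = p/n <= 1.
rng = np.random.default_rng(0)
n, p = 4000, 1000                     # illustrative sizes
c = p / n
X = rng.standard_normal((n, p))
eigs = np.linalg.eigvalsh(X.T @ X / n)

lower, upper = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
# Fraction of eigenvalues inside the MP support (small slack for finite n)
inside = np.mean((eigs >= lower - 0.05) & (eigs <= upper + 0.05))
print(f"MP support: [{lower:.3f}, {upper:.3f}]")
print(f"fraction of eigenvalues within support: {inside:.3f}")
```

For $c = 0.25$ the predicted support is $[0.25, 2.25]$, and essentially all empirical eigenvalues fall inside it.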
Next, we present recent analytical and numerical results on accelerating the training of Deep Neural Networks (DNNs) via MP-based pruning [1]. Furthermore, we show that combining this pruning with L2 regularization allows one to drastically decrease randomness in the weight layers and hence simplify the loss landscape. Moreover, we show that the DNN's weights become deterministic at any local minimum of the loss function.
Finally, we discuss our most recent results (in progress) on the generalization of the MP law to the input-output Jacobian matrix of the DNN. Here, our focus is on the existence of fixed points. The numerical examples are carried out for several types of DNNs: fully connected networks, convolutional neural networks (CNNs), and vision transformers (ViTs). This is joint work with PSU PhD students M. Kiyashko, Y. Shmalo, L. Zelong and with E. Afanasiev and V. Slavin (Kharkiv, Ukraine).
[1] Berlyand, Leonid, et al. "Enhancing accuracy in deep learning using random matrix theory." Journal of Machine Learning (2024).