In practice, it is standard to initialize Artificial Neural Networks (ANNs) with random parameters. We will see that this makes it possible to describe, in function space, the limit of the evolution of (fully connected) ANNs as their width tends to infinity. In this limit, an ANN is initially a Gaussian process and follows, during training, a kernel gradient descent with respect to a kernel called the Neural Tangent Kernel.
This description gives a better understanding of the convergence properties of neural networks and of how they generalize to examples during learning, and it has practical implications for the training of wide ANNs.
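
As a rough illustration of the kernel mentioned above, the following sketch computes the empirical Neural Tangent Kernel of a one-hidden-layer ReLU network at random initialization, together with one step of kernel gradient descent on the training outputs. The architecture, the 1/sqrt(width) scaling, the squared loss, and the names `init_params`, `empirical_ntk` and `kernel_gd_step` are assumptions made for this example; they are not taken from the talk.

```python
import numpy as np

def init_params(rng, d_in, width):
    # Assumed setup: all parameters are i.i.d. standard Gaussians.
    W = rng.standard_normal((width, d_in))  # hidden-layer weights
    a = rng.standard_normal(width)          # output-layer weights
    return W, a

def empirical_ntk(W, a, X1, X2):
    """Empirical NTK of f(x) = a . relu(W x / sqrt(d)) / sqrt(m).

    Entry (i, j) is the inner product of the parameter gradients
    of f at X1[i] and at X2[j].
    """
    m, d = W.shape
    pre1 = X1 @ W.T / np.sqrt(d)  # pre-activations, shape (n1, m)
    pre2 = X2 @ W.T / np.sqrt(d)  # pre-activations, shape (n2, m)
    # Contribution of the output weights: df/da_j = relu(pre_j) / sqrt(m).
    k_a = np.maximum(pre1, 0) @ np.maximum(pre2, 0).T / m
    # Contribution of the hidden weights:
    # df/dw_j = a_j * relu'(pre_j) * x / sqrt(d * m).
    g1 = (pre1 > 0) * a
    g2 = (pre2 > 0) * a
    k_w = (g1 @ g2.T) * (X1 @ X2.T) / (d * m)
    return k_a + k_w

def kernel_gd_step(f_train, y, K, lr):
    # One Euler step of kernel gradient descent for the squared loss:
    # the training outputs move along -K (f - y).
    return f_train - lr * K @ (f_train - y)

rng = np.random.default_rng(0)
W, a = init_params(rng, d_in=3, width=20000)
X = rng.standard_normal((4, 3))
y = rng.standard_normal(4)

K = empirical_ntk(W, a, X, X)
lr = 1.0 / np.linalg.eigvalsh(K).max()  # step size chosen for stability
f = np.zeros(4)                         # placeholder initial outputs
for _ in range(100):
    f = kernel_gd_step(f, y, K, lr)
print(np.linalg.norm(f - y))            # residual after 100 steps
```

For a wide network, rerunning the sketch with a different seed should change `K` only slightly, which is the sense in which the kernel becomes deterministic in the infinite-width limit.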
- Stochastic Analysis & Mathematical Finance Seminars