On the Initialisation of wide Neural Networks: the Edge of Chaos
Wide Neural Networks are well known for their Gaussian Process behaviour. Based upon this fact, an initialisation scheme for the weights and biases of a network preserving some geometrical properties of the input data is presented — The edge-of-chaos. This talk will introduce such a scheme before briefly mentioning a recent contribution related to the edge-of-chaos dynamics of wide randomly initialized low-rank feedforward networks. Formulae for the optimal weight and bias variances are extended from the full-rank to low-rank setting and are shown to follow from multiplicative scaling. The principle second order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank to width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks with a reduced number of learnable parameters while in the same ambient dimension, allowing reductions in the computational cost and memory constraints of the associated network.