Learning to process and analyze the raw weight matrices of neural networks is an emerging research area with intriguing potential applications, such as editing and analyzing Implicit Neural Representations (INRs), weight pruning and quantization, and function editing. However, weight spaces have inherent permutation symmetries: certain permutations can be applied to the weights of an architecture, yielding new weights that represent the same function. As with other data types such as graphs and point clouds, these symmetries make learning in weight spaces challenging.
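As a concrete illustration of this symmetry, the following minimal NumPy sketch permutes the hidden units of a small two-layer ReLU MLP and checks that the output is unchanged. The layer sizes, the ReLU activation, and all names (`mlp`, `W1`, `perm`, and so on) are assumptions chosen for illustration, not taken from the papers discussed in this talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer MLP: 3 inputs, 5 hidden units, 2 outputs.
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

def mlp(x, W1, b1, W2, b2):
    """Forward pass with an elementwise ReLU on the hidden layer."""
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

# Reorder the hidden units: permute the rows of W1 and b1,
# and permute the columns of W2 with the same permutation.
perm = rng.permutation(5)
W1_p, b1_p = W1[perm], b1[perm]
W2_p = W2[:, perm]

x = rng.normal(size=3)
out_original = mlp(x, W1, b1, W2, b2)
out_permuted = mlp(x, W1_p, b1_p, W2_p, b2)

# The two weight settings differ as arrays but compute the same function.
same_function = np.allclose(out_original, out_permuted)
print(same_function)
```

The check works because the activation acts elementwise, so reordering hidden units before the nonlinearity and undoing the reordering in the next layer cancels out.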
This talk gives an overview of recent advances in designing architectures that operate effectively on weight spaces while respecting their underlying symmetries. We begin with our ICML 2023 paper, which introduces novel equivariant architectures for learning on multilayer perceptron (MLP) weight spaces: we characterize all linear layers that are equivariant to these symmetries and then construct networks composed of these layers. We then turn to our ICLR 2024 work, which generalizes the approach to diverse network architectures using what we term Graph Metanetworks (GMNs). GMNs represent input networks as graphs and process them with graph neural networks. We show that the resulting metanetworks are expressive and equivariant to the weight-space symmetries of the architecture being processed. GMNs are applicable to CNNs, attention layers, normalization layers, and more. Together, these works take promising steps toward versatile and principled architectures for weight-space learning.
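To make the graph view concrete, here is a rough sketch of one way an MLP could be turned into a graph: one node per neuron and one edge per weight, with the weight value as the edge feature. This is a simplified assumption for illustration, not the construction from the ICLR 2024 paper: biases are omitted, and `mlp_to_graph` and `node_sums` are hypothetical helper names. The toy aggregation step (summing incoming edge weights at each node) stands in for a real graph neural network layer and is used only to show that permuting hidden neurons permutes the corresponding node features in the same way.

```python
import numpy as np

def mlp_to_graph(weights):
    """Build a neuron graph from a list of (fan_out, fan_in) weight matrices.

    Returns (num_nodes, edge_index, edge_attr), where edge_index[0] holds
    source nodes, edge_index[1] holds target nodes, and edge_attr holds the
    weight on each edge. Biases are left out to keep the sketch short.
    """
    sizes = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    src, dst, attr = [], [], []
    for layer, W in enumerate(weights):
        out_idx, in_idx = np.indices(W.shape)  # row (output) and column (input) ids
        src.append(offsets[layer] + in_idx.ravel())
        dst.append(offsets[layer + 1] + out_idx.ravel())
        attr.append(W.ravel())
    edge_index = np.stack([np.concatenate(src), np.concatenate(dst)])
    return int(offsets[-1]), edge_index, np.concatenate(attr)

def node_sums(num_nodes, edge_index, edge_attr):
    """Toy aggregation: each node sums the weights on its incoming edges."""
    out = np.zeros(num_nodes)
    np.add.at(out, edge_index[1], edge_attr)
    return out

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(2, 5))
num_nodes, edge_index, edge_attr = mlp_to_graph([W1, W2])
feats = node_sums(num_nodes, edge_index, edge_attr)

# Permute the hidden neurons and rebuild the graph.
perm = rng.permutation(5)
feats_p = node_sums(*mlp_to_graph([W1[perm], W2[:, perm]]))

# Hidden-node features are permuted the same way; output-node features are unchanged.
hidden = slice(3, 8)
equivariant = (np.allclose(feats_p[hidden], feats[hidden][perm])
               and np.allclose(feats_p[8:], feats[8:]))
print(equivariant)
```

Because the aggregation depends only on which edges enter each node, relabeling the hidden neurons relabels the hidden-node features and leaves everything else intact, which is the kind of equivariance a graph neural network provides by design.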