Variance, covariance and assortativity on graphs
Abstract
We develop a theory to measure the variance and covariance of probability distributions defined on the nodes of a graph, which takes into account the distance between nodes. Our approach generalizes the usual (co)variance to the setting of weighted graphs and retains many of its intuitive and desirable properties. As a particular application, we define the maximum-variance problem on graphs with respect to the effective resistance distance, and characterize the solutions to this problem both numerically and theoretically. We show how the maximum-variance distribution can be interpreted as a core-periphery measure, illustrated by the fact that these distributions are supported on the leaf nodes of tree graphs, low-degree nodes in a configuration-like graph and boundary nodes in random geometric graphs. Our theoretical results are supported by a number of experiments on a network of mathematical concepts, where we use the variance and covariance as analytical tools to study the (co-)occurrence of concepts in scientific papers with respect to the (network) relations between these concepts. Finally, we draw connections to the related notion of assortativity on networks, a network analogue of correlation used to describe how the presence and absence of edges covary with the properties of nodes.