Inferring the large-scale structure of networks

12 November 2015
16:00
Tiago Peixoto
Abstract

Networks form the backbones of a wide variety of complex systems,
ranging from food webs, gene regulation and social networks to
transportation networks and the internet. Due to the sheer size and
complexity of many of these systems, it remains an open challenge to
formulate general descriptions of their large-scale structures.
Although many methods have been proposed to achieve this, they often
yield diverging descriptions of the same network, which makes their
results very difficult to compare and understand. Furthermore, very few
methods attempt to gauge the
statistical significance of the uncovered structures, and hence the
majority cannot reliably separate actual structure from stochastic
fluctuations. In this talk, I will show how these issues can be tackled
in a principled fashion by formulating appropriate generative models of
network structure that can have their parameters inferred from data. I
will also consider the comparison between a variety of generative
models, including different structural features such as degree
correction, where nodes with arbitrary degrees can belong to the same
group, and community overlap, where nodes are allowed to belong to more
than one group. Because such model variants possess an increased number
of parameters, they become prone to overfitting. I will demonstrate how
model selection based on the minimum description length criterion and
posterior odds ratios can fully account for the increased degrees of
freedom of the larger models, and select the most appropriate trade-off
between model complexity and quality of fit based on the statistical
evidence present in the data.

Throughout the talk, I will illustrate the application of these methods
to many empirical networks, such as the internet at the autonomous
systems level, the global airport network, the network of actors and
films, social networks, citations among websites, co-occurrence of
disease-causing genes and many others.
 

  • Industrial and Applied Mathematics Seminar