Compression of Graphical Data
Abstract
Data with an intrinsic network structure arise in many contexts, including social networks, biological systems (e.g., protein-protein interactions, neuronal networks), information networks (e.g., computer networks, wireless sensor networks), and economic networks. As the volume of graphical data being generated continues to grow, compressing such data for storage, transmission, or efficient processing has become a topic of interest. In this talk, I will give an information-theoretic perspective on graph compression.
The focus will be on compression limits and their scaling with the size of the graph. For lossless compression, the Shannon entropy gives the fundamental lower limit on the expected length of any compressed representation. I will discuss the entropy of some common random graph models, with a particular emphasis on our results on the random geometric graph model.
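As a simple illustration of how such a limit scales with graph size, consider the labeled Erdős–Rényi graph G(n, p), in which each of the n(n-1)/2 possible edges is present independently with probability p. Its entropy is n(n-1)/2 times the binary entropy of p. The sketch below computes this quantity; the function names are hypothetical and chosen only for this example, and it does not cover the random geometric graph, whose edges are not independent.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def er_graph_entropy_bits(n: int, p: float) -> float:
    """Entropy in bits of a labeled Erdős–Rényi graph G(n, p).

    The C(n, 2) potential edges are i.i.d. Bernoulli(p), so the
    entropies of the individual edge indicators simply add up.
    """
    return math.comb(n, 2) * binary_entropy(p)


if __name__ == "__main__":
    # The number of bits grows quadratically in n for fixed p.
    for n in (10, 100, 1000):
        print(n, er_graph_entropy_bits(n, 0.1))
```

For fixed p the entropy grows on the order of n^2, whereas for sparse graphs (p decreasing with n) it grows more slowly, which is why the scaling regime matters when comparing models.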
Then, I will talk about the problem of compressing a graph with side information, i.e., when an additional correlated graph is available at the decoder. Turning to lossy compression, where one accepts a certain amount of distortion between the original and reconstructed graphs, I will present theoretical limits to lossy compression that we obtained for the Erdős–Rényi and stochastic block models by using rate-distortion theory.
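To give a feel for what a rate-distortion limit looks like, the following sketch evaluates the classical rate-distortion function of a Bernoulli(p) source under Hamming distortion, R(D) = h(p) - h(D) for 0 <= D < min(p, 1-p) and 0 otherwise. Applying it per potential edge of a labeled Erdős–Rényi graph, with distortion measured as the fraction of mismatched edge slots, is an assumption made here for illustration only; the distortion measure and the results presented in the talk may differ. The function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bernoulli_rate_distortion(p: float, d: float) -> float:
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion.

    Assumption for this example: each potential edge is treated as one
    i.i.d. Bernoulli(p) symbol and d is the allowed fraction of
    mismatched edge slots.
    """
    if d < 0.0:
        raise ValueError("distortion must be non-negative")
    if d >= min(p, 1.0 - p):
        # Guessing the more likely value everywhere already meets the target.
        return 0.0
    return binary_entropy(p) - binary_entropy(d)


if __name__ == "__main__":
    p = 0.1
    for d in (0.0, 0.01, 0.05, 0.1):
        print(d, bernoulli_rate_distortion(p, d))
```

At D = 0 the function reduces to the lossless limit h(p), and it reaches zero once D is at least min(p, 1-p).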