Modelling the impact of scientific collaboration

If nations are to grow, both economically and intellectually, they must foster scientific creativity. To do that they must create scientific environments that stimulate collaboration. This is especially true of developing countries as they seek to prosper in a global economy.

Oxford Mathematician Soumya Banerjee’s work looks at scientific collaboration networks, finding novel patterns and clusters in the data. These may offer insights and guidelines on how the scientific development of developing countries can create richer and more prosperous societies.

Scientific collaboration networks are an important component of scientific output. Soumya analysed a dataset from one such network using a combination of machine learning techniques and dynamical models.

Soumya's analysis found a range of clusters of countries with different characteristics of collaboration, corresponding to nations at different stages of development (see figure). Some of these clusters were dominated by developed countries (e.g. the USA and the UK) that have more self-connections than connections to other countries. Another cluster was dominated by developing nations (such as Liberia and El Salvador) whose connections and collaborations are mostly with other countries, with fewer self-connections.

The research has implications for policy. Countries like El Salvador have a low percentage of foreign connections (this could be a result of the protracted civil war). Developing active science and research programs in such nations is therefore crucial to generating the concomitant foreign connections. By contrast, Liberia has 100% external connections, suggesting that more effort needs to be made to develop its own scientific infrastructure. Thriving internal and external networks are both crucial to development.
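
To make the two quantities discussed here concrete, the following is a minimal Python sketch of how a country's percentage of external connections and its number of distinct partner countries could be computed from a list of collaborations. The function name `collaboration_features`, the edge-list input format and the toy countries A, B and C are all assumptions for illustration; they are not taken from Soumya's dataset or code.

```python
from collections import defaultdict


def collaboration_features(edges):
    """Return {country: (percent_external, distinct_partners)}.

    `edges` is a hypothetical input format: one (country_a, country_b)
    pair per collaboration. A pair with country_a == country_b counts
    as a self-connection (a collaboration within one country).
    """
    total = defaultdict(int)      # all connections per country
    external = defaultdict(int)   # connections to other countries
    partners = defaultdict(set)   # distinct partner countries

    for a, b in edges:
        if a == b:
            total[a] += 1         # self-connection counted once
        else:
            # An external connection counts for both endpoints.
            for country, other in ((a, b), (b, a)):
                total[country] += 1
                external[country] += 1
                partners[country].add(other)

    return {
        c: (100.0 * external[c] / total[c], len(partners[c]))
        for c in total
    }


# Made-up toy data: country A collaborates mostly internally,
# while B and C collaborate only with other countries.
toy_edges = [
    ("A", "A"), ("A", "A"), ("A", "A"),
    ("A", "B"), ("B", "C"), ("C", "A"),
]
features = collaboration_features(toy_edges)
for country, (pct, n_partners) in sorted(features.items()):
    print(f"{country}: {pct:.0f}% external, {n_partners} partner countries")
```

In this toy example, a country with only external connections (like B or C) sits at 100%, which is the situation the article describes for Liberia.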

The research proposes a complex systems dynamical model that accounts for these characteristics and describes how the scientific collaboration networks of impoverished and developing nations change over time. The model suggests that developing nations can, over time, become as successful as the developed nations of today. Soumya also found interesting patterns in the behaviour of countries that may reflect past foreign policies and relations and contemporary geopolitics.

Clearly the model and analyses give food for thought as to how the scientific growth of developing countries can be guided and how it cannot be separated from their existing socio-economic environment and their future prosperity. Big data, machine learning and complexity science are enabling unprecedented computational power to be brought to bear on the fundamental developmental challenges facing humanity.

The figure above plots the percentage of external connections that each country has against the number of distinct countries each country is connected with. Clustering is done with k-means and shows three distinct clusters.
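
As a rough illustration of the clustering step, here is a small, self-contained k-means sketch in Python applied to the two quantities plotted in the figure. Everything specific in it is an assumption: the data points are invented, the features are left unscaled, and the deterministic farthest-point initialisation was chosen only to make the example reproducible. It is not Soumya's code and does not reproduce the figure.

```python
import math


def kmeans(points, k, iterations=100):
    """Basic k-means (Lloyd's algorithm) on a list of 2-D tuples.

    Initialisation is an assumption made for reproducibility: start from
    the first point, then repeatedly add the point farthest from the
    centroids chosen so far.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))

    return labels, centroids


# Invented points: (percent external connections, distinct partner countries).
toy_points = [
    (10.0, 80), (15.0, 90), (12.0, 85),   # mostly internal, many partners
    (55.0, 30), (60.0, 35), (50.0, 25),   # mixed
    (95.0, 3), (100.0, 1), (98.0, 2),     # almost entirely external
]
labels, centroids = kmeans(toy_points, k=3)
for point, label in zip(toy_points, labels):
    print(point, "-> cluster", label)
```

Because the two features are on different scales in real data, a fuller analysis would probably standardise them before clustering; whether the original analysis did so is not stated here.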

Soumya's talk on his work can be found here together with his slides and code.