Thursday, 23 February 2017

Nick Trefethen wins the George Pólya Prize

Oxford Mathematician Nick Trefethen FRS has been awarded the prestigious George Pólya Prize for Mathematical Exposition by the Society for Industrial and Applied Mathematics (SIAM). Established in 2013, the prize is awarded every two years to an outstanding expositor of the mathematical sciences.

Nick Trefethen is Professor of Numerical Analysis at the University of Oxford, a Fellow of Balliol College, and Global Distinguished Professor at New York University. He is Head of Oxford Mathematics' Numerical Analysis Group. He is known for a succession of influential textbooks and monographs related to numerical mathematics, most recently 'Approximation Theory and Approximation Practice', which appeared in 2013. His next book will explore ordinary differential equations.

Wednesday, 22 February 2017

Philip Maini awarded the Arthur T. Winfree Prize by the Society of Mathematical Biology

Oxford Mathematician Philip Maini has been awarded the Arthur T. Winfree Prize by the Society of Mathematical Biology for his work on mathematical modelling of spatiotemporal processes in biology and medicine. In the words of the citation Philip's work "has led to significant scientific advances not only in mathematics, but also in biology and the biomedical sciences. His mathematical oncology research has provided detailed insight into the design of combination cancer therapies."

Philip will receive his award at the 2017 Annual Meeting of the Society, to be held at the University of Utah in Salt Lake City from July 17 to 20, 2017.

Tuesday, 21 February 2017

Mathematics and Politics: The International Congresses of Mathematicians

The International Congresses of Mathematicians (ICMs) take place every four years at different locations around the globe, and are the largest regular gatherings of mathematicians from all nations.  However, as much as the assembled mathematicians may like to pretend that these gatherings transcend politics, they have always been coloured by world events: the congresses prior to the Second World War saw friction between French and German mathematicians, for example, whilst Cold War political tensions likewise shaped the conduct of later congresses.

The first ICM, held in Zurich in 1897, emerged from the great expansion in international scientific activities (where 'international' usually meant just Europe and North America) that resulted from the improved communications and transport connections of the late nineteenth century.  The second ICM was held in Paris in 1900, alongside the many other conferences and exhibitions that were being staged there to mark the new century.  A noteworthy feature of the Paris ICM was a lecture given by the prominent German mathematician David Hilbert (1862-1943), in which he outlined a series of problems that he thought ought to be tackled by mathematicians in the coming decades.  Hilbert's problems went on to shape a great deal of twentieth-century mathematical research; just three remain entirely unresolved.

After Paris, a four-yearly pattern was established for the ICMs, and further meetings took place elsewhere in Europe (Heidelberg, 1904; Rome, 1908; Cambridge, 1912).  It was proposed that the 1916 congress would be held in Stockholm, but in the face of the war raging on the continent, it did not take place.

After the end of the First World War, the mathematicians of Western Europe realised that something ought to be done to help to rebuild their discipline and its international networks.  To this end, a group of mathematicians, many of whom hailed from the Western European countries that had been particularly devastated by the war, proposed to re-establish the ICMs with a meeting in 1920.  But in doing this, they made two bold statements.  The first was that the ICM would take place in Strasbourg: a French city that had been incorporated into Germany following the Franco-Prussian War of 1870-1, and that had only recently been returned to France by the Treaty of Versailles.  The second was that all mathematicians from Germany and her wartime allies would be barred from attending the congress.  The exclusion of German mathematicians extended also to the next congress (Toronto, 1924), but by the time of the 1928 ICM in Bologna, the more moderate voices had become louder, and German delegates were admitted.

Despite (or because of) the re-opening of relations with German mathematicians, tensions remained in the international mathematical community.  There were those who believed that the Germans should not have been re-admitted to the ICMs.  Moreover, some German mathematicians felt resentment at their earlier exclusion and so boycotted the 1928 congress.  In a bid to bring people back together and re-establish ties, the ICM returned in 1932 to Zurich – a deliberately neutral choice.  Similar reasoning resulted in Oslo being chosen as the venue for the 1936 ICM.

From the start, the Oslo congress was a political melting pot of different agendas; the effects of the wider European political situation were clearly visible.  The goal of the Nazi-led German contingent, for example, was clear: to showcase the best of 'Aryan mathematics'.  The expected Soviet delegation, on the other hand, was conspicuous by its absence.  Like the Germans, Russian mathematicians had had a difficult relationship with the ICMs.  Prior to the First World War, they had regularly attended in significant numbers, but had been rather less visible during the 1920s, following the October Revolution (1917) and subsequent Russian Civil War (1917-1922).  As the decade progressed, they began to reappear, but as Stalin increased his grip on power during the early 1930s, and sought to exercise greater control over the USSR’s scientific community, the ability of Soviet academics to travel to foreign conferences was gradually curtailed.

At the Oslo congress, around eleven Soviets were expected to attend, including two plenary speakers, one of whom, A. O. Gel’fond (1906-1968), was due to lecture on his solution to Hilbert's 7th problem.  However, when the congress convened in July 1936, it was announced that none of the Soviet delegates had appeared: all had been denied permission to travel.

Just like the proposed Stockholm ICM, the congress planned for 1940 did not take place.  It was not until 1950 that the ICMs resumed with a congress in Cambridge, Massachusetts.  No Soviet delegates attended, although several had been invited.  Shortly before the congress, the organisers had received a telegram from the president of the Soviet Academy of Sciences, making the rather transparent excuse that Soviet mathematicians were unable to attend due to pressure of work.

Following Stalin’s death in 1953, international relations thawed somewhat, and the numbers of Soviet delegates at the ICMs gradually increased.  The USSR's involvement in the international mathematical community expanded further in 1957 when it joined the International Mathematical Union.  The signal that the USSR was now fully engaged in world mathematics came in 1966 when it hosted the ICM in Moscow that year.  In the decades that followed, the ICMs provided a forum for mathematicians from East and West to establish personal contacts – but their organisation was certainly not free of difficulties arising from the Cold War political climate.

The International Congresses of Mathematicians provide an excellent means of studying the development of mathematics in the twentieth century: not only can we trace its technical developments and its trends by looking at the choices of plenary speakers, but we can also investigate the ways in which its conduct was affected by events in the wider world, and thereby see that mathematics is indeed a part of global culture.

Christopher Hollings

Oxford Mathematician Christopher Hollings is Departmental Lecturer in Mathematics and its History, and Clifford Norton Senior Research Fellow in the History of Mathematics at The Queen's College. More about the relations between the mathematicians of the East and the West can be found here. Christopher's podcast on the subject will feature shortly as part of the Oxford Sparks series.

Thursday, 16 February 2017

Ursula Martin elected Fellow of the Royal Society of Edinburgh

Oxford Mathematician and Computer Scientist Ursula Martin has been elected a Fellow of the Royal Society of Edinburgh, joining over 1600 current fellows drawn from a wide range of disciplines – science & technology, arts, humanities, social science, business and public service.

Ursula's career has taken in Cambridge and Warwick and included spells across the Atlantic as well as, most recently, at Queen Mary, University of London. From 1992 to 2002, she was Professor of Computer Science at the University of St Andrews in Scotland, the first female professor at the University since its foundation in 1411. Her work in theoretical computer science is accompanied by a passionate commitment to advancing the cause of women in science. She has also been a leading light in the recent study and promotion of the life and work of Victorian mathematician Ada Lovelace, and has been instrumental in examining and explaining Ada's mathematics as well as promoting her achievements as a woman.



Tuesday, 14 February 2017

Applied mathematics: don’t think twice, it’s all right

In an interview with Rolling Stone Magazine in 1965, Bob Dylan was pushed to define himself: Do you think of yourself primarily as a singer or a poet? To which, Dylan famously replied: Oh, I think of myself more as a song and dance man, y'know. Dylan's attitude to pigeonholing resonates with many applied mathematicians. I lack the coolness factor of Dylan, but if pushed about defining what kind of mathematician I am, I would say: Oh, I think of myself more as an equation and matrix guy, y'know.

One of the greatest strengths of applied mathematics is that it has established itself by defying simple categorisation. Applied mathematics, be it an art, a craft, or a discipline, is not bound to a particular scientific application, a particular mathematical universe, or a well-defined university department. The drawback is that applied mathematics usually gets neither mega-funding nor the limelight associated with big scientific breakthroughs. But its biggest advantage is that it can insert itself into all scientific disciplines and easily reinvent itself by moving fluidly from one field to the next, guided only by methods, theory, and applications: it is all equations and matrices. Many applied mathematicians see new challenges as an opportunity to expand their mathematical horizons, and in our rapidly changing modern society such new challenges abound. Here are three of them.

Major scientific efforts are required for major societal challenges. These include fighting climate change, optimising new renewable energy sources, developing new medical treatments, and understanding the brain. Traditionally, applied mathematicians involved in these collaborative efforts were considered a useful but small cog in a huge scientific machine, but it is now appreciated that quality science requires clever modelling, state-of-the-art numerical methods, and fundamental theoretical insights from simplified models. This is the realm of applied mathematics, and accordingly our role in these endeavours is bound to increase. At the end of the day, we may not get the fame, but we'll certainly have the fun.

A second relatively recent development of applied mathematics is the theory of networks. Networks represent connections between multiple physical or virtual entities. They are found in information theory (web links, social connections), biological systems (gene regulatory networks, metabolic networks, evolutionary trees), and physical systems (axon connections, electric grid). Regardless of their origin, these networks share common mathematical features. Their analyses span many different fields of study, and network theory has now established tentacular connections to various parts of pure and applied mathematics, a network of its own.

For about five years there has been much excitement about BIG DATA. The initial hope was that one could go straight into data and use empirical methods to unravel the mysteries of the universe. Quite the opposite is happening. The success of many methods has shed a bright light on the need to understand the underlying mathematical structure of both data and methods. The subject now presents a rich field of study that brings all mathematical sciences together, including statistics and computer science.

These examples share a common thread that highlights a new trend in mathematical and scientific discoveries: beyond inter-, multi-, and supra-disciplinarity, we live in a post-disciplinary world. Things have changed, and Oxford University with its collegiate system, and the Mathematical Institute with its collegial atmosphere, are particularly well equipped to thrive in this new scientific world. But despite all the hype, we’re also fully aware that there’s nothing wrong with the old world, the old problems, or the old conjectures. We have an intellectual responsibility to promote and cherish these areas of knowledge defined by the great thinkers, past and present, especially if they are believed to be useless or irrelevant.

Bob Dylan, in the same interview, foresaw yet another possible application of mathematics. What would you call your music? His reply: I like to think of it more in terms of vision music – it's mathematical music.

Alain Goriely, Professor of Mathematical Modelling, Oxford Mathematics, University of Oxford.

The caption next to Bob above is a scan of Alain’s brain. Bob sang: “My feet are so tired, my brain is so wired”. But will collaborative applied mathematics untangle the mystery of the author’s brain, mathematical or otherwise?

Alain's 'Applied Mathematics: A Very Short Introduction' will be published by OUP later in the year. You can also watch his lecture via the Oxford Mathematics YouTube page.

Friday, 10 February 2017

The Truth is not enough - Tim Harford Oxford Mathematics Public Lecture now online

From the tobacco companies in the fifties to the arguments of the Brexit campaign, economist and BBC Radio 4 presenter Tim Harford takes us on a tour of truths, facts and the weapon that is doubt. Surely fact-checking websites and rational thinking are the best weapons to convince people of the truth? Or is the truth, in fact, simply not good enough? Do we have the time or any inclination to hear it? Maybe we need to start with something simpler. Perhaps arousing people's curiosity might be just as important.

Watch Tim make his case in the latest of the successful Oxford Mathematics Public Lecture series.

Monday, 6 February 2017

Oxford Mathematics Research - On the null origin of the ambitwistor string

As part of our series of research articles deliberately focusing on the rigour and intricacies of mathematics and its problems, Oxford Mathematician Eduardo Casali discusses his work.

This work [1], done with my collaborator Piotr Tourkine, is our attempt to understand the origin of some recent constructions in theoretical physics and how they fit into more standard techniques. These constructions go by the name of ambitwistor strings [2], so named because they combine elements of classical twistor theory and string theory. As in usual string theory, it is a theory of maps from Riemann surfaces into some target space, usually spacetime, taken to be $\mathbb{R}^D$ where the dimension $D$ depends on which flavour of string theory one is studying. In the case of the ambitwistor string, the target space is a generalization of twistor space to arbitrary dimensions called ambitwistor space. This can be defined as the space of complex null geodesics of complexified spacetime. For example, the ambitwistor space of $\mathcal{M}=\mathbb{C}^D$ is given by the quotient $\mathbb{A}=T^*\mathcal{M}//\{P^2=0\}$. The ambitwistor string is then a theory of holomorphic maps from Riemann surfaces into $\mathbb{A}$.

Following the rules of string theory, one can allow certain kinds of singularities on the Riemann surface such that these surfaces now have moduli and the string theory gives an integral over the corresponding moduli space. These integrals correspond to scattering of particles in a theory with an infinite spectrum of massive particles. To obtain results related to standard quantum field theories a low-energy limit must be taken, which corresponds to $\alpha'\rightarrow0$ where $\alpha'$ is the inverse of the string tension. In terms of the moduli space integral this limit is subtle, breaking the integral into several smaller pieces akin to Feynman diagrams. But surprisingly, in the ambitwistor string no limit needs to be taken. The moduli space integral already gives the result in the $\alpha'\rightarrow0$ limit. This is possible since the integral in this case localizes to the solution set of a set of equations called the scattering equations [3]. These equations had previously been found in the opposite limit of the string, the high-energy limit $\alpha'\rightarrow\infty$ [4], so their appearance in the ambitwistor string is a bit of a puzzle. Another related puzzle was how the ambitwistor string fits into the framework of conventional string theory. It shares several similarities, both in its set-up and in the calculations coming from it, but a naive attempt to take the $\alpha'\rightarrow0$ limit gives a very different result.

It is here that our recent work comes in. We showed how the ambitwistor string model can be derived from a more fundamental model, the null string. This third string theory is obtained by taking the $\alpha'\rightarrow\infty$ limit in the original string and then quantising. By making specific choices in the null string one can show that it coincides with the ambitwistor string. With this interpretation of the ambitwistor string in hand we can make sense of some of its more puzzling characteristics. First, it gives a rationale for the appearance of the scattering equations in what seemed to be the complete opposite limit. It also helps to connect the ambitwistor string to usual string theory, and shows promise for generalizing the ambitwistor string to describe more general theories. But more importantly, it opens a new way in which we can approach it. By making different choices when setting up the null string, which don't affect the end result, we hope to obtain new scattering formulas and a new understanding of the geometric role played by the scattering equations and the moduli space of Riemann surfaces in these formulas. Finally, this might also shed light on the old problem of how the spacetime equations of motion are codified into ambitwistor space. An answer to this was given by LeBrun and Mason [5], but their construction is quite different from what the ambitwistor string seems to imply: namely, that the equations of motion should somehow be encoded in embeddings of Riemann surfaces into ambitwistor space.

[1] E. Casali and P. Tourkine, “On the null origin of the ambitwistor string,” JHEP 11 (2016).

[2] L. Mason and D. Skinner, “Ambitwistor strings and the scattering equations,” JHEP 07 (2014). 

[3] F. Cachazo, S. He, and E. Y. Yuan, “Scattering of Massless Particles in Arbitrary Dimensions,” Phys. Rev. Lett. 113 no. 17, (2014).

[4] D. J. Gross and P. F. Mende, “The High-Energy Behavior of String Scattering Amplitudes,” Phys. Lett. B197 (1987) 129–134.

[5] C. LeBrun, “Spaces of complex null geodesics in complex-Riemannian geometry,” Trans. Amer. Math. Soc. 278 no. 1, (1983) 209–231.

Saturday, 4 February 2017

Oxford Mathematics Research - Rates of convergence in the method of alternating projections

As part of our series of research articles deliberately focusing on the rigour and intricacies of mathematics and its problems, Oxford Mathematician David Seifert discusses his and his collaborator Catalin Badea's work.

Given a point $x$ and a shape $M$ in three-dimensional space, how might we find the point in $M$ which is closest to $x$? In general there need not be an easy answer, but suppose we have the extra information that $M$ is in fact the intersection of several sets which are much simpler to handle (so that a point lies in $M$ if and only if it lies in each of the simpler sets). In this case we might put the sets in order and proceed iteratively, by letting $x=x_0$ be the starting point in a sequence and taking $x_1$ to be the point closest to $x_0$ among the points in the first of the simpler sets. Next we find the point $x_2$ in the second simple subset which lies closest to $x_1$, and we continue in this way, returning to the first simple subset when we have exhausted the list. Does this process lead to the answer we want?

Recent research by Oxford Mathematician David Seifert and his collaborator Catalin Badea of Université de Lille 1 tackles this question not just in familiar three-dimensional space but in the much more general setting of Hilbert spaces. (Here and in what follows it is more the broad underlying ideas that matter, not so much the mathematical details.) Given a Hilbert space $X$ and a closed subspace $M$ of $X$, the orthogonal projection $P_M$ onto $M$ is the linear operator defined by the property that $P_M(x)$ is the point in $M$ which lies closest to $x\in X$. Figure 1 illustrates the case where $X$ is the Euclidean plane and $M$ is a line through the origin, but $X$ could also be a Sobolev space or some other infinite-dimensional Hilbert space. Many problems in mathematics, from linear algebra to the theory of PDEs, involve finding $P_M(x)$ for a given vector $x\in X$ and a space $M$. Sometimes $P_M(x)$ can be computed easily, but in other cases it cannot. Nevertheless, there is a natural way of finding an approximate solution if it is possible, as in the initial example, to break down our problem into easier subproblems.

Indeed, suppose that $M=M_1\cap\dotsc\cap M_N$ where $M_1,\dots, M_N$ are themselves closed subspaces of $X$. We may not know much about $M$ or the operator $P_M$, but often the problem of finding the nearest point to a given vector $x\in X$ in any of the subspaces $M_k$, $1\le k\le N$, is much simpler. Writing $P_k$ for the orthogonal projection onto $M_k$, $1\le k\le N$, we may then successively find the vectors $P_1(x)$, $P_2P_1(x), \dots, P_N\cdots P_1(x)$, $P_1P_N\cdots P_1(x)$ and so on, projecting cyclically onto the subspaces $M_1,\dots,M_N$; see Figure 2.
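As a toy illustration (not taken from the paper), the cyclic projection scheme can be written in a few lines of NumPy: two planes in $\mathbb{R}^3$ whose intersection $M$ is a line, with the product $T=P_2P_1$ iterated from an arbitrary starting vector. The specific subspaces and numbers below are invented for the example:

```python
import numpy as np

def proj(A):
    """Orthogonal projection matrix onto the column space of A."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

# Two planes in R^3 whose intersection M is the x-axis (a toy example).
P1 = proj(np.array([[1., 0.], [0., 1.], [0., 0.]]))   # the plane z = 0
P2 = proj(np.array([[1., 0.], [0., 1.], [0., 1.]]))   # plane spanned by e1 and (0, 1, 1)
PM = proj(np.array([[1.], [0.], [0.]]))               # projection onto the intersection

x0 = np.array([0.3, 1.0, -2.0])
x = x0.copy()
for _ in range(60):                                   # iterate T^n x0 with T = P2 P1
    x = P2 @ (P1 @ x)

print(np.allclose(x, PM @ x0))  # → True: T^n x0 converges to P_M(x0)
```

Neither $P_1$ nor $P_2$ alone lands in $M$, but cycling between them drives the iterates to the nearest point of the intersection.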


This method of alternating projections has many different applications. These include surprising ones such as image restoration and computed tomography, but also linear algebra and the theory of PDEs. In linear algebra it corresponds to solving a system of linear equations one by one, at each stage finding the solution to the next equation which lies closest to the previous solution (the Kaczmarz method); in the theory of PDEs the method can capture the process of solving an elliptic PDE on a composite domain by solving it cyclically on each subdomain and using the boundary conditions to update the solution at each stage (the Schwarz alternating method). In general one is led to consider the single operator $T=P_N\cdots P_1$. It is known [2] that

$$ \|T^n(x)-P_M(x)\|\to0,\quad n\to\infty, \quad\quad\quad\quad(*) $$

for all $x\in X$, which means that by projecting cyclically onto the subspaces $M_1,\dotsc,M_N$ we may approximate the unknown solution $P_M(x)$ to arbitrary precision. In practice, though, this result is of limited value unless one has some knowledge of the rate at which the convergence takes place in $(*)$, so that one can estimate the number of iterations required to guarantee a specified level of precision. For example, in the Schwarz alternating method should we expect to require 50 iterations or 50,000 iterations in order to achieve a reasonably good approximation to the (unknown) true solution of our PDE?
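The Kaczmarz interpretation mentioned above is easy to sketch: each inner step is exactly an orthogonal projection of the current iterate onto the hyperplane defined by one equation. The small system below is an invented example:

```python
import numpy as np

def kaczmarz(A, b, sweeps=100):
    """Solve a consistent system Ax = b by cyclically projecting onto each equation's hyperplane."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):
            # orthogonal projection of x onto the hyperplane {y : a_i . y = b_i}
            x = x + (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# An invented 2x2 example; the exact solution is x = (0.8, 1.4).
A = np.array([[2., 1.], [1., 3.]])
b = np.array([3., 5.])
x = kaczmarz(A, b)
print(np.allclose(x, [0.8, 1.4]))  # → True
```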

There is a surprising dichotomy for the rate of convergence in $(*)$. Either the convergence is exponentially fast for all initial vectors $x\in X$, or one can make the convergence as slow as one likes by choosing appropriate initial vectors $x\in X$. In the example in Figure 2 the rate of convergence is determined by the angle between the lines $M_1$, $M_2$, and likewise in the general case the rate of convergence in $(*)$ depends in an interesting way on the geometric relationship between the subspaces $M_1,\dots,M_N$. In the case of the Schwarz alternating method, for instance, the crucial factor is the precise way in which the different subdomains overlap. If they overlap nicely then we will get exponentially fast convergence no matter where we start, and 50 iterations may well be enough to guarantee a good degree of approximation to the true solution. On the other hand, if the domains overlap in an unfavourable way then for certain starting points even 50,000 iterations may be insufficient. So is all lost if one is in the bad case of arbitrarily slow convergence?
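The role of the angle is easy to see numerically for two lines through the origin in the plane: their intersection is $\{0\}$, and each sweep of $T=P_2P_1$ shrinks the iterate by exactly $\cos^2\theta$, where $\theta$ is the angle between the lines. A toy check (not from the paper):

```python
import numpy as np

def sweep_factor(theta, n=30):
    """Per-sweep contraction of T = P2 P1 for two lines through the origin at angle theta."""
    P1 = np.outer([1.0, 0.0], [1.0, 0.0])              # projection onto the x-axis
    u = np.array([np.cos(theta), np.sin(theta)])
    P2 = np.outer(u, u)                                # projection onto the second line
    x = np.array([1.0, 0.7])                           # arbitrary starting vector
    norms = []
    for _ in range(n):
        x = P2 @ (P1 @ x)
        norms.append(np.linalg.norm(x))
    return norms[-1] / norms[-2]

print(sweep_factor(np.pi / 3))    # cos^2(60 degrees) = 0.25: fast convergence
print(sweep_factor(np.pi / 12))   # cos^2(15 degrees) ≈ 0.933: much slower
```

Nearly parallel lines (small $\theta$) give a contraction factor close to 1, which is exactly the slow-convergence regime described above.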

Badea and Seifert showed in [1] that the answer is 'no'. More precisely they proved that even in the bad case there exists a dense subspace $X_0$ of $X$ such that for initial vectors $x\in X_0$ the rate of convergence in $(*)$ is faster than $n^{-k}$ for all $k\ge1$. This result provides a theoretical justification for a phenomenon observed by some practitioners, namely that even in bad cases one can usually achieve reasonably rapid convergence without having to experiment with too many different initial vectors $x\in X$. Badea and Seifert succeeded in improving the known results in other ways, too, for instance by giving a sharper estimate on the precise rate of convergence in the case where it is exponential. Underlying these results is the theory of (unconditional) Ritt operators. The theory of Ritt operators is closely related to some of David's earlier work [3, 4] on the quantified asymptotic behaviour of operators, and it is through this connection with operator theory that he first became interested in the method of alternating projections.

Is this the end of the story? As usual in mathematical research, answering one question opens up many more. In particular, Badea and Seifert's main result shows only that there is, in some sense, a rich supply of initial vectors leading to a decent rate of convergence in the method of alternating projections, but it does not say where these vectors should lie. This is an important open problem in approximation theory, and part of David Seifert's current research is concerned with developing techniques which can shed light on questions such as this.


  1. C. Badea and D. Seifert. Ritt operators and convergence in the method of alternating projections. J. Approx. Theory, 205:133–148, 2016.
  2. I. Halperin. The product of projection operators. Acta Sci. Math. (Szeged), 23:96–99, 1962.
  3. D. Seifert. A quantified Tauberian theorem for sequences. Studia Math., 227(2):183–192, 2015.
  4. D. Seifert. Rates of decay in the classical Katznelson-Tzafriri theorem. J. Anal. Math., 130(1):329–354, 2016.

Friday, 3 February 2017

Mathematics and health promotion - discussing diabetes on Twitter

Social media for health promotion is a fast-moving, complex environment, teeming with messages and interactions among a diversity of users. In order to better understand this landscape a team of mathematicians and medical anthropologists from Oxford, Imperial College and Sinnia led by Oxford Mathematician Mariano Beguerisse studied a collection of 2.5 million tweets that contain the term "diabetes". In particular, the research focused on two main questions:

(1) Who are the most influential Twitter users that have posted about diabetes?

(2) What themes arise in these tweets?

The researchers used a mixed-methods approach to answer these questions, relying on techniques from network science, information retrieval, and medical anthropology.

To answer question (1) the team constructed temporal retweet networks, in which the nodes are Twitter users, and connections between them exist whenever a user "retweets" a message posted by another. The crucial feature of these networks is that the connections are "directed": there is a distinction between the author of a tweet and the user who retweeted it. This directionality allows the extraction of "hub" and "authority" centrality scores for each user over time. In networks, a centrality score is a proxy for importance; hub and authority scores are useful for distinguishing the different roles played by nodes in retweet networks. A good hub is a user that consistently retweets quality tweets, and a good authority is a user who posts them. Whereas the hub landscape is diffuse and has few consistent players, top authorities are highly persistent across time and comprise bloggers, advocacy groups and NGOs related to diabetes, as well as for-profit entities without specific diabetes expertise.
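Hub and authority centrality scores of this kind are typically computed along the lines of Kleinberg's HITS algorithm; the paper does not spell out its implementation, but the idea can be sketched by power iteration on a small invented retweet network:

```python
import numpy as np

# Toy retweet network (invented): A[i, j] = 1 if user i retweeted a post by user j.
A = np.array([
    [0, 1, 1, 0],   # user 0 retweets users 1 and 2
    [0, 0, 1, 0],   # user 1 retweets user 2
    [0, 1, 0, 0],   # user 2 retweets user 1
    [0, 1, 1, 0],   # user 3 retweets users 1 and 2
], dtype=float)

h = np.ones(4)
for _ in range(50):                 # HITS power iteration
    a = A.T @ h                     # authority: being retweeted by good hubs
    h = A @ a                       # hub: retweeting good authorities
    a /= np.linalg.norm(a)
    h /= np.linalg.norm(h)

top_authorities = sorted(int(i) for i in np.argsort(a)[-2:])
print(top_authorities)  # → [1, 2]
```

Users 1 and 2, who author the retweeted content, come out as the top authorities, while users 0 and 3, who do the retweeting, come out as the top hubs.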

To get a closer look at who the most influential accounts are, the researchers constructed the follower network of the top authorities (i.e., who follows whom among top authority nodes). An analysis of this network's communities places these top authorities in groups with a distinct character: accounts mostly focused on diabetes activism, health and science, lifestyle, commercial accounts, and comedians and parody accounts.

To answer question (2) the team separated the tweets into weekly bins and obtained the topics in each bin using a technique known as "Latent Dirichlet Allocation" (LDA), which estimates the probability that a tweet containing a specific word belongs to a given topic. Once the topics were obtained, the researchers used thematic coding, a technique used by social scientists, to classify them into four broad thematic groups: health information, news, social interaction and commercial. Interestingly, humorous messages and references to popular culture appear consistently more than any other type of tweet. The abundance of jokes about diabetes in online social media is a signal that there is a baseline understanding of the disease and its causes, which may be the result of nutritional health promotion over the past decades. This observation is at odds with the belief that more health education is required to help people understand the sorts of foods which might contribute to the development of diabetes.
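The paper does not publish its topic-modelling code; purely as an illustration, here is a minimal collapsed Gibbs sampler for LDA (a standard way of fitting the model) run on an invented toy corpus rather than the tweet data. Everything below, including the tiny vocabulary, is made up for the example:

```python
import numpy as np

def lda_gibbs(docs, n_words, n_topics, iters=300, alpha=0.1, beta=0.01, seed=1):
    """Collapsed Gibbs sampler for LDA; returns the topic-word distributions."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))            # document-topic counts
    nkw = np.zeros((n_topics, n_words))              # topic-word counts
    nk = np.zeros(n_topics)                          # words assigned to each topic
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):
        for w, t in zip(doc, z[d]):
            ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                          # unassign the current word
                ndk[d, t] -= 1; nkw[t, w] -= 1; nk[t] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_words * beta)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t                          # resample its topic
                ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1
    return (nkw + beta) / (nk[:, None] + n_words * beta)

# Six tiny "tweets" over a 6-word vocabulary: words 0-2 versus words 3-5.
docs = [[0, 1, 2, 0, 1, 2, 0, 1]] * 3 + [[3, 4, 5, 3, 4, 5, 3, 4]] * 3
phi = lda_gibbs(docs, n_words=6, n_topics=2)
mass = phi[:, :3].sum(axis=1)       # each topic's probability mass on words 0-2
print(mass.round(2))                # one topic near 1.0, the other near 0.0
```

With two cleanly separated vocabularies the sampler recovers two topics, one concentrated on each word group; real tweet corpora are of course far noisier.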

The results of this work indicate that the diabetes landscape on Twitter is complex: it cannot be assumed that people can easily discern "good" from "bad" information, and there is clearly more information available to consumers than they can be expected to absorb. Public health approaches that simply aim to "inform" the public might be insufficient or even counterproductive, as they make a complicated cacophony of messages even busier. For example, information from bloggers, companies or automated accounts may be in line with broad health recommendations (and indeed may provide a valuable service to users), but without clear distinction from "legitimate" health advice, such information might also push an agenda that could lead to harm or greater health costs in future. In this case, public health agencies may have to develop new approaches to ensure that the electronic health information landscape is one that promotes healthy citizens and not only sweet profits.

Friday, 3 February 2017

Modelling the impact of scientific collaboration

If nations are to grow, both economically and intellectually, they must foster scientific creativity. To do that they must create scientific environments that stimulate collaboration. This is especially true of developing countries as they seek to prosper in a global economy.

Oxford Mathematician Soumya Banerjee's work looks at scientific collaboration networks, finding novel patterns and clusters in the data that may offer insights into how scientific development can help developing countries build richer and more prosperous societies.

Scientific collaboration networks are an important component of scientific output. Soumya analysed a dataset from one such network using a combination of machine learning techniques and dynamical models.

Soumya's analysis revealed clusters of countries with different collaboration characteristics, corresponding to nations at different stages of development (see figure). Some clusters were dominated by developed countries (e.g. the USA and the UK) that have more self-connections than connections to other countries. Another cluster was dominated by developing nations (such as Liberia and El Salvador) that mostly have connections and collaborations with other countries, but fewer self-connections.

The research has implications for policy. Countries like El Salvador have a low percentage of foreign connections (this could be a result of the protracted civil war). Consequently, the development of active science and research programs in such nations is crucial in generating the concomitant foreign connections. By contrast, Liberia has 100% external connections, suggesting that more effort needs to be made to develop its own scientific infrastructure. Both thriving internal and external networks are crucial to development.

The research proposes a complex-systems dynamical model that captures these characteristics and explains how the scientific collaboration networks of impoverished and developing nations change over time. The model suggests that developing nations can, over time, become as successful as the developed nations of today. Soumya also found interesting patterns in the behaviour of countries that may reflect past foreign policies and relations and contemporary geopolitics.

Clearly the model and analyses give food for thought as to how the scientific growth of developing countries can be guided and how it cannot be separated from their existing socio-economic environment and their future prosperity. Big data, machine learning and complexity science are enabling unprecedented computational power to be brought to bear on the fundamental developmental challenges facing humanity.

The figure above plots the percentage of external connections each country has against the number of distinct countries it is connected with. Clustering was done with k-means and reveals three distinct clusters.
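As an illustration of the clustering step, here is a plain NumPy sketch of Lloyd's k-means algorithm on invented two-feature data in the spirit of the figure (percentage of external connections versus number of partner countries); the numbers are made up for the example and are not Soumya's data:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centre assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)               # assign each point to its nearest centre
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels, centres

# Invented data: (percentage of external connections, number of distinct partner countries).
X = np.array([[10., 150.], [15., 160.], [12., 140.],    # mostly internal collaboration
              [95., 20.], [100., 15.], [90., 25.]])     # mostly external collaboration
labels, _ = kmeans(X, k=2)
print(labels)  # the first three and last three countries fall in different clusters
```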

Soumya's talk on his work can be found here together with his slides and code.