'My avid fellow feeling' and 'Fleas': Playing with words on the computer
Abstract
Computers have been used to process natural language for many years. This talk considers two historical examples of computers being used instead to play with human language, one well known and the other a new archival discovery: Strachey’s 1952 love letters program, and a poetry programming competition held at Newcastle University in 1968. Strachey’s program used random number generation to pick words to fit into a template, resulting in letters of varying quality, and apparently much amusement for Strachey. The poetry competition required the entrants, mostly PhD students, to write programs whose output or source code was in some way poetic: the entries displayed remarkable ingenuity. Analyses of Strachey’s work variously depict it as a parody of attitudes to love, as an artistic endeavour, or as a technical exploration. In this talk I will consider how these readings apply to the Newcastle competition and add my own interpretations.
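The template-filling idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of Strachey's program: the word lists, the sentence template, the signature and the function names are all assumptions made for the example.

```python
import random

# Illustrative vocabulary only: these lists are assumptions,
# not the word lists used in Strachey's 1952 program.
SALUTATIONS = ["Darling", "Dear"]
PET_NAMES = ["Sweetheart", "Love"]
ADJECTIVES = ["avid", "tender", "wistful", "eager"]
NOUNS = ["fellow feeling", "longing", "devotion", "charm"]
ADVERBS = ["keenly", "fondly", "ardently"]
VERBS = ["treasures", "longs for", "adores"]


def sentence(rng):
    """Fill one fixed (hypothetical) sentence template with random words."""
    return (
        f"My {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)} "
        f"{rng.choice(ADVERBS)} {rng.choice(VERBS)} "
        f"your {rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}."
    )


def love_letter(rng, n_sentences=3, signature="The Computer"):
    """Assemble a letter: salutation, templated sentences, sign-off.

    The signature is a placeholder chosen for this sketch.
    """
    opening = f"{rng.choice(SALUTATIONS)} {rng.choice(PET_NAMES)},"
    body = " ".join(sentence(rng) for _ in range(n_sentences))
    closing = f"Yours {rng.choice(ADVERBS)},"
    return "\n".join([opening, body, closing, signature])


if __name__ == "__main__":
    # A seeded generator makes a given letter reproducible.
    print(love_letter(random.Random(1952)))
```

Because every slot is filled independently, some letters read better than others, which matches the "varying quality" noted in the abstract.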
14:00
Minimum degree stability and locally colourable graphs
Abstract
We tie together two natural but, a priori, different themes. As a starting point consider Erdős and Simonovits's classical edge stability for an $(r + 1)$-chromatic graph $H$. This says that any $n$-vertex $H$-free graph with $(1 - 1/r + o(1))\binom{n}{2}$ edges is close to (within $o(n^2)$ edges of) $r$-partite. This is false if $1 - 1/r$ is replaced by any smaller constant. However, instead of insisting on many edges, what if we ask that the $n$-vertex graph has large minimum degree? This is the basic question of minimum degree stability: what constant $c$ guarantees that any $n$-vertex $H$-free graph with minimum degree greater than $cn$ is close to $r$-partite? The answer depends not just on the chromatic number of $H$ but also on its finer structure.
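One way to make the question precise is as follows. The symbols $\delta(G)$ (minimum degree), $e(G)$ (number of edges) and $c_H$ are notation assumed here for illustration, not taken from the talk. Define $c_H$ as the infimum of all $c$ such that, for every $\varepsilon > 0$ and all sufficiently large $n$, every $n$-vertex $H$-free graph $G$ with $\delta(G) \geq cn$ can be made $r$-partite by deleting at most $\varepsilon n^2$ edges. A minimum degree condition implies an edge condition,

$$
\delta(G) \geq cn \;\Longrightarrow\; e(G) = \frac{1}{2}\sum_{v} \deg(v) \geq \frac{cn^2}{2} \geq c\binom{n}{2},
$$

so edge stability already gives $c_H \leq 1 - 1/r$. The minimum degree stability question asks how far below $1 - 1/r$ this threshold lies for a given $H$.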
Somewhat surprisingly, answering the minimum degree stability question requires understanding locally colourable graphs -- graphs in which every neighbourhood has small chromatic number -- with large minimum degree. This is a natural local-to-global colouring question: if every neighbourhood is big and has small chromatic number must the whole graph have small chromatic number? The triangle-free case has a rich history. The more general case has some similarities but also striking differences.
The first months of 2020 brought the world to an almost complete standstill due to the occurrence and outbreak of the SARS-CoV-2 coronavirus, which causes the highly contagious COVID-19 disease. Despite the hopes that rapidly developing medical sciences would quickly find an effective remedy, the last two years have made it quite clear that, despite vaccines, this is not very likely.
When machine learning deciphers the 'language' of atmospheric air masses
Abstract
Latent Dirichlet Allocation (LDA) is capable of analyzing thousands of documents in a short time and highlighting important elements, recurrences and anomalies. It is generally used in linguistics to study natural language: its word analysis reveals the theme(s) of a document, each theme being identified by a specific vocabulary or, more precisely, by a particular statistical distribution of word frequency.
In the climatologists' use of LDA, the document is a daily weather map and the word is a pixel of the map. The theme with its corpus of words can become a cyclone or an anticyclone and, more generally, a 'pattern' that the scientists term a motif. Artificial intelligence – a sort of incredibly fast robot meteorologist – looks for correlations both between different places on the same map and between successive maps over time. In a sense, it 'notices' that a particular location is often correlated with another location, recurrently throughout the database, and this set of correlated locations constitutes a specific pattern.
The algorithm performs statistical analyses at two distinct levels. At the word (pixel) level, LDA defines each motif by assigning a weight to every pixel, which fixes the shape and position of the motif. At the document (map) level, LDA breaks each daily weather map down into these motifs, each of which is assigned a weight.
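The two levels can be illustrated with a toy example in Python. This sketch only shows the forward direction of the model: how pixel weights per motif and motif weights per map combine into one map. It does not perform the inference step that LDA uses to learn those weights from data. The motif names, the four-pixel "map" and all numbers are invented for the illustration.

```python
# Level 1 (pixel level): each motif is a distribution of weight over pixels.
# Hypothetical toy values: 2 motifs on a "map" of only 4 pixels.
MOTIFS = {
    "high": [0.7, 0.2, 0.1, 0.0],
    "low": [0.0, 0.1, 0.3, 0.6],
}

# Level 2 (map level): each daily map is a distribution of weight over motifs.
DAY_WEIGHTS = {"high": 0.75, "low": 0.25}


def mix_motifs(motifs, map_weights):
    """Return the pixel distribution of a map as a weighted sum of motifs."""
    n_pixels = len(next(iter(motifs.values())))
    mixed = [0.0] * n_pixels
    for name, weight in map_weights.items():
        for i, pixel_weight in enumerate(motifs[name]):
            mixed[i] += weight * pixel_weight
    return mixed


pixels = mix_motifs(MOTIFS, DAY_WEIGHTS)

if __name__ == "__main__":
    print(pixels)
```

Since each motif and each set of map weights sums to one, the mixed pixel weights also sum to one, so the result is again a distribution over pixels.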
In concrete terms, the basic data are the daily weather maps from 1948 to the present over the North Atlantic basin and Europe. LDA identifies a dozen or so spatially defined motifs, many of which are familiar meteorological patterns such as the Azores High, the Genoa Low or even the Scandinavian Blocking. Each map can then be described by a combination of a small number of these motifs. These motifs and the statistical analyses associated with them allow researchers to study weather phenomena such as extreme events, as well as longer-term climate trends, and possibly to understand their mechanisms in order to better predict them in the future.
The preprint of the study is available as:
Lucas Fery, Berengere Dubrulle, Berengere Podvin, Flavio Pons, Davide Faranda. Learning a weather dictionary of atmospheric patterns using Latent Dirichlet Allocation. 2021. ⟨hal-03258523⟩ https://hal-enpc.archives-ouvertes.fr/X-DEP-MECA/hal-03258523v1