Season 10 Episode 0

OOMC is back for Season 10, and we're kicking things off with a guest appearance from none other than Ayliean, with all things Morse! -·-· ··- - ···· · ·-· · (see you there)
Further Reading
Ayliean
If you like fractal time-lapses, you might like to start with this TikTok.
How many letters are there?
The Welsh alphabet has 29 letters, including some that English doesn't have (written as digraphs, like double-l "ll").
The Unicode Consortium are constantly working on an enormous numbered list of all the characters in all the languages of the world. And also emoji, they’re in the list too.
There's a Tom Scott video with a version of the history of this project. In this video, he’s speaking at An Evening of Unnecessary Detail with Matt Parker.
As of Version 16.0, the Unicode Standard contains 154,998 characters. More or less. Even more than Welsh, because it includes things like snowman-emoji ☃.
Frequency analysis
In German, E is even more common than it is in English! Bear that in mind if you’re translating some encoded text and you think it might be in German. I can’t think why anyone would ever need to do that though, that's an enigma to me.
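If you'd like to try frequency analysis yourself, here is a minimal sketch in Python. The function name `letter_frequencies` and the sample sentence are just made up for illustration, and it simply ignores anything that isn't a letter.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, share) pairs, most common first, ignoring non-letters."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]

# Hypothetical sample text; swap in your own intercepted message.
sample = "Bear that in mind if you're translating some encoded text"
for letter, share in letter_frequencies(sample)[:3]:
    print(letter, round(share, 3))
```

A short sample like this won't match the published letter frequencies for a language very closely; the longer the text, the more useful the counts become.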
Someone in chat asked about avoiding the letter E in your writing to confuse decipherers. You might like A Void by Georges Perec (1969), which is a 300-page novel written without the letter E, in French, which normally uses quite a few Es. Remarkably, it was translated into English by Gilbert Adair in 1995, still without using the letter E.
Presumably to right some sort of imbalance caused by A Void, Georges Perec went on to write another novel in which all of the vowels are E.
Library of Babel
While I’m recommending stories, you might like to know about The Library of Babel, which is a short story by Jorge Luis Borges.
The concept is that there is an enormous library that contains every single combination of letters or punctuation that fits on a standardised page. The library is arranged into hexagonal rooms that each contain a fixed number of books. The people in the library are looking for the singular Crimson Hexagon which they believe to be (1) red, and (2) magic. Possibly it contains true information about the location of other good books, which would be handy.
In a remarkable bit of technical engineering, the Library of Babel now exists as a website. You can browse the shelves (most of which are filled with gibberish, of course), or search for a string of letters. It is not randomly generated. I do not know if the website includes a single crimson webpage.
Steganography
Ayliean mentioned this in passing. It’s the art of hiding a message in plain sight. This might or might not include a layer of encryption. Think about invisible ink perhaps, or highlighting certain letters.
There was a suggestion earlier in the episode about sending messages by blinking in Morse code, and I think you should know that this has actually been used in a high-stakes situation (content warning; this Wikipedia link has descriptions of torture).
For more about codes and cryptography, you might like The Code Book by Simon Singh.
Simon Singh also set up the Parallel Circles series of livestreams, some of which feature Ayliean. We've come full circle!
Huffman coding
Several people in chat mentioned Huffman coding, which I hadn’t heard of. Luckily, there’s a BBC Bitesize GCSE revision article for me to read.
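The rough idea, as I understand it, is that the most common symbols get the shortest codes (a bit like Morse giving E a single dot), and no code is the start of any other code. Here is a minimal sketch in Python of one standard way to build such a code, by repeatedly merging the two least common groups. The name `huffman_codes` and the tie-breaking counter are my own choices for illustration.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a prefix-free binary code where frequent symbols get shorter codes."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit to be sendable.
        return {symbol: "0" for symbol in counts}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # least common group
        w2, _, right = heapq.heappop(heap)   # next least common group
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)
```

In this small example the common letter "a" ends up with a one-bit code, while "b" and "c" get two bits each.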
Lychrel numbers
Ayliean described a process where you reverse a number and add that to the number itself. If you keep doing this, then you usually get a palindrome before long, but there are some starting numbers for which no-one knows whether or not you ever get a palindrome. Informally, the numbers get larger and palindromes get rarer as you go, so maybe the sequence can keep growing without ever hitting one. No-one knows! Here’s a webpage about these numbers.
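The reverse-and-add process is easy to try out. Here is a minimal sketch in Python; the function name `reverse_and_add` and the `max_steps` cut-off are my own assumptions, and the cut-off is needed because for some starting numbers nobody knows whether the loop would ever stop.

```python
def reverse_and_add(n, max_steps=100):
    """Repeat n -> n + reverse(n) until a palindrome appears.

    Returns (steps, palindrome), or None if max_steps is reached first.
    """
    for step in range(1, max_steps + 1):
        n += int(str(n)[::-1])          # add the number to its own reversal
        if str(n) == str(n)[::-1]:      # does it read the same backwards?
            return step, n
    return None

print(reverse_and_add(56))    # (1, 121)
print(reverse_and_add(87))    # (4, 4884)
print(reverse_and_add(196))   # None: no palindrome within 100 steps
```

A result of `None` only means that no palindrome turned up within the cut-off. It doesn't prove that one never will.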
We weren’t going to guess how to spell Lychrel I’m afraid – that word is made up based on someone naming this after their girlfriend Cheryl.
This process reminds me of two things: one is the 1089 trick and the other is the Collatz conjecture.
James’ terrible one-symbol code
Right, here we go. The starting point for this is that I quite like the way Morse code does “5”. It’s five dots, that’s easy to remember. Perhaps if that was the first thing you learned in Morse code, then you would guess that “1” is one dot, “2” is two dots, and “37” is thirty-seven dots. And so on. This is... not a good system. But it only uses a single symbol (dot).
What if you want to send a letter? The obvious suggestion, I think, is that you could convert the letter to a number somehow, and then send that number. Perhaps you could use Unicode. That’s good, because then you have the option to send an emoji instead. For example, let’s say you want to send the snowman emoji ☃ to someone. That’s Unicode symbol number 9731, so you send 9731 beeps down the line. The person receiving the message patiently counts the beeps, then looks up Unicode character 9731 and receives the (seasonally-inappropriate) message: ☃.
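If you don't fancy looking the numbers up by hand, Python's built-in `ord` and `chr` functions do the conversion in each direction. This is just a small sketch; the variable names are made up for illustration.

```python
snowman = "☃"
beeps = ord(snowman)   # Unicode code point as an ordinary decimal number
print(beeps)           # 9731
print(hex(beeps))      # 0x2603, the form usually written as U+2603
print(chr(9731))       # ☃
```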
What if we want to send more than one character? This could be a word or, since spaces and punctuation are just Unicode characters, it could be a whole sentence or paragraph or book.
My very silly proposal is that instead of sending separate numbers (which would require a space, which I count as a second symbol), you could work out the sequence $a_1$, $a_2$, $a_3$, $a_4$, $\dots$, $a_n$ consisting of the number for each Unicode character in your message, and then calculate $$2^{a_1}\times 3^{a_2}\times 5^{a_3}\times 7^{a_4}\times\dots \times p_n^{a_n},$$ where $n$ is the length of your message and the sequence 2, 3, 5, ..., $p_n$ consists of the first $n$ prime numbers. Send that many beeps.
Then the person receiving the message can VERY patiently count the beeps, factorise the number, and they'll get each of the exponents $a_i$ from the prime factorisation.
For example, if I transmit
163,953,148,672,306,785,235,981,696,462,316,901,654,313,754,941,224,016,871,928,042,191,028,152,357,356,386,423,
991,704,445,066,082,282,398,711,507,312,101,674,742,952,521,828,622,795,177,846,780,861,810,409,024,191,857,582,
585,080,628,095,625,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 beeps, then you can factorise that as $2^{72}\times 3^{69}\times 5^{76}\times 7^{76}\times 11^{79}$, then look up the Unicode characters for each of the exponents and work out that I’ve said “HELLO”.
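If you would rather not count the beeps by hand, here is a minimal sketch of the whole scheme in Python. The function names `primes`, `encode` and `decode` are my own inventions, the prime generator uses plain trial division (fine for short messages), and the decoder assumes the message contains no character with Unicode number 0, since an exponent of zero at the end of the message would be invisible.

```python
def primes():
    """Yield 2, 3, 5, 7, ... using simple trial division."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1

def encode(message):
    """Number of beeps: the i-th prime raised to the i-th character's Unicode number."""
    beeps = 1
    for p, ch in zip(primes(), message):
        beeps *= p ** ord(ch)
    return beeps

def decode(beeps):
    """Recover the message by reading off the exponent of each prime in turn."""
    if beeps < 1:
        raise ValueError("the number of beeps must be at least 1")
    chars = []
    for p in primes():
        if beeps == 1:
            break
        exponent = 0
        while beeps % p == 0:     # divide out this prime as often as possible
            beeps //= p
            exponent += 1
        chars.append(chr(exponent))
    return "".join(chars)

n = encode("HELLO")
print(len(str(n)))               # 255 digits
print(decode(n))                 # HELLO
print(len(str(encode("HI"))))    # 57 digits
```

Python's integers can be as large as you like, so the only practical limit here is patience.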
Operators using this code might prefer to use all-caps for speed, as the upper-case letters come before the lower-case letters in Unicode, and therefore require fewer beeps. Operators might also be advised to say HI instead of HELLO.
This system does have two strengths, though. Firstly, since there are infinitely many primes, we can accommodate messages of arbitrary length without using any spaces. Secondly, since the exponents can be as large as we like, we’re future-proofed against the invention of any new emoji.
If you want to get in touch with us about any of the mathematics in the video or the further reading, feel free to email us on oomc [at] maths.ox.ac.uk.