Skeptophilia: cryptography


Thursday, December 1, 2022

The code breakers

I've always been in awe of cryptographers.

I've read a bit about the work British computer scientist and mathematician Alan Turing did during World War II regarding breaking the "unbreakable" Enigma code used by the Germans -- a code that relied on a machine whose settings were changed daily. And while I can follow a description of how Turing and his colleagues did what they did, I can't in my wildest dreams imagine I could do anything like that myself.

I had the same sense of awe when I read Margalit Fox's fantastic book The Riddle of the Labyrinth, which was about the work of linguists Alice Kober and Michael Ventris in successfully translating the Linear B script of Crete -- a writing system for which not only did they not initially know what the symbol-to-sound correspondence was, they didn't know if the symbols represented single sounds, syllables, or entire words -- nor what language the script represented! (Turned out it was Mycenaean Greek.)

I don't know about you, but I'm nowhere near smart enough to do something like that.

Despite my sense that such endeavors are way outside of my wheelhouse, I've always been fascinated by people who do undertake such tasks. Which is why I was so interested in a link a friend of mine sent me about the breaking of a code that had stumped cryptographers for centuries -- the one used by Charles V, Holy Roman Emperor and King of Spain, back in the sixteenth century.

Charles was a bit paranoid, so his creation of a hitherto unbreakable code is definitely in character. When the letter was written, in 1547, he was in a weak position -- he'd signed the Treaty of Crépy tentatively ending aggression with the French, but his ally King Henry VIII of England had just died and was succeeded by his son, the sickly King Edward VI. Charles felt vulnerable...

... and in fact, when the letter was finally decrypted, it was found that it was about his fears of an assassination plot.

As it turned out, the fears were unfounded, and he lived for another eleven years, finally dying of malaria at age 58.

His code remained unbroken until recently, however. But the team of Cécile Pierrot (of the French research institute Inria) and Camille Desenclos was finally able to decipher it, thanks to a lucky find -- another letter between Charles and his ambassador to France, Jean de St. Mauris, which had a partial key scribbled in the margin. That hint included the vital information that nine of the symbols were meaningless, only thrown in to make the code more difficult to break. (Which worked.)

Even with the partial solution in hand, it was still a massive task. As you can see from their solution, most of the consonants can be represented by two different symbols, and double letters are represented by yet another different (single) symbol. There are single symbols that stand for specific people.
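
To get a feel for why those features make a cipher so nasty, here's a toy sketch in Python. To be clear, every symbol in the table below is invented for illustration; it is not the actual key Charles used, and the names SYMBOLS, NULLS, and decode are my own. It only shows the three tricks described above (more than one symbol per consonant, a single symbol for a doubled letter or a person, and meaningless symbols sprinkled in) working together.

```python
# Toy cipher table. Every symbol here is made up for illustration;
# this is NOT the real key to Charles V's cipher.
SYMBOLS = {
    "s1": "m", "s2": "m",      # two different symbols for the same consonant
    "s3": "r", "s4": "r",
    "s5": "t",
    "s6": "a", "s7": "e", "s8": "o",
    "s9": "ss",                # one symbol standing for a doubled letter
    "sK": "[the king]",        # one symbol standing for a specific person
}

# Meaningless symbols, thrown in only to confuse a code breaker.
NULLS = {"n1", "n2"}


def decode(symbols):
    """Turn a list of cipher symbols into plaintext, skipping the nulls."""
    return "".join(SYMBOLS[s] for s in symbols if s not in NULLS)


# Two different-looking messages that decode to the same word.
print(decode(["s1", "n1", "s8", "s4", "s5"]))
print(decode(["s2", "s8", "n2", "s3", "s5"]))
```

Because the same word can be written several ways, and some symbols mean nothing at all, simply counting how often each symbol appears tells a code breaker far less than it would with a plain one-for-one substitution.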

But even with those difficulties, Pierrot and Desenclos managed to break the code.

All of this gives hope to linguists and cryptographers working on the remaining (long) list of writing systems that haven't been deciphered yet. (Wikipedia has a list of scripts that are still not translated -- take a look, you'll be amazed at how many there are.) I'm glad there are people still working on these puzzles. Even if I don't have the brainpower to contribute to the effort, I'm in awe that there are researchers who are allowing us to read writing systems that before were a closed book.

****************************************

Thursday, February 1, 2018

Cracking the code

Long-time readers of Skeptophilia may recall that a while back I did a post on the mysterious and beautiful Voynich Manuscript, a 15th-century illustrated codex that has page after page of writing in an unknown orthography. The manuscript, which is named after Polish bookseller Wilfrid Voynich, who purchased it in 1912, had resisted all attempts to decipher, decode, or translate its text -- or even to establish with any certainty that it was meaningful writing. The failure of the world's best cryptographers and linguists to make sense of it was, to me, a good indication that it was pretty but random -- i.e., most likely a Renaissance-era hoax.

Because, after all, the linguists are pretty damn good at what they do. They even eventually succeeded in translating the odd Linear B script from Crete, when there was no certainty even as to what language it represented, or whether the symbols corresponded to words, syllables, or single sounds. (The success was mostly due to the efforts of the brilliant Alice Kober and Michael Ventris; if you're interested in finding out more, I highly recommend the book The Riddle of the Labyrinth by Margalit Fox, which is fascinating reading.)

Anyhow, the Voynich Manuscript proved to be an intractable problem, which is why it became a favorite of woo-woos who think that The Da Vinci Code is non-fiction. It even inspired one guy, Veikko Latvala of Finland, to attempt a translation from "divine inspiration," producing results that sounded like what you'd get if Charles Darwin had attempted to write The Golden Guide to Flowers while on an acid trip.

A page from the Voynich Manuscript [this image and the one below courtesy of the Wikimedia Commons]

My sense was that it was probably destined to stay in the "intriguing but unsolved" column for the foreseeable future. So I was pretty shocked when a friend and former student sent me a link a couple of days ago about some computer scientists at the University of Alberta who have used a decryption program on the text...

... and have found out that the manuscript is probably written in an encrypted form of Hebrew.

I say "probably" because at this point the scientists, Greg Kondrak and Bradley Hauer, have only the preliminary findings that the script is consistent with Hebrew, and a small piece of it has been translated into a sensible sentence, "She made recommendations to the priest, man of the house and me and people."

Which is a little odd, but it's better than the nonsense Latvala came up with, divine inspiration or not.

Kondrak says that 80% of the words they've translated are in the Hebrew dictionary, which is pretty good evidence they're on to something. He and Hauer are hoping to team up with scholars of ancient Hebrew to try a complete translation.

So it looks like a long-standing mystery may, finally, have been solved. The paper which details their findings, "Decoding Anagrammed Texts Written in an Unknown Language and Script," appeared in Transactions of the Association for Computational Linguistics. So look for further developments soon -- with luck, either confirming their results and delving into translation, or a retraction if this turns out to be a blind alley. In either case, it's nice to know that people are still working on one of the most enduring puzzles in linguistics.

Thursday, March 5, 2015

Piecing together the puzzle

I'm curious about where the human drive to solve puzzles comes from.

It's a cool thing, don't get me wrong. But you have to wonder why it's something so many of us share. We are driven to know things, even things that don't seem to serve any particular purpose in our lives. The process is what's compelling; many times, the answer itself is trivial, once you find it. But still we're pushed onward by an almost physical craving to figure stuff out.

Every few weeks I devote a day in my Critical Thinking classes to solving divergent thinking puzzles. My rationale is that puzzle-solving is like mental calisthenics; if you want to grow your muscles, you exercise, and if you want to sharpen your intellect, you make it work. I tell the students at the outset that they're not graded and that I don't care if they don't get to all of them by the end of the period. You'd think that this would be license for high school students to blow it off, to spend the period chatting, but I find that this activity is one of the ones for which I almost never have to work hard to keep them engaged, despite more than once hearing kids saying things like, "This is making my brain hurt."

Here's a sample -- one of the most elegant puzzles I've ever seen:

A census taker goes to a man's house, and asks for the ages of the man's three daughters.

The man says, "The product of their ages is 36."

The census taker says, "That's not enough information to figure it out."

The man says, "Okay. The sum of their ages is equal to the house number across the street."

The census taker looks out of the window at the house across the street, and says, "That's still not enough information to figure it out."

The man says, "Okay. My oldest daughter has red hair."

The census taker says thank you and writes down the ages of the three daughters.

How old are they?

And yes, I just re-read this, and I didn't leave anything out. It's solvable from what I've given you. Give it a try!
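
If you'd rather let a computer do the grunt work, here's a minimal brute-force sketch in Python (spoiler alert: running it gives away the answer). It rests on two assumptions that the puzzle implies but never states outright: the ages are whole numbers of years, and "my oldest daughter" means exactly one daughter is older than the other two. The function name solve_census_puzzle is mine.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def solve_census_puzzle(product=36):
    """Return the age triples consistent with all three clues."""
    # Clue 1: three whole-number ages (listed youngest to oldest)
    # whose product is 36.
    triples = [
        t for t in combinations_with_replacement(range(1, product + 1), 3)
        if t[0] * t[1] * t[2] == product
    ]

    # Clue 2: the census taker can see the house number and is STILL
    # stuck, so the sum must be shared by more than one triple.
    by_sum = defaultdict(list)
    for t in triples:
        by_sum[sum(t)].append(t)
    ambiguous = [t for group in by_sum.values() if len(group) > 1 for t in group]

    # Clue 3: there is a single oldest daughter (assumption: two
    # daughters of the same age in years don't count as "oldest").
    return [t for t in ambiguous if t[2] > t[1]]


print(solve_census_puzzle())
```

The trick is in the second clue. The census taker knows the sum and still can't answer, so we can throw out every sum that points to just one set of ages, and that's what makes the throwaway remark about red hair enough to finish the job.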

This drive to figure things out, even things with no immediate application, reaches its apogee in two fields that are near and dear to me: science and linguistics. In science, it takes the form of pure research, which, as a scientist friend of mine put it, is "trying to make sense of one cubic centimeter of the universe." To be sure, a lot of pure research results in applications afterwards, but that's not usually why scientists pursue such knowledge. The thrill of pursuit, and the satisfaction of knowing, are motivations in and of themselves.

In linguistics, it has to do with deepening our understanding of how humans communicate, with figuring out the connections between different modes of communication, and with deciphering the languages of our ancestors. It's this last one that spurred me to write this post; just yesterday, I finished reading the phenomenal book The Riddle of the Labyrinth by Margalit Fox, which is the story of how three people set out, one after the other, to crack the code of Linear B.

Linear B was a writing system used in Crete about 3,400 years ago, and for which neither the sound values of the characters, nor the language they encoded, was known. This is the most difficult possible problem for a linguist; in fact, most of the time, such scripts (of which there are a handful of other examples) remain closed doors permanently. If you neither know what sounds the letters represent, nor what language was spoken by the people who wrote them, how could you ever decipher it?

One of the Linear B tablets found at Knossos by Arthur Evans [image courtesy of the Wikimedia Commons]

I'd known about this amazing triumph of human perseverance and intelligence ever since I read John Chadwick's The Decipherment of Linear B when I was in college. I was blown away by the difficulty of the task these people undertook, and their doggedness in pursuing the quest to its end. Chadwick's book is fascinating, but Fox's is a triumph; and you're left with the dual sense of admiration at minds that could pierce such a puzzle, and wonderment at why they felt so driven.

Because once the Linear B scripts were decoded, the tablets and inscriptions turned out to be...

... inventories. Lists of how many jugs of olive oil and bottles of wine they had, how many arrows and spears, how many horses and cattle and sheep. No wisdom of the ancients; no gripping sagas of heroes doing heroic things; no new insights into history.

But none of that mattered. Because of the form that the inscriptions took, Arthur Evans, Alice Kober, and Michael Ventris realized pretty quickly that this was the sort of thing that the Linear B tablets were about. The scholars who deciphered this mysterious script weren't after a solution because they thought the inscriptions said something profound or worth knowing; they devoted their lives to the puzzle because it was one cubic centimeter of the universe that no one had yet made sense of.

That they succeeded is a testimony to this peculiar drive we have to understand the world around us, even when it seems to fall under the heading of "who cares?" We need to know, we humans. Wherever that urge comes from, it becomes an almost physical craving. All three of the people whose work cracked the code were united by one trait: a desperate desire to figure things out. Only one, in fact, had a particularly good formal background in linguistics. The other two were an architect and a wealthy amateur historian and archaeologist. Training wasn't the issue. What allowed them to succeed was persistence, and methodical minds that refused to admit that a solution was out of reach.

The story is fascinating, and by turns tragic and inspirational, but by the time I was done reading it I was left with my original question: why are we driven to know stuff that seems to have no practical application whatsoever? I completely understood how Evans, Kober, and Ventris felt, and in their place I no doubt would have felt the same way, but I'm still at a loss to explain why. It's one of those mysterious filigrees of the human mind, which perhaps is selected for because curiosity and inquisitiveness have high survival value in the big picture, even if they sometimes push us to spend our lives bringing light to some little dark cul-de-sac of human knowledge that no one outside of the field cares, or will even hear, about.

But as the brilliant geneticist Barbara McClintock, whose decades-long persistence in solving the mystery of transposable elements ("jumping genes") eventually resulted in a Nobel Prize, put it: "It is a tremendous joy, the whole process of finding the answer. Just pure joy."