Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.
Showing posts with label language. Show all posts

Saturday, August 30, 2025

The universal language

Sometimes I have thoughts that blindside me.

The last time that happened was a couple of days ago, while I was working in my office and our puppy, Jethro, was snoozing on the floor.  Well, as sometimes happens to dogs, he started barking and twitching in his sleep, and followed it up with sinister-sounding growls -- all the more amusing because while awake, Jethro is about as threatening as your average plush toy.

So my thought, naturally, was to wonder what he was dreaming about.  Which got me thinking about my own dreams, and recalling some recent ones.  I remembered some images, but mostly what came to mind were narratives -- first I did this, then the slimy tentacled monster did that.

That's when the blindside happened.  Because Jethro, clearly dreaming, was doing all that without language.

How would thinking occur without language?  For almost all humans, our thought processes are intimately tied to words.  In fact, the experience of having a thought that isn't describable using words is so unusual that we have a word for it -- ineffable.

Mostly, though, our lives are completely, um, effable.  So much so that trying to imagine how a dog (or any other animal) experiences the world without language is, for me at least, nearly impossible.

What's interesting is how powerful this drive toward language is.  There have been studies of pairs of "feral children" who grew up together but with virtually no interaction with adults, and in several cases those children invented spoken languages with which to communicate -- each complete with its own syntax, morphology, and phonetic structure.

A fascinating study that came out in the Proceedings of the National Academy of Sciences, detailing research by Manuel Bohn, Gregor Kachel, and Michael Tomasello of the Max Planck Institute for Evolutionary Anthropology, showed that you don't even need the extreme conditions of feral children to induce the invention of a new mode of symbolic communication.  The researchers set up Skype conversations between monolingual English-speaking children in the United States and monolingual German-speaking children in Germany, but simulated a computer malfunction where the sound didn't work.  They then instructed the children to communicate as best they could anyhow, and gave them some words/concepts to try to get across.

They started out with some easy ones.  "Eating" resulted in the child miming eating from a plate, unsurprisingly.  But they moved to harder ones -- like "white."  How do you communicate the absence of color?  One girl came up with an idea -- she was wearing a polka-dotted t-shirt, and pointed to a white dot, and got the idea across.

But here's the interesting part.  When the other child later in the game had to get the concept of "white" across to his partner, he didn't have access to anything white to point to.  He simply pointed to the same spot on his shirt that the girl had pointed to earlier -- and she got it immediately.

Language is defined as arbitrary symbolic communication.  Arbitrary because with the exception of a few cases like onomatopoeic words (bang, pow, ping, etc.) there is no logical connection between the sound of a word and its referent.  Well, here we have a beautiful case of the origin of an arbitrary symbol -- in this case, a gesture -- that gained meaning only because the recipient of the gesture understood the context.

I'd like to know if such a gesture-language could gain another characteristic of true language -- transmissibility.  "It would be very interesting to see how the newly invented communication systems change over time, for example when they are passed on to new 'generations' of users," said study lead author Manuel Bohn, in an interview with Science Daily.  "There is evidence that language becomes more systematic when passed on."

In time, might you end up with a language that was so heavily symbolic and culturally dependent that understanding it would be impossible for someone who didn't know the cultural context -- like the Tamarians' language in the brilliant, poignant, and justifiably famous Star Trek: The Next Generation episode "Darmok"?

"Sokath, his eyes uncovered!"

It's through cultural context, after all, that languages start developing some of the peculiarities (also seemingly arbitrary) that led Edward Sapir and Benjamin Whorf to develop the hypothesis that now bears their names -- that the language we speak alters our brains and changes how we understand abstract concepts.  In K. David Harrison's brilliant book The Last Speakers, he tells us about a conversation with some members of a nomadic tribe in Siberia who always described positions of objects relative to the four cardinal directions -- so at the moment my coffee cup wouldn't be on my right, it would be south of me.  When Harrison tried to explain to his Siberian friends how we describe positions, at first he was greeted with outright bafflement.

Then, they all erupted in laughter.  How arrogant, they told him, that you see everything as relative to your own body position -- as if when you turn around, suddenly the entire universe changes shape to compensate for your movement!



Another interesting example of this was the subject of a 2017 study by linguists Emanuel Bylund and Panos Athanasopoulos, and focused not on our experience of space but of time.  And they found something downright fascinating.  Some languages (like English) are "future-in-front," meaning we think of the future as lying ahead of us and the past behind us, turning time into something very much like a spatial dimension.  Other languages retain the spatial aspect, but reverse the direction -- such as the Peruvian language of Aymara.  For them, the past is in front, because you can remember it, just as you can see what's in front of you.  The future is behind you -- therefore invisible.

Mandarin takes the spatial axis and turns it on its head -- the future is down, the past is up (so the literal translation of the Mandarin expression for "next week" is "down week").  Asked to order photographs of someone in childhood, adolescence, adulthood, and old age, Mandarin speakers will place them vertically, with the youngest on top.  English and Swedish speakers tend to think of time as a line running from left (past) to right (future); Spanish and Greek speakers tend to picture time as a spatial volume, as if it were something filling a container (so emptier = past, fuller = future).

All of which underlines how fundamental to our thinking language is.  And further baffles me when I try to imagine how other animals think.  Because whatever Jethro was imagining in his dream, he was clearly understanding and interacting with it -- even if he didn't know to attach the word "squirrel" to the concept.

****************************************


Saturday, July 19, 2025

Footprints

The southern tip of mainland Italy is called Calabria.  It's a strikingly beautiful place, containing three national parks (Pollino National Park, Sila National Park, and Aspromonte National Park), and a stretch of coastline -- near Reggio, facing across the Straits of Messina to Sicily -- that poet Gabriele D'Annunzio called "the most beautiful kilometer in Italy."  It's a region blessed with more than its share of dramatic scenery.

[Image licensed under the Creative Commons Cliff at Tropea, Italy, Sep 2005, CC BY-SA 2.5]

Calabria forms the "toe of Italy's boot."  I remember noticing the country's odd shape when I was a kid and first became fascinated with maps (a fascination that remains with me today), and wondering why it looked like that; back then, when plate tectonics was still a new science, I doubt they really understood it on a level any deeper than "it's near a plate margin, and that moves stuff around."  Today, we have a much more detailed understanding of the geology of the area, and it is complex.

Tectonic map of southern Italy and Sicily [Image licensed under the Creative Commons Jpvandijk, J.P. van Dijk, Janpieter van Dijk, Johannes Petrus van Dijk, CentralMediterranean-GeotectonicMap, CC BY-SA 4.0]

On its simplest level, the entire southern half of Italy is being pushed to the southeast, and it's riding up and over the northern edge of the African Plate.  This process is responsible not only for the volcanism of the region -- Mount Etna being the most obvious example -- but the massive earthquakes that have shaped it, in part creating the gorgeous topography.  (It also has made it a dangerous place to live.  The Messina Earthquake of 1908, with an epicenter right across the straits from Calabria, had a magnitude of 7.1 and killed an estimated eighty thousand people, most of them in the first three minutes after the quake struck and the majority of the buildings collapsed.)

As interesting as the geology of the region is, that's not what spurred me to write about the topic today.  What I'd like to tell you about is Calabria's tremendous linguistic diversity, an embarrassment of riches packed into a small geographical area.  The main language, of course, is standard Italian, but a great many people there (especially in the southern parts) speak Calabrian, a Greek-influenced Latin derivative that is mostly mutually intelligible with Italian but has some distinct vocabulary and pronunciations.

Then there's Grecanico, which is derived from an archaic dialect of Byzantine Greek, and is spoken by a group of people descended from folks who settled in the region more than a thousand years ago and have somehow maintained their ethnic identity the whole time.  It's written with the Latin, not Greek, alphabet -- but other than that has more in common with Thessalian Greek than with Italian.

Another language that has little to do with Italian is Arbëresh, a dialect of Albanian brought in with migrants during the Late Middle Ages.  From some of its idiosyncrasies, it appears to be related to Tosk Albanian, a group of dialects spoken in the southern parts of Albania, near the border of Greece.  It's astonishing that we can still identify the part of the world the ancestors of the Arbëreshë people came from centuries ago -- by the peculiarities of the language they have spoken during the more than six hundred years they've lived in isolated communities in Calabria.

Finally, there's Gardiol, which is related to Occitan (also known as Provençal or Languedoc), the Romance language widely spoken in the southern half of France.  Like with Calabrian (and also Catalan in Spain), most Occitan speakers in France speak the majority language as well, but use Occitan when speaking with family, friends, and locals.  The ancestors of the speakers of Gardiol came in with the persecution of the Waldensian "heretics" in France in the thirteenth century, who found a refuge in a thinly-populated part of northern Calabria.  Once again -- amazingly -- they've retained their ethnic identity and language through all the vagaries of time since their arrival.

All of that -- and standard Italian as well -- in an area of around fifteen thousand square kilometers, a little more than the size of the state of Connecticut.

UNESCO describes all four of these languages -- Calabrian, Grecanico, Arbëresh, and Gardiol -- as "in serious danger of disappearing."  It's sad to think of these footprints of history vanishing, and taking along with them pieces of human culture that somehow had persisted for centuries.  I understand why this happens; in modern life, speaking and writing the dominant language is not only useful, it's often essential for getting a job and making a living.  These little pockets of other languages survived better when people had little mobility and even less connectedness to others living far away.  In today's world, they seem doomed.

Change is the fate of all things, but it inevitably comes with a sense of loss.  The linguistic diversity of the beautiful region of Calabria will, very likely, soon be gone.  Like biodiversity loss, this diminishes the richness of our world.  I hope that linguists are working to catalog and study these unique languages -- before the last native speakers are gone forever.

****************************************


Monday, July 7, 2025

Dord, fnord, and nimrod

We were having dinner with our younger son a while back, and he asked if there was a common origin for the -naut in astronaut and the naut- in nautical.

"Yes," I said.  "Latin nauta, meaning 'sailor.'  Astronaut literally means 'star sailor.'  Also cosmonaut, but that one came from Latin to English via Russian."

"How about juggernaut?" he asked.

"Nope," I said.  "That's a false cognate.  Juggernaut comes from Hindi, from the name of a god, Jagannath.  Every year on the festival day for Jagannath, they'd bring out his huge stone statue on a wheeled cart, and the (probably apocryphal) story is that sometimes it would get away from them, and roll down the hill and crush people.  So it became a name for a destructive force that gets out of hand."

Nathan stared at me for a moment.  "How the hell do you know this stuff?" he asked.

"Two reasons.  First, M.A. in historical linguistics.  Second, it takes up lots of the brain space that otherwise would be used for less important stuff, like where I put my car keys and remembering to pay the utility bill."

I've been fascinated with words ever since I was little, which probably explains not only my degree but the fact that I'm a writer.  And it's always been intriguing to me how words not only shift in spelling and pronunciation, but shift in meaning, and can even pop into and out of existence in strange and unpredictable ways.  Take, for example, the word dord, that for eight years was in the Merriam-Webster New International Dictionary as a synonym for "density."  In 1931, Austin Patterson, the chemistry editor for Merriam-Webster, sent in a handwritten editing slip for the entry for the word density, saying, "D or d, cont./density."  He meant, of course, that in equations, the variable for density could either be a capital or a lower case letter d.  Unfortunately, the typesetter misread it -- possibly because Patterson's writing left too little space between words -- and thought that he was proposing dord as a synonym.

Well, the chemistry editor should know, right?  So into the dictionary it went.

It wasn't until 1939 that editors realized they couldn't find an etymology for dord, figured out how the mistake had come about, and the word was removed.  By then, though, it had found its way into other books.  It's thought that the error wasn't completely expunged until 1947 or so.

Then there's fnord, which is a word coined in 1965 by Kerry Thornley and Greg Hill as part of the sort-of-parody, sort-of-not Discordian religion's founding text Principia Discordia.  It refers to a stimulus -- usually a word or a picture -- that people are trained as children not to notice consciously, but that when perceived subliminally causes feelings of unease.  Government-sponsored mind-control, in other words.  It really took off when it was used in the 1975 Illuminatus! Trilogy, by Robert Shea and Robert Anton Wilson, which became popular with the counterculture of the time (for obvious reasons).

Fnord isn't the only word that came into being because of a work of fiction.  There's grok, meaning "to understand on a deep or visceral level," from Robert Heinlein's novel Stranger in a Strange Land.  A lot of you probably know that the quark, the fundamental particle that makes up protons and neutrons, was named by physicist Murray Gell-Mann after the odd line from James Joyce's Finnegans Wake, "Three quarks for Muster Mark."  Less well known is that the familiar word robot is also a neologism from fiction, from Czech writer Karel Čapek's play R.U.R. (Rossum's Universal Robots); robota in Czech means "hard labor, drudgery," so by extension, the word took on the meaning of the mechanical servant who performed such tasks.  Our current definition -- a sophisticated mechanical device capable of highly technical work -- has come a long way from the original, which was closer to slave.

Sometimes words can, more or less accidentally, migrate even farther from their original meaning than that.  Consider nimrod.  It was originally a name, referenced in Genesis 10:8-9 -- "Then Cush begat Nimrod; he began to be a mighty one in the Earth.  He was a mighty hunter before the Lord."  Well, back in 1940, the episode of Looney Tunes called "A Wild Hare" was released, the first of many surrounding the perpetual chase between hunter Elmer Fudd and the Wascally Wabbit.  In the episode, Bugs calls Elmer "a poor little Nimrod" -- poking fun at his being a hunter, and a completely inept one at that -- but the problem was that very few kids in 1940 (and probably even fewer today) understood the reference and connected it to the biblical character.  Instead, they thought it was just a humorous word meaning "buffoon."  The wild (and completely deserved) popularity of Bugs Bunny led to the original allusion to "a mighty hunter" being swamped; ask just about anyone today what nimrod means and they're likely to say something like "an idiot."


Interestingly, another of Bugs's attempted coinages meaning "a fool" -- maroon, from the hilarious 1953 episode "Bully for Bugs" -- never caught on in the same way.  When he says about the bull, "What a maroon!", just about everyone got the joke, probably because both the word he meant (moron) and the conventional definition of the word he said (a purplish-red color) are familiar enough that we realized he was mispronouncing a word, not coining a new one.


It's still funny enough, though, that I've heard people say "What a maroon!" when referring to someone who's dumb -- but as a quote from a fictional character, not because they think it's the correct word.

Languages shift and flow constantly.  Fortunately for me, since language evolution is my area of study.  It's why the whole prescriptivism vs. descriptivism battle is honestly pretty comical -- the argument over whether, respectively, linguists are recording the way languages should be used (forever and ever amen), or simply describing how they are used.  Despite the best efforts of the prescriptivists, languages change all the time, sometimes in entirely sudden and unpredictable ways.  Slang words are the most obvious examples -- when I was a teacher, I was amazed at how slang came and went, how some words would be en vogue one month and passé the next, while others had real staying power.  (And sometimes resurface.  I still remember being startled the first time I heard a student unironically saying "groovy.")

But that's part of the fun of it.  That our own modes of communication change over time, often in response to cultural phenomena like books, television, and movies, is itself an interesting feature of our ongoing attempt to be understood. 

And I'm sure Bugs would be proud of how he's influenced the English language, even if it was inadvertent.

****************************************


Saturday, June 21, 2025

The labyrinths of meaning

A recent study found that regardless of how thoroughly AI-powered chatbots are trained with real, sensible text, they still have a hard time recognizing passages that are nonsense.

Given pairs of sentences, one of which makes semantic sense and the other of which clearly doesn't -- in the latter category, "Someone versed in circumference of high school I rambled" was one example -- a significant fraction of large language models struggled with telling the difference.

In case you needed another reason to be suspicious of what AI chatbots say to you.

As a linguist, though, I can confirm how hard it is to detect and analyze semantic or syntactic weirdness.  Noam Chomsky's famous example "Colorless green ideas sleep furiously" is syntactically well-formed, but has multiple problems with semantics -- something can't be both colorless and green, ideas don't sleep, you can't "sleep furiously," and so on.  How about the sentence, "My brother opened the window the maid the janitor Uncle Bill had hired had married had closed"?  This one is both syntactically well-formed and semantically meaningful, but there's definitely something... off about it.

The problem here is called "center embedding," which is when there are nested clauses, and the result is not so much wrong as it is confusing and difficult to parse.  It's the kind of thing I look for when I'm editing someone's manuscript -- one of those, "Well, I knew what I meant at the time" kind of moments.  (That this one actually does make sense can be demonstrated by breaking it up into two sentences -- "My brother opened the window the maid had closed.  She was the one who had married the janitor Uncle Bill had hired.")

Then there are "garden-path sentences" -- named for the expression "to lead (someone) down the garden path," to trick them or mislead them -- when you think you know where the sentence is going, then it takes a hard left turn, often based on a semantic ambiguity in one or more words.  Usually the shift leaves you with something that does make sense, but only if you re-evaluate where you thought the sentence was headed to start with.  There's the famous example, "Time flies like an arrow; fruit flies like a banana."  But I like even better "The old man the boat," because it only has five words, and still makes you pull up sharp.

The water gets even deeper than that, though.  Consider the strange sentence, "More people have been to Berlin than I have."

This sort of thing is called a comparative illusion, but I like the nickname "Escher sentences" better because it captures the sense of the problem.  You've seen the famous work by M. C. Escher, "Ascending and Descending," yes?


The issue with both Escher's staircase and the statement about Berlin is that if you look at smaller pieces of it, everything looks fine; the problem only comes about when you put the whole thing together.  And like Escher's trudging monks, it's hard to pinpoint exactly where the problem occurs.

I remember a student of mine indignantly telling a classmate, "I'm way smarter than you're not."  And it's easy to laugh, but even the ordinarily brilliant and articulate Dan Rather slipped into this trap when he tweeted in 2020, "I think there are more candidates on stage who speak Spanish more fluently than our president speaks English."

It seems to make sense, and then suddenly you go, "... wait, what?"

An additional problem is that words frequently have multiple meanings and nuances -- which is the basis of wordplay, but would be really difficult to program into a large language model.  Take, for example, the anecdote about the redoubtable Dorothy Parker, who was cornered at a party by an insufferable bore.  "To sum up," the man said archly at the end of a long diatribe, "I simply can't bear fools."

"Odd," Parker shot back.  "Your mother obviously could."

A great many of Parker's best quips rely on a combination of semantic ambiguity and idiom.  Her review of a stage actress that "she runs the gamut of emotions from A to B" is one example, but to me, the best is her stinging jab at a writer -- "His work is both good and original.  But the parts that are good are not original, and the parts that are original are not good."

Then there's the riposte from John Wilkes, a famously witty British Member of Parliament in the last half of the eighteenth century.  Another MP, John Montagu, 4th Earl of Sandwich, was infuriated by something Wilkes had said, and sputtered out, "I predict you will die either on the gallows or else of some loathsome disease!"  And Wilkes calmly responded, "Which it will be, my dear sir, depends entirely on whether I embrace your principles or your mistress."

All of this adds up to the fact that languages contain labyrinths of meaning and structure, and we have a long way to go before AI will master them.  (Given my opinion about the current use of AI -- which I've made abundantly clear in previous posts -- I'm inclined to think this is a good thing.)  It's hard enough for human native speakers to use and understand language well; capturing that capacity in software is, I think, going to be a long time coming.

It'll be interesting to see at what point a large language model can parse correctly something like "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."  Which is both syntactically well-formed and semantically meaningful.  

Have fun piecing together what exactly it does mean.

****************************************


Tuesday, May 20, 2025

Talking to the animals

An Introduction to Language (by Victoria Fromkin and Robert Rodman, Third Edition, 1974) defines language as "rule-governed arbitrary symbolic communication."

The "rule-governed" and "arbitrary" parts might seem contradictory, but they're not.  That language has rules is self-evident whether you are a prescriptivist (someone who believes there are correct and incorrect ways to use language) or a descriptivist (someone who believes that as long as communication is occurring, it's language; so the primary role of the linguist is not to enforce rules but to document them).  Being that my master's degree is in historical linguistics, I'm strongly of a descriptivist bent; if I thought there were an inflexible lexicon and set of grammatical rules that never ever changed, I'd kind of be out of a job.

The arbitrary part is less obvious.  It has to do with the sound-to-meaning correspondence.  Dog in English is inu in Japanese, chien in French, kare in Hausa, and hundur in Icelandic; none of those words are, in fact, especially doggy in nature.  Other than a handful of onomatopoeic words like bang, oink, meow, and hiccup, the connection between a word and its meaning is essentially accidental.

Curiously, humans are the only species on Earth that we are certain have true language, by the Fromkin and Rodman definition.  There's long been a suspicion that dolphin and whale vocalizations might be language, but as of this writing, that remains conjecture.  Recently, there have been some interesting studies of other primates indicating that certain features of language might exist outside of Homo sapiens -- a paper out of the University of Warwick last week suggests that orangutan vocalizations might exhibit recursion, the nesting structure you see in the children's rhyme "This is the House That Jack Built."  The researchers found that the sounds orangutans make are grouped into clusters, and those clusters put together in at least two additional tiers of structure, hinting that their vocalizations might have a much richer information-carrying capacity than we'd thought.

Another recent study, this one out of the University of Vienna, found that chimps might use drumming as a means of long-distance communication -- that the spacing of beats when they drum on tree roots varies but is non-random.  Like the recursion found in orangutans, the fact that the rhythm of drumming in chimps isn't just random noise opens up the possibility that it might be meaningful.  The researchers found that different chimps have different rhythmic styles, and that groups also developed their own unique patterns of drumming -- suggesting that drumming in chimps could be a cultural phenomenon.

How we developed language, and (likely) no other extant species did, is still open to question.  There are some interesting genetic pieces to the puzzle; the forkhead box protein 2 (FOX-P2) gene seems to be an important one, as the human variant of FOX-P2 isn't found in any known living species other than ourselves, and mutations in that sequence result in significant problems with learning and utilizing language.  (Genetic studies of Neanderthal remains found that Neanderthals had an identical FOX-P2 gene to that of modern humans; obviously we can't be sure that they had language, but it seems likely.)

[Image licensed under the Creative Commons Emw, Protein FOX-P2 PDB 2a07, CC BY-SA 3.0]

Actually, it was genetics that got me thinking about this topic today; yet another study, this one out of Rockefeller University and Cold Spring Harbor Laboratory, did a gene insertion on mice, replacing the murine version of the NOVA-1 gene with the human variant.  The human NOVA-1 has only a single base pair substitution as compared with that of other mammals, but, as with FOX-P2, damage to this gene is known to impair language learning and production.

And when you replace a mouse embryo's NOVA-1 gene with a human's, the resulting adult mouse is capable of making strikingly more complex vocalizations than your ordinary mouse can do.

"When adult male mice were genetically altered with the human NOVA-1 variant, their squeaks during courtship didn't become higher pitched like the pups," said Robert Darnell, who was lead author on the paper.  "Instead, their vocalizations included more complex syllables.  They 'talked' differently to the female mice.  One can imagine how such changes in vocalization could have a profound impact on evolution....  NOVA-1 encodes a protein that can cut out and rearrange sections of messenger RNA when it binds to neurons.  This changes how brain cells synthesize proteins, probably creating molecular diversity in the central nervous system...  The 'humanized' mice with the NOVA-1 variant had molecular changes in the RNA splicing seen in brain cells, especially in regions associated with vocal behavior."

So we're one step closer to figuring out a uniquely human phenomenon.  That communication in the animal world exists on a spectrum of complexity is certain, but by the Fromkin/Rodman definition, we're kind of it for true language, as far as we know.  How we gained that ability is still not entirely clear, but its advantages are obvious -- and it may be that mutations in two regulatory genes are what kickstarted a capacity for chatter that in large part is responsible for our dominance of the entire biosphere.

****************************************


Monday, May 19, 2025

The loss of memory

British science historian James Burke has a way of packing a lot of meaning into a small space.

I still recall the first time I watched his amazing series The Day the Universe Changed, in which he looked at moments in history that radically altered the direction of human progress.  The final installment, titled "Worlds Without End," had several jaw-hanging-open scenes, but one that stuck with me was near the beginning, where he's recapping some of the inventions that had led to our current scientific outlook and high-tech world.  "In the fifteenth century," Burke said, "the invention of the printing press by Johannes Gutenberg took our memories away."

Being someone who has always loved the written word, it had honestly never occurred to me that writing -- and, even more, mass printing -- had a downside: the fact that we no longer have to commit information to memory, but can rely on what amount to external memory storage devices.  Burke, of course, is hardly the first person to make this observation.  Back in around 370 B.C.E., Socrates (as recorded by his disciple Plato in the dialogue Phaedrus) comments that the invention of writing is as much a curse as a blessing, a viewpoint he frames as a discussion between the Egyptian gods Thamus and Thoth, the latter of whom is credited with the creation of Egyptian hieroglyphics:

"This invention, O king," said Thoth, "will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and wisdom that I have discovered."  But Thamus replied, "Most ingenious Thoth, one man has the ability to beget arts, but the ability to judge of their usefulness or harmfulness to their users belongs to another; and now you, who are the father of letters, have been led by your affection to ascribe to them a power the opposite of that which they really possess.

"For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory.  Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them.  You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise."

Socrates also points out that once written, a text is open to anyone's interpretation; it can't say, "Hey, wait, that's not what I meant":

I cannot help feeling, Phaedrus, that writing is unfortunately like painting; for the creations of the painter have the attitude of life, and yet if you ask them a question they preserve a solemn silence.  And the same may be said of speeches.  You would imagine that they had intelligence, but if you want to know anything and put a question to one of them, the speaker always gives one unvarying answer.  And when they have been once written down they are tumbled about anywhere among those who may or may not understand them, and know not to whom they should reply, to whom not: and, if they are maltreated or abused, they have no parent to protect them; and they cannot protect or defend themselves.

And certainly he has a point.  A writer can write down nonsense just as easily as universal truth, and (as I've found out with my own writing!) two people reading the same passage can come to completely different conclusions about what it means.  Even the most careful and skillful writing can't avoid all ambiguity.

I'm not clear that we're on any surer footing with the oral tradition, though.  Not only do we have the inevitable "mutations" in lineages passed down orally (a phenomenon that was used to brilliant effect by anthropologist Jamshid Tehrani in his delightful research into the phylogeny of "Little Red Riding Hood"), there's the problem that suppression of cultures by invasion, colonization, or conquest often wipes out (or at least drastically alters) the cultural memory.

How much of our history, mythology, and knowledge has been erased simply because the last person who had the information died without ever passing it on?

[Image licensed under the Creative Commons Planemad, Chart of world writing systems, CC BY-SA 3.0]

Swiss philosopher Jean-Jacques Rousseau seems to side with Socrates, though.  In his Essay on the Origin of Languages, he writes:

Writing, which would seem to crystallize language, is precisely what alters it.  It changes not the words but the spirit, substituting exactitude for expressiveness.  Feelings are expressed in speaking, ideas in writing.  In writing, one is forced to use all the words according to their conventional meaning.  But in speaking, one varies the meanings by varying one’s tone of voice, determining them as one pleases.  Being less constrained to clarity, one can be more forceful.  And it is not possible for a language that is written to retain its vitality as long as one that is only spoken.

I wonder about that last bit.  Chinese has been a written language for over three millennia, and I think you'd be hard-pressed to defend the opinion that it has "lost its vitality."  Seems to me that like most arguments of this ilk, the situation is complex.  Writing down our ideas may mean losing nuance and increasing the dependence on interpretation, but the gain in (semi-) permanence is pretty damn important.

And of course, this has bearing on our own century's old-school pearl-clutching: people decrying the shift toward electronic (rather than print) media, and, in English, the fact that cursive isn't being taught in many elementary schools.  My guess is that, as with the loss of memory Socrates predicted and Rousseau's concerns over the "crystallization" of language into something flat and dispassionate, the human mind -- and our ability to communicate meaningfully -- will survive this latest onslaught.

So I'm still in favor of the written word.  Obviously.  My own situation is a little like the exchange between the Chinese philosophers Lao Tsu and Zhuang Zhou.  Lao Tsu, in his book Tao Te Ching, famously commented, "Those who say don't know, and those who know don't say."  To which Zhuang Zhou wryly responded, "If 'those who say don't know and those who know don't say,' why is Lao Tsu's book so long?"

****************************************


Thursday, May 15, 2025

Borrowers and lenders

My master's thesis is titled, "The Linguistic Effects of the Viking Invasions on England and Scotland," which should put it in contention for winning the Scholarly Research With The Least Practical Applications Award.

Even so, I still think it's a pretty interesting topic.  My contention was that the topography of the two countries is a big part of the reason that their languages, Old English and Old Gaelic respectively, were affected so differently.  England, with its largely level countryside and a networked road system even back then, adopted hundreds of Old Norse borrow-words into every lexical category, even though the explicit rule by Scandinavia (the "Danelaw") was confined to the eastern half of the country and only lasted two centuries.  Hundreds of place names in England are Norse in origin; any town ending in "-by" owes that part of its name to the Norse word for "town."  (Similarly, places ending in -thorpe, -thwaite, -foss, -toft, or -ness reflect a Norse influence; and all the streets in the city of York that end in -gate -- well, gata is Old Norse for "street.")

The usual pattern is that languages borrow words for concepts they didn't already have covered, but Old English saw Norse supersede even perfectly good native words that were in wide use.  The result is that Modern English has way more words of Norse origin than you might expect, including many in the common, everyday vocabulary.  A few examples of the more than two hundred documented Norse borrow-words:

  • window
  • gift
  • sky
  • egg
  • scare
  • scream
  • anger
  • awkward
  • fellow

Even the pronoun "they" is Norse in origin; the Old English words for "he," "she," and "they," hé, héo, and híe, respectively, were pronounced so much alike that it could be confusing knowing who you were talking about.  The practical English fixed this by palatalizing héo to she and adopting the Norse third-person plural pronoun þeir (genitive þeira) as our modern "they" and "their."

Gaelic, though, responded differently.  Scotland was (and is) rugged terrain, and the big settlements tended to be clustered around the coast and inland waterways.  Even though Scandinavian rule in Scotland lasted much longer -- Norwegian rule of the Hebrides didn't end until 1266 -- the influence on the language was minor, and largely restricted to place names (the -ey found in the names of lots of the islands of Scotland simply means "island" in Old Norse) and terms related to living near water.  The Gaelic words for net, sail, anchor, boat, ford, delta, beach, seagull, seaweed, and skiff are all Norse in origin, but of the common vocabulary, only a few are (including the words for noise, shoe, guide, time, and scatter).

[Nota bene: The Orkneys were a different matter entirely.  Norse rule in the Orkneys continued until 1472, and the people there actually lost Gaelic altogether.  Until the eighteenth century the main language was Norn, a dialect of West Norse, at which point it was superseded by the Orcadian dialect of Scots English.  The last native speaker of Norn died in 1850.]

Of course, English is an amalgam of a great many languages; not only did the Vikings leave their thumbprint on it, but the Normans in the eleventh century brought in a great many words of French origin.  Additionally, a lot of our technical vocabulary comes from Latin and Greek.  Until the eighteenth century, English was kind of a backwater language spoken only by people in one corner of Europe, so when scientists and other academics from different countries were communicating, they usually did so in Latin.  The result is that we still have a ton of Latin and Greek borrow-words in English, including most of our scientific, legal, and scholarly vocabulary.  To demonstrate how dependent the sciences are on Latin and Greek roots, the brilliant science fiction author Poul Anderson wrote a piece on the atomic theory using only words native to Old English -- and the result ("Uncleftish Beholding") sounds like some ancient mythological tale, and gives you an idea of just how much Latin and Greek have influenced the cadence of our language.  Here's a short excerpt to give the flavor, but you really should read the whole thing, because it's just that wonderful:

For most of its being, mankind did not know what things are made of, but could only guess.  With the growth of worldken, we began to learn, and today we have a beholding of stuff and work that watching bears out, both in the workstead and in daily life.

The underlying kinds of stuff are the *firststuffs*, which link together in sundry ways to give rise to the rest.  Formerly we knew of ninety-two firststuffs, from waterstuff, the lightest and barest, to ymirstuff, the heaviest. Now we have made more, such as aegirstuff and helstuff.

The firststuffs have their being as motes called *unclefts*.  These are mighty small; one seedweight of waterstuff holds a tale of them like unto two followed by twenty-two naughts.  Most unclefts link together to make what are called *bulkbits*.  Thus, the waterstuff bulkbit bestands of two waterstuff unclefts, the sourstuff bulkbit of two sourstuff unclefts, and so on.  (Some kinds, such as sunstuff, keep alone; others, such as iron, cling together in ices when in the fast standing; and there are yet more yokeways.)  When unlike unclefts link in a bulkbit, they make *bindings*.  Thus, water is a binding of two waterstuff unclefts with one sourstuff uncleft, while a bulkbit of one of the forestuffs making up flesh may have a thousand thousand or more unclefts of these two firststuffs together with coalstuff and chokestuff.

Everywhere English speakers went -- which, for better or worse, was kind of everywhere -- we picked up and adopted new words.  The result is a rich, often confusing patchwork quilt of a language, with strange sound-to-spelling correspondences, remnants of grammar and morphology from a dozen different places, and weird attempts to blend it all together.  (I don't know how many times I told students that the plurals of hippopotamus and rhinoceros were not hippopotami and rhinoceri.  That'd be trying to pluralize them like Latin words, and they're actually Greek -- hippopotamus is Greek for "river horse," and rhinoceros for "nose horn" -- so if you want to be fancy about it, it'd be hippoipotamou and rhinoucerates.  But that sounds pretentious as hell, so let's stick with hippopotamuses and rhinoceroses.)

Anyhow, that's our excursion into our peculiar hodgepodge of a language.  Hodgepodge, by the way, is French in origin, from hochepot, meaning "a stew."  The hoche part comes from the Old French verb hocher, meaning "to shake," which is itself Germanic in origin.

Okay, I'd better stop here.  I could do this all day.

****************************************


Friday, January 3, 2025

Word search

I've always wondered why words have the positive or negative connotations they do.

Ask people what their favorite and least-favorite sounding words are, and you'll find some that are easily explicable (vomit regularly makes the "least-favorite" list), but others are kind of weird.  A poll of linguists identified the phrase cellar door as being the most beautiful-sounding pair of words in the English language -- and look at how many names from fantasy novels have the same cadence (Erebor, Aragorn, Celeborn, Glorfindel, Valinor, to name just a handful from the Tolkien mythos).  On the other hand, I still recall passing a grocery store with my son one day and seeing a sign in the window that said, "ON SALE TODAY: moist, succulent pork."

"There it is," my son remarked.  "A single phrase made of the three ugliest words ever spoken."

Moist, in fact, is one of those universally loathed words; my surmise is the rather oily sound of the /oi/ combination, but that's hardly a scholarly analysis.  The brilliant British comedian Miranda Hart had her own unique take on it:


Another question is why some words are easier to bring to mind than others.  This was the subject of a fascinating paper in Nature Human Behaviour titled, "Memorability of Words in Arbitrary Verbal Associations Modulates Memory Retrieval in the Anterior Temporal Lobe," by neuroscientists Weizhen Xie, Wilma A. Bainbridge, Sara K. Inati, Chris I. Baker, and Kareem A. Zaghloul of the National Institutes of Health.  Spurred by a conversation at a Christmas party about why certain faces are memorable and others are not, study lead author Weizhen Xie wondered if the same was true for words -- and, if so, whether it could lead to more accuracy in cognitive testing for patients showing memory loss or incipient dementia.

"Our memories play a fundamental role in who we are and how our brains work," Xie said in an interview with Science Daily.  "However, one of the biggest challenges of studying memory is that people often remember the same things in different ways, making it difficult for researchers to compare people's performances on memory tests.  For over a century, researchers have called for a unified accounting of this variability.  If we can predict what people should remember in advance and understand how our brains do this, then we might be able to develop better ways to evaluate someone's overall brain health."

What the team did is as fascinating as it is simple: they showed test subjects pairs of functionally unrelated words (say, "hand" and "apple"), and afterward, tested them by giving them one word and asking them to try to recall what word it was paired with.  What they found is that some words were easy to recall regardless of what they were paired with and whether they came first or second in the pair; others were more difficult, again irrespective of position or pairing.

"We saw that some things -- in this case, words -- may be inherently easier for our brains to recall than others," said study senior author Kareem Zaghloul.  "These results also provide the strongest evidence to date that what we discovered about how the brain controls memory in this set of patients may also be true for people outside of the study."

[Image licensed under the Creative Commons Mandeep Singh, Emotions words, CC BY 4.0]

Neither the list of easy-to-remember words nor the list of harder-to-remember ones shows any obvious commonality (such as abstract versus concrete nouns, or long words versus short ones) that would explain the difference.  Each list included some extremely common words and some less common ones -- tank, doll, and pond showed up on the memorable list, and street, couch, and cloud on the less-memorable list.  It was remarkable how consistent the pattern was; the results were unequivocal even when the researchers controlled for such factors as educational level, age, gender, and so on.

"We thought one way to understand the results of the word pair tests was to apply network theories for how the brain remembers past experiences," Xie said.  "In this case, memories of the words we used look like internet or airport terminal maps, with the more memorable words appearing as big, highly trafficked spots connected to smaller spots representing the less memorable words.  The key to fully understanding this was to figure out what connects the words."

The surmise is that it has to do with the way our brains network information.  Certain words might act as "nodes" -- memory points that connect functionally to a great many different concepts -- so the brain more readily lands on those words when searching.  Others, however familiar and common they might be, act more as "dead-ends" in brain networking, making only a few conceptual links.  Think of it as trying to navigate through a city -- some places are easy to get to because there are a great many paths that lead there, while others require a specific set of roads and turns.  In the first case, you can get to your destination even if you make one or two directional goofs; in the second, one wrong turn and you're lost.
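For the programmers in the audience, here's a toy sketch of that city-map idea.  To be clear, the word links below are ones I invented for illustration; they are not from the study's data, and the function names are hypothetical.  All it shows is that a "node" word can be reached from many more starting points than a "dead-end" word within the same number of steps.

```python
from collections import deque

# Hypothetical association links, made up purely for illustration.
# They are NOT taken from the study's data.
EDGES = [
    ("pond", "water"), ("pond", "fish"), ("pond", "frog"),
    ("pond", "duck"), ("pond", "garden"),
    ("water", "fish"), ("garden", "frog"),
    ("couch", "sofa"), ("sofa", "garden"),
]

def build_graph(edges):
    """Turn a list of word pairs into an undirected adjacency map."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def reachable_within(graph, target, max_steps):
    """Count how many other words are at most max_steps links from target."""
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        word, dist = frontier.popleft()
        if dist == max_steps:
            continue  # do not expand past the step limit
        for neighbor in graph[word]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return len(seen) - 1  # exclude the target itself

graph = build_graph(EDGES)
for word in ("pond", "couch"):
    print(word, len(graph[word]), reachable_within(graph, word, 2))
```

In this made-up map, "pond" has five direct links and "couch" only one, so a search that wanders a step or two off course is still far more likely to land on "pond."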

All of which is fascinating. I know as I've gotten older I've had the inevitable memory slowdown, which most often manifests as my trying to recall a word I know that I know. I often have to (with some degree of shame) resort to googling something that's a synonym and scanning down the list until I find the word I'm looking for, but it makes me wonder why this happens with some words and not with others.  Could it be that in my 64-year-old brain, bits of the network are breaking down, and this affects words with fewer working functional links than ones with a great many of them?

All speculation, of course. I can say that whatever it is, it's really freakin' annoying.  But I need to wrap up this post, because it's time for lunch.  Which is -- I'm not making this up -- leftover moist, succulent pork.

I'll try not to think about it.

****************************************

Tuesday, December 17, 2024

A linguistic labyrinth

It's funny the rabbit holes fiction writers get dragged down sometimes.

This latest one occurred because of two things that happened kind of at the same time.  First, I was chatting with a friend about one of my books, a fall-of-civilization novel called In the Midst of Lions that in the current national and global situation is seeming to cut a little close to the bone.  In the story, one of the characters is a linguist who saw what was coming, and wrote a conlang -- a constructed (invented) language -- so he could communicate with people he trusted without it being decipherable by enemies.

My friend asked how I managed to develop the conlang, which is called Kalila, and what process I'd gone through to make it sound like a real language.

Following in the footsteps of the Star Trek folks with Klingon and J. R. R. Tolkien with Quenya and Sindarin (two of the languages of the Elves) was not an easy task.  My MA is in linguistics (yes, I know, I spent my career teaching biology; it's a long story) so I know a good bit about language structure, and I wanted to make the language different enough from the familiar Indo-European languages to seem (1) an authentic language, not just a word-for-word substitution, and (2) something a smart linguist would think up.  Unfortunately, my specialty is Indo-European languages, specifically Scandinavian languages.  (My wife gives me grief about having studied Old Norse.  My response is that if the Vikings ever take over the shipping industry, I'm gonna have the last laugh.)


A sample of Tolkien's lovely Quenya script [Image is in the Public Domain]

So I started out with a pair of blinders on.  There are a lot of rules specific to Indo-European languages that we tend to take for granted, which was exactly what I didn't want to do with my conlang.  But in order to identify those, you have to somehow lift yourself out of your own linguistic box -- which is awfully hard to do.

The second thing, though, was that shortly after chatting about my conlang with my friend, I stumbled on a question on Quora that asked, "What is the hardest language to learn to speak fluently?"  By "hardest" most people assumed "for speakers of English," which went right to what I'd been discussing earlier -- finding out what would seem odd/counterintuitive (and therefore difficult) for English speakers.

Well, between the conversation and the post on Quora, I was led directly into an online research labyrinth, literally for hours.

One respondent to the hardest-language-question said his choice would be the Northwest Caucasian languages of Georgia, Azerbaijan, and Armenia -- a group made up of Abaza, Abkhaz, Adyghe, Kabardian, and Ubykh -- the last-mentioned of which became extinct in 1992 when the last native speaker died of old age.  These languages form an isolate family, related to each other but of uncertain (and undoubtedly distant) relationship to other languages.

So naturally, I had to find out what's weird about them.  Here's what I learned.

Let's start out with the fact that they only have two vowels, but as many as 84 consonants depending on exactly how finely you want to break them up based on the articulation.  They use SOV (subject-object-verb) word order, plopping the verb at the end of the sentence, but that's hardly unique; Latin does that, giving rise to the old quip that by the time a Roman got to the verb in his sentence, his listeners had forgotten who he was talking about.

But in the parlance of the infomercial, "Wait, there's more!"  The Northwest Caucasian languages use agglutination -- gluing together various bits and pieces to make a more specific word -- but only for verbs.  In these languages, a verb is actually a cluster of parts called morphemes that tell you not only what the core verb is, but the place, time, manner of action, whether it's positive or negative, and even the subject's and object's person.

Then, there's the fact that they're ergative-absolutive languages.  When I hit this, I thought, "Okay, I used to know what this meant," and had to look it up.  It has to do with how the subject and object of a sentence are used.  In English (a nominative-accusative language), the subject has the same form regardless of what kind of verb follows it; likewise, the object always is the same.  So the subject of an intransitive verb like "to walk" is the same as the subject for a transitive verb like "to watch."  (We'd say, "she walked" and "she watched [someone or something]"; in both cases, you use the form "she.")  The object form of "he" is always "him," regardless of any other considerations in the sentence.

Not so in the Northwest Caucasian languages, and other ergative-absolutive languages, such as Tibetan, Basque, and Mayan.  In these languages, the subject of an intransitive verb and the object of a transitive one have the same form; the subject of a transitive verb is the one with the different form.  (If English were an ergative-absolutive language, we might say "He watched her," but then it'd be "her walked.")
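If it helps to see the rule spelled out mechanically, here's a minimal sketch.  It assumes the same toy setup as the example above: English pronoun forms stand in for real case markers, and the role labels (S for the subject of an intransitive verb, A for the subject of a transitive verb, O for the object) and function names are mine, used only for illustration.  No actual ergative-absolutive language works by swapping English pronouns around.

```python
# Toy illustration only: English pronoun forms stand in for case markers.
# Each entry is (form used in subject position, form used in object position).
PRONOUNS = {"she": ("she", "her"), "he": ("he", "him")}

def pronoun_form(pronoun, role, alignment):
    """Pick a pronoun form for a role.

    role: "S" = subject of an intransitive verb,
          "A" = subject of a transitive verb,
          "O" = object of a transitive verb.
    """
    subject_form, object_form = PRONOUNS[pronoun]
    if alignment == "nominative-accusative":
        # S and A pattern together; O is the odd one out.
        return subject_form if role in ("S", "A") else object_form
    if alignment == "ergative-absolutive":
        # S and O pattern together; A is the odd one out.
        return subject_form if role == "A" else object_form
    raise ValueError(f"unknown alignment: {alignment}")

def intransitive(subject, verb, alignment):
    return f"{pronoun_form(subject, 'S', alignment)} {verb}"

def transitive(agent, verb, patient, alignment):
    agent_form = pronoun_form(agent, "A", alignment)
    patient_form = pronoun_form(patient, "O", alignment)
    return f"{agent_form} {verb} {patient_form}"

for alignment in ("nominative-accusative", "ergative-absolutive"):
    print(alignment)
    print("  ", intransitive("she", "walked", alignment))
    print("  ", transitive("he", "watched", "she", alignment))
```

The transitive sentence comes out as "he watched her" under both systems; it's only the intransitive one that flips from "she walked" to "her walked."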

So there are lots of things that seem normal, obvious even, which in fact are simply arbitrary rules that hold throughout English, but which are hardly universal to other languages.  It always puts me in mind of the Sapir-Whorf hypothesis, which is that the language you speak shapes your cognitive processes.  In other words, that speakers of languages differently structured from English literally perceive the world a different way because the form of their languages forces different conceptualizations of what they see.

I've gone on long enough about all this, and I haven't even scratched the surface.  There are tonal languages like Thai, where the pitch and pitch change of a syllable alter its meaning.  There are languages like Finnish and Japanese where vowel length -- literally, how long you say the vowel for -- changes the meaning of the word it's in.  There are inflected languages like Greek, where the ending of a word tells you how it's being used in the sentence (e.g., in the phrases "the cat walked," "she pet the cat," "it's the cat's bowl," "give the food to the cat," and "the dog is with the cat," the word "cat" would in each case have a different suffix).

So it was a struggle to make my conlang something that would be believable to a linguist, and I can only hope I succeeded well enough to get by.  (Or, in the context of the story, something an actual linguist would invent.)  Of course, being that it's only one small piece of the story, in the end I used something like a dozen phrases total from the language, so it was kind of a lot of work with very little obvious result.

But I figure that in any case, what I came up with has still gotta be more realistic than the Judoon "ro po fo so no do" language from Doctor Who, which I'm only throwing in here because after yesterday's post my author friend Andrew Butters commented that I can always somehow find a way to work in a Doctor Who reference regardless of the topic, and I couldn't just refuse to rise to that challenge.


So there.

****************************************