Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.
Showing posts with label Google Translate. Show all posts

Saturday, November 16, 2024

Doomsday translation

In my Latin and Greek classes, I always warned my students to avoid Google Translate.

It's not that it's a bad tool, honestly, as long as you don't push it too far.  If you want to look up a single word -- i.e., use it like an online dictionary -- it's reasonably solid.  The problem is, it has a good word-by-word translation ability, but a lousy capacity for understanding grammar, especially with highly inflected languages like Latin.  For example, the phrase "corvus oculum corvi non eruit" -- "a crow will not pluck out another crow's eye," meaning more or less the same thing as "there's honor among thieves" -- gets translated as "do not put out the eye of the raven, raven."  Even worse is Juno's badass line from The Aeneid, "Flectere si nequeo superos, Acheronta movebo" ("If I cannot bend the will of heaven, I will raise hell"), which comes out "Could be bent if you cannot bend, hell, I will move."

Which I think we can all agree doesn't quite have the same ring.

But today I found out, over at the site Mysterious Universe, that there's another reason to avoid Google Translate:

It's been infiltrated by the Powers of Darkness.

At least that's how I interpret it.  Some users of Reddit (where else?) discovered that if you typed the word "dog" into Google Translate twenty times and had it translate from Hawaiian to English, it gave you the following message:
Doomsday Clock is three minutes at twelve.  We are experiencing characters and a dramatic developments in the world, which indicate that we are increasingly approaching the end times and Jesus’s return.
Within hours of the message being reported on Reddit, it had vanished, which of course only made people wiggle their eyebrows in a significant fashion.

Which brings up a few questions.
  1. Who thought of putting "dog" in twenty times and then translating it from Hawaiian?  It's kind of a random thing to do.  Of course, Redditors seem to have a lot of free time, so I guess at least that much makes sense.  But you have to wonder how many failed attempts they had.  ("Okay, I put in 'weasel' fifteen times and translated it from Lithuanian, but it didn't work.  Then I put in 'warthog' seventy-eight times, and translated it from Urdu.  No luck there either.  The search continues.")
  2. Even if it's a valid message, what did it tell us that we didn't already know?  It's not like we haven't all just watched Donald Trump hand over the control of government agencies to a mob of incompetents, degenerates, lunatics, and the downright evil, and nearly all of the Republicans responding by issuing a stern rebuke ("Bad Donald!  Naughty Donald!  If you do that again, we'll have to roll over on our backs and piss all over our own bellies!  That will sure show you!")  So we're definitely not hurting for dramatic developments, with or without the message.
  3. Even if the message was real, isn't it far more likely that it's the result of some bored programmers over at Google sticking an Easter egg into the code than it is some kind of message from the Illuminati?
  4. Don't you think the fact that it vanished after being reported is because the aforementioned bored programmers' supervisor ordered that it be taken down, not because the Illuminati found out we're on to them?  I see it more like how the Walmart supervisors dealt with Shane:



So I'm not all that inclined to take it seriously.  Brett Tingley at Mysterious Universe, however, isn't so sure:
As always though, it’s an interesting thought to think that Google’s vast AI networks might be trying to warn us, finding obscure places to hide these warnings where their human overlords won’t find them.  When AI becomes self-aware and starts taking over, will we even know it before it’s too late, or will odd and seemingly meaningless stories like this serve as prescient warnings for those who know where to look?
Somehow, I think if AI, or anyone else, were trying to warn us of impending doom, they wouldn't put it online and wait for Steve Neckbeard to find it by asking Google to translate "dog dog dog dog dog etc." from Hawaiian.

So that's our trip into the surreal for today.  I still think it's a prank, although a fairly inspired one.  Note that I'm not saying the overall message is incorrect, though.  Considering this week's news, I figure one morning soon I'll get up and find out that Donald Trump has nominated Vladimir Putin to be the head of the Department of Homeland Security, and the Republican Congresspersons responded by tweeting that they're "disappointed" and then widdling all over the floor.

At that point, I think I'd be in favor of offering the presidency to Shane.

****************************************


Tuesday, December 20, 2022

Language machines

If you've ever used Google Translate, you've probably noticed that it can be a little wonky.

Take, for example, the anecdote about the French guy who was wooing an American woman long-distance, and texted to her, "Prends une photo coquine pour moi."  ("Take a naughty picture for me.")  The woman wasn't certain what that meant, so she popped it into Google Translate, and was told it meant, "Take a photo for me, slut."

I think my favorite, though, is some feedback that a company called Koyu Matcha Green Tea received via their website, from a customer in Finland.  When they ran what the customer wrote through a Finnish-to-English Google Translate, it came out as the following:
If it resonated with cold to the bone?  Matcha Latté is guaranteed fireman, green tea with hot steamed milk.  Behold, thou hast already tasted.
Um... thanks?  We think?

The difficulty is that languages are complex entities, full of idioms and peculiarities and exceptions, so trying to find a mechanistic, totally rule-based way to characterize them is somewhere beyond tricky.  But because of the work of a Ph.D. student at the University of Cambridge, we have come one step closer to doing exactly that -- at least for Sanskrit.

About 2,500 years ago, a man named Dakṣiputra Pāṇini living in what is now northwestern Pakistan wrote a work called Aṣṭādhyāyī, which created a set of rules for the morphology -- the way words, prefixes, suffixes, and so on combine -- of the Sanskrit language.  An example of linking together these fragments, called morphemes, in English is the word incomprehensibly -- made up of in- (prefix meaning "not"), comprehend (stem of the word, altered to replace /d/ with /s/), -ible (suffix meaning "capable of"), and -ly (adverbial marker), in that order.

Imagine trying to come up with a list of rules for all the ways morphemes can combine in English, such that the rules only produced well-formed words and not garbled messes like iblecomprehendlyin.

That's what Pāṇini tried to do for Sanskrit.

The problem is that Pāṇini's rules seemed sometimes to lead to self-contradictions.  Given a particular combination of morphemes, there are often two or more rules that apply, so which should you use?  Linguists analyzing the rule-set discovered that Pāṇini had written a "metarule" -- a rule determining how other rules should be applied -- which said that if two rules seem to conflict, the "later rule should take precedence."  Everyone had interpreted this to mean that the one mentioned later in the book was the more important.

But that sometimes led to ungrammatical words.  So something was off, but what?

Enter Cambridge student Rishi Rajpopat, who had been toiling over Pāṇini's rules for months.  Then he had a brainstorm: what if the problem was that the metarule itself had been misinterpreted?  He reread the metarule to mean that if two rules are in conflict, the one that applies to the latter part of the word (the suffix) takes precedence over the one that applies to the first part of the word (the stem).

With that one change in interpretation, Pāṇini's rule system works to combine morphemes and produces grammatically correct words almost one hundred percent of the time.
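To make the difference between the two readings concrete, here's a toy sketch in Python.  Note that nothing in it comes from the Aṣṭādhyāyī itself; the Rule class, the two sample rules, and their position numbers are all invented for illustration, and the sketch ignores every subtlety of the real system (including what happens when two rules apply to the same part of the word).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A made-up stand-in for a grammar rule (hypothetical, not Pāṇini's)."""
    name: str
    book_position: int  # where the rule appears in the rulebook (toy numbering)
    word_position: int  # which part of the word it acts on (0 = stem, 1 = suffix)


def pick_traditional(a: Rule, b: Rule) -> Rule:
    """Older reading: the rule listed later in the book wins."""
    return a if a.book_position > b.book_position else b


def pick_reinterpreted(a: Rule, b: Rule) -> Rule:
    """Reading described above: the rule acting on the later part of the word wins."""
    return a if a.word_position > b.word_position else b


# Two invented rules that both apply to the same stem + suffix combination.
stem_rule = Rule("stem_rule", book_position=200, word_position=0)
suffix_rule = Rule("suffix_rule", book_position=100, word_position=1)

print(pick_traditional(stem_rule, suffix_rule).name)
print(pick_reinterpreted(stem_rule, suffix_rule).name)
```

With these made-up numbers, the two readings disagree: the first picks the stem rule because it sits later in the book, while the second picks the suffix rule because it acts on the later part of the word.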

Which, of course, is a cause for much rejoicing amongst both linguists and people who are attempting to create high-quality translation software.

I wonder, though, how any such attempt would fare for English.  English is an amalgam of a Germanic root language, with heavy borrowing from French, Latin, and Spanish, and less-frequent (but still significant) borrowing from Old Norse, Italian, Greek, Dutch, Gaelic, and several Indigenous American languages.  This has introduced spellings, pronunciations, and morphologies that defy easy characterization.


Even some of the simple rules you learned in elementary school can't be applied with anything like real consistency.  "I before e except after c" -- unless your weird foreign neighbor Keith forfeits eight beige sleighs to a feisty caffeinated weightlifter.

You see the difficulty.
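If you'd rather have a machine confirm it, here's a minimal sketch that checks the words from that sentence against the rule.  The function name is my own invention, and it only tests half of the rule (an "ei" that doesn't come right after a "c"); it says nothing about words like "science" that break the other half.

```python
import re


def breaks_i_before_e(word: str) -> bool:
    """True if the word contains 'ei' that is not immediately preceded by 'c'."""
    return re.search(r"(?<!c)ei", word.lower()) is not None


# The rule-breaking words from the example sentence above.
words = ("weird foreign neighbor Keith forfeits eight beige sleighs "
         "feisty caffeinated weightlifter").split()

exceptions = [w for w in words if breaks_i_before_e(w)]
print(f"{len(exceptions)} of {len(words)} words break the rule")
```

Every one of those eleven words fails the check, whereas well-behaved words like "receive" and "believe" pass.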

So as much as I'm impressed by Rajpopat's accomplishment, I don't think it's going to go very far toward fixing Google Translate's problem.

No matter.  The delight of being told the tea is so good it's "guaranteed fireman" makes up for any potential awkwardness incurred because you accidentally called your girlfriend an unpleasant name while attempting to initiate sexytimes.  You gotta take the good with the bad.

****************************************


Wednesday, July 18, 2018

Doomsday translation

In my Latin and Greek classes, I always warn my students to avoid Google Translate.

It's not that it's a bad tool, honestly, as long as you don't push it too far.  If you want to look up a single word -- i.e., use it like an online dictionary -- it's pretty solid.  The problem is, it has a good word-by-word translation ability, but a lousy capacity for understanding grammar, especially with highly inflected languages like Latin.  For example, the phrase "corvus oculum corvi non eruit" -- "a crow will not pluck out another crow's eye," meaning more or less the same thing as "there's honor among thieves" -- gets translated as "do not put out the eye of the raven, raven."  Even worse is Juno's badass line from The Aeneid, "Flectere si nequeo superos, Acheronta movebo" ("If I cannot bend the will of heaven, I will raise hell"), which comes out "Could be bent if you cannot bend, hell, I will move."

Which I think we can all agree doesn't quite have the same ring.

But today I found out, over at the site Mysterious Universe, that there's another reason to avoid Google Translate:

It's been infiltrated by the Powers of Darkness.

At least that's how I interpret it.  Some users of Reddit (where else?) discovered that if you typed the word "dog" into Google Translate twenty times and had it translate from Hawaiian to English, it gave you the following message:
Doomsday Clock is three minutes at twelve We are experiencing characters and a dramatic developments in the world, which indicate that we are increasingly approaching the end times and Jesus’s return.
Within hours of the message being reported on Reddit, it had vanished, which of course only made people wiggle their eyebrows in a significant fashion.

Which brings up a few questions.
  1. Who thought of putting "dog" in twenty times and then translating it from Hawaiian?  It's kind of a random thing to do.  Of course, Redditors seem to have a lot of free time, so I guess at least that much makes sense.  But you have to wonder how many failed attempts they had.  ("Okay, I put in 'weasel' fifteen times and translated it from Lithuanian, but it didn't work.  Then I put in 'warthog' seventy-eight times, and translated it from Urdu.  No luck there either.  The search continues.")
  2. Even if it's a valid message, what did it tell us that we didn't already know?  It's not like we didn't all just watch Donald Trump wink at Vladimir Putin and then commit high treason in full view on television, or witness all of the Republicans respond by issuing a stern rebuke ("Bad Donald!  Naughty Donald!  If you do that again, we'll have to roll over on our backs and piss all over our own bellies!  That will sure show you!")  So we're definitely not hurting for dramatic developments, with or without the message.
  3. Even if the message was real, isn't it far more likely that it's the result of some bored programmers over at Google sticking an Easter egg into the code than it is some kind of message from the Illuminati?
  4. Don't you think the fact that it vanished after being reported is because the aforementioned bored programmers' supervisor ordered that it be taken down, not because the Illuminati found out we're on to them?  I see it more like how the Walmart supervisors dealt with Shane:


So I'm not all that inclined to take it seriously.  Brett Tingley at Mysterious Universe, however, isn't so sure:
As always though, it’s an interesting thought to think that Google’s vast AI networks might be trying to warn us, finding obscure places to hide these warnings where their human overlords won’t find them.  When AI becomes self-aware and starts taking over, will we even know it before it’s too late, or will odd and seemingly meaningless stories like this serve as prescient warnings for those who know where to look?
Somehow, I think if AI, or anyone else, were trying to warn us of impending doom, they wouldn't put it online and wait for Steve Neckbeard to find it by asking Google to translate "dog dog dog dog dog" from Hawaiian.

So that's our trip into the surreal for today.  I still think it's a prank, although a fairly inspired one.  Note that I'm not saying the overall message is incorrect, though.  Considering this week's news, I figure one morning soon I'll get up and find out that the US has been renamed the "Amerikan Autonomous Soviet Socialist Republik," and the Republican Congresspersons responded by tweeting that they're "disappointed" and then widdling all over the floor.

At that point, I think I'd be in favor of offering the presidency to Shane.

***********************************

This week's Skeptophilia book recommendation is a must-read for anyone concerned about the current state of the world's environment.  The Sixth Extinction, by Elizabeth Kolbert, is a retrospective of the five great extinction events the Earth has experienced -- the largest of which, the Permian-Triassic extinction of 252 million years ago, wiped out 95% of the species on Earth.  Kolbert makes a persuasive, if devastating, argument: that we are currently in the middle of a sixth mass extinction -- this one caused exclusively by the activities of humans.  It's a fascinating, alarming, and absolutely essential read.  [If you purchase the book from Amazon using the image/link below, part of the proceeds goes to supporting Skeptophilia!]