Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.
Showing posts with label voices. Show all posts

Wednesday, May 22, 2024

Hallucinations

If yesterday's post -- about creating pseudo-interactive online avatars for dead people -- didn't make you question where our use of artificial intelligence is heading, today we have a study out of Purdue University that found that when ChatGPT was used to solve programming and coding problems, about half of its answers contained incorrect information -- and 39% of the recipients of these answers didn't recognize the answers as incorrect.

The problem of an AI system basically just making shit up is called a "hallucination," and it's proven to be extremely difficult to eradicate.  This is at least partly because the answers are still generated using real data, so they can sound plausible; it's the software version of a student who only paid attention half the time and then has to take a test, and answers the questions by taking whatever vocabulary words he happens to remember and gluing them together with bullshit.  Google's Bard chatbot, for example, claimed that the James Webb Space Telescope had captured the first photograph of a planet outside the Solar System (believable, but it hadn't).  Meta's AI Galactica was asked to draft a paper on the software for creating avatars, and cited a fictitious paper by a real author who works in the field.  Data scientist Teresa Kubacka was testing ChatGPT and decided to throw in a reference to a fictional device -- the "cycloidal inverted electromagnon" -- just to see what the AI would do with it, and it came up with a description of the thing so detailed (with dozens of citations) that Kubacka found herself compelled to check and see if she'd accidentally used the name of something obscure but real.

It gets worse than that.  A study of AI-powered mushroom-identification software found it got the answer right only fifty percent of the time -- and, frighteningly, provided cooking instructions when presented with a photograph of a deadly Amanita mushroom.  Fall for that little "hallucination" and three days later at your autopsy they'll have to pour your liver out of your abdomen.  Maybe the AI was trained on Terry Pratchett's line that "All mushrooms are edible.  Some are only edible once."

[Image licensed under the Creative Commons Marketcomlabo, Image-chatgpt, CC BY-SA 4.0]

Apparently, in inventing AI, we've accidentally imbued it with the very human capacity for lying.

I have to admit that when the first AI became widely available, it was very tempting to play with it -- especially the photo modification software of the "see what you'd look like as a Tolkien Elf" type.  Better sense prevailed, so alas, I'll never find out how handsome Gordofindel is.  (A pity, because human Gordon could definitely use an upgrade.)  Here, of course, the problem isn't veracity; the problem is that the model is trained using artwork and photography that is (not to put too fine a point on it) stolen.  There have been AI-generated works of "art" that contained the still-legible signature of the artist whose pieces were used to train the software -- and of course, neither that artist nor the millions of others whose images were "scraped" from the internet by the software received a penny's worth of compensation for their time, effort, and skill.

It doesn't end there.  Recently actress Scarlett Johansson announced that she actually had to hire legal counsel to get Sam Altman, CEO of OpenAI, to discontinue the use of a synthesized version of her voice that was so accurate it fooled her family and friends.  Here's her statement:


Fortunately for Ms. Johansson, she's got the resources to sue Altman, but most creatives simply don't.  If we even find out that our work has been lifted, we really don't have any recourse to fight the AI techbros' claims that it's "fair use." 

The problem is, the system is set up so that it's already damn near impossible for writers, artists, and musicians to make a living.  I've got over twenty books in print, through two different publishers and a handful that are self-published, and I have never made more than five hundred dollars a year.  My wife, Carol Bloomgarden, is an astonishingly talented visual artist who shows all over the northeastern United States, and in any given show it's a good day when she sells enough to pay for her booth fees, lodging, travel expenses, and food.

So throw a bunch of AI-insta-generated pretty-looking crap into the mix, and what happens -- especially when the "artist" can sell it for one-tenth of the price and still turn a profit? 

I'll end with a plea I've made before: until lawmakers can put the brakes on AI to protect safety, security, and intellectual property rights, we all need to stop using it.  Period.  This is not out of any fundamental anti-tech Luddite-ism; it's simply from the absolute certainty that the techbros are not going to police themselves, not when there's a profit to be made, and the only leverage we have is our own use of the technology.  So stop posting and sharing AI-generated photographs.  I don't care how "beautiful" or "precious" they are.  (And if you don't know the source of an image with enough certainty to cite an actual artist or photographer's name or Creative Commons handle, don't share it.  It's that simple.)

As a friend of mine put it, "As usual, it's not the technology that's the problem, it's the users."  Which is true enough; there are myriad potentially wonderful uses for AI, especially once they figure out how to debug it.  But at the moment, it's being promoted by people who have zero regard for the rights of human creatives, and are willing to steal their writing, art, music, and even their voices without batting an eyelash.  They are shrugging their shoulders at their systems "hallucinating" incorrect information, including information that could potentially harm or kill you.

So just... stop.  Ultimately, we are in control here, but only if we choose to exert the power we have.

Otherwise, the tech companies will continue to stomp on the accelerator, authenticity, fairness, and truth be damned.

****************************************



Thursday, January 5, 2023

Voices and faces

I've blogged before about my difficulties with prosopagnosia (better known as "face blindness").  My ability to recognize faces is damn near nonexistent; when I do recognize someone, it's either through context or because I remember a specific feature or features (she's the woman with the blonde hair, green eyes, and lots of freckles; he's the guy with curly gray hair and a little scar on the forehead; and so forth).  This, of course, backfires badly when someone changes their appearance.  It's why I have an extremely poor track record of recognizing actors in unexpected roles, where makeup and costumes can dramatically change what distinctive features they may have.  I was absolutely flattened when I found out that Jim the Vampire in What We Do In the Shadows was played by none other than Mark Hamill, and that Peter Davison -- the Fifth Doctor in Doctor Who, a show I'm absolutely obsessed with -- played the suave French teacher Mr. Clayton in Miranda.

When I figure it out, it's often because the actor has a distinctive voice that even being in a different character can't quite hide.  I know British actress Zoë Wanamaker from three very different roles -- Quidditch instructor Madam Hooch in Harry Potter, the scheming Lady Cassandra in Doctor Who, and hapless mystery writer Ariadne Oliver in Agatha Christie's Poirot.  But in each role, she keeps a very distinct clipped, staccato cadence in her voice that, for me, is instantly recognizable.

So I'm above average at voice recognition, whereas I can't form mental images of faces at all.  Hell, sitting here right now, I can't picture my own face.  I know I have sandy blond hair, gray eyes, black plastic-framed glasses, and a narrow face, but it doesn't come together into any sort of image.  If I see a photograph of myself in a group shot, I often have a hard time finding myself, unless (1) I know where I was standing, (2) I recognize the shirt I'm wearing, or (3) there aren't any other skinny blond guys with glasses in the photo.

As I've mentioned before, to anyone local who is reading this: if I've walked past you on the streets of the village with a blank look, and not said hi, please don't take it personally.  I had no idea who you were, or that I'd ever seen you before.  I have no problem if you say hi and mention your name; in fact, I really appreciate it.  It's much less awkward to have someone say, "Hi, Gordon, it's Bill" than to have me standing there trying frantically to search for clues so I can figure out who I'm talking to, or worse, ignoring someone I actually like.

[Image licensed under the Creative Commons mikemacmarketing, Facial Recognition22, CC BY 2.0]

This topic comes up because of a puzzling piece of research in the Journal of Neurophysiology this week that looked at the brain firing patterns in people when they heard famous people speaking (they used the voices of Barack Obama, George W. Bush, and Bill Clinton).  The test subjects were epileptic -- such studies often use epileptic volunteers who already have electrodes implanted in their brains to monitor their seizures, and the same technology can be used to study their other brain responses -- but were not prosopagnosic.

The reason I say the research was puzzling is that they found that the very same part of the brain that seems to be miswired in prosopagnosia, the fusiform gyrus of the basal temporal lobe, was extremely active during the volunteers' attempts to identify voices.  Put a different way, the face-responsive sites in the brain are also involved with vocal recognition.

How, then, does one of those responses go so badly wrong in people like me, while the other is largely unimpaired?

The current research is preliminary; identifying the site in the brain where a response occurs is only the first step toward figuring out what exact pathway the firing sequence takes or how it's mediated.  The parts of the brain have a remarkable degree of functional overlap, and this is hardly the only example of two seemingly related abilities working in very different ways.  

In fact, I can think of another instance of this phenomenon from my own experience.  I have near-perfect recall for music; my wife calls it my "superpower."  I hear a melody a couple of times, and I pretty much have it for life, and if it's in the range of my instrument, I can play it for you.  My ability to remember text, though, is mediocre at best -- the main reason I gave up on doing community theater was that memorizing lines was painfully difficult for me.  It's hard to imagine why two different examples of recall involving sound would be so dramatically different, but they are.

So here, there's obviously something going on in the fusiform gyrus in face-blind individuals that interferes with visual recognition and leaves vocal recognition largely unaffected.  It'd be interesting to look at the electrocorticography for prosopagnosic volunteers.  (To use the technique in the paper, though, they'd have to find face-blind people who were also epileptic and had surgical electrode implants, which would be a small subset of a small subset of a small subset of humanity.  Kind of limits the possibilities for volunteers.)

In any case, it's interesting research, and I'm curious to see where it will lead.  We're only at the beginning of understanding how our own brains work, and the next twenty years should see some significant strides toward fulfilling the maxim engraved on the walls of the temple of the Oracle of Delphi -- γνῶθι σεαυτόν (know thyself).

****************************************


Tuesday, November 24, 2020

The sound of a friendly voice

Given my inability to recognize faces, I've developed a number of compensatory mechanisms.  One is that I remember people by memorizing specific features: he's the guy with curly black hair, she's the woman with small oval glasses and a tattoo on her right hand.  I notice how people walk and how they carry their posture; I can sometimes recognize people I know well even if they're walking away from me, if they have a distinctive gait (which many people do, whether they realize it or not).

But for me the most important thing is the sound of their voices.  I think that may be why it took me so long to figure out I'm face blind; often, all people have to do is say a few words and I immediately know who they are, so the fact that their faces don't trigger the immediate recognition most people have doesn't hamper me as much.

It turns out that I'm not alone in relying on vocalizations for identifying who's around.  According to a paper last week in Science Advances, zebra finches have an ability to recognize their flock mates' unique vocalizations that rivals that of most humans.

In "High-Capacity Auditory Memory for Vocal Communication in a Social Songbird," a team composed of biologists Kevin Yu, William Wood, and Frederic Theunissen, all of the University of California-Berkeley, used rewards to train a bunch of Australian zebra finches (Taeniopygia guttata) to see how far they could push the birds' ability to distinguish between the vocalizations of different members of their species.  And surprisingly -- at least to anyone who has heard the twittering cacophony of a cageful of zebra finches -- these birds could distinguish between the voices of forty or more of their friends.

The authors write:

Effective vocal communication often requires the listener to recognize the identity of a vocalizer, and this recognition is dependent on the listener’s ability to form auditory memories.  We tested the memory capacity of a social songbird, the zebra finch, for vocalizer identities using conditioning experiments and found that male and female zebra finches can remember a large number of vocalizers (mean, 42) based solely on the individual signatures found in their songs and distance calls.  These memories were formed within a few trials, were generalized to previously unheard renditions, and were maintained for up to a month.  A fast and high-capacity auditory memory for vocalizer identity has not been demonstrated previously in any nonhuman animals and is an important component of vocal communication in social species.

This is the first time this kind of individual vocal recognition has been demonstrated in a non-human animal.  "For animals, the ability to recognize the source and meaning of a cohort member's call requires complex mapping skills, and this is something zebra finches have clearly mastered," study co-author Theunissen said, in an interview with Science Direct.  "They have what we call a 'fusion fission' society, where they split up and then come back together.  They don't want to separate from the flock, and so, if one of them gets lost, they might call out 'Hey, Ted, we're right here.'  Or, if one of them is sitting in a nest while the other is foraging, one might call out to ask if it's safe to return to the nest...   I am really impressed by the spectacular memory abilities that zebra finches possess in order to interpret communication calls.  Previous research shows that songbirds are capable of using simple syntax to generate complex meanings and that, in many bird species, a song is learned by imitation.  It is now clear that the songbird brain is wired for vocal communication."

Social behavior is fascinating, and requires an astonishing repertoire of subtle perceptual skills to work well.  Take, for example, flocking behavior in starlings.  If you live in the United States, Canada, or western Europe, you've probably seen the flocks of black birds that swirl and move, almost in unison, as if the entire flock shared a single mind.  Scientists still don't know exactly how they manage it, but experiments have demonstrated that each bird monitors its seven nearest neighbors on either side, and determines its own flight path from those neighbors' movements.  We see that kind of thing in human crowds and in herds of cattle, of course; but the speed and degree of sophistication shown by starlings is mind-boggling.  The passage of information from one bird to the next is lightning-fast and shows almost no signal degradation (the kind of thing that happens in the game of Telephone) across the entire flock.  The result: they can move very nearly as one.  Take a look at this incredible video of a starling flock in motion:


So we aren't the only ones with fancy communication abilities.  Everywhere we look in the natural world, we see the amazing ways in which the species we share the Earth with survive, interact, and reproduce.  It can seem like a harsh, bleak world at times -- but if you want to be reminded of the astonishing beauty and wonder this planet contains, all you have to do is look around you.

**************************************

I'm fascinated with history, and since I also write speculative fiction, I often ponder the question of how things would be different if you changed one historical event.  The topic has been visited over and over by authors for a very long time; three early examples are Ray Bradbury's "A Sound of Thunder" (1952), Keith Roberts's Pavane (1968), and R. A. Lafferty's screamingly funny "Thus We Frustrate Charlemagne" (1967).

There are a few pivotal moments that truly merit the overused nametag of "turning points in history," where a change almost certainly would have resulted in a very, very different future.  One of these is the Battle of the Teutoburg Forest, which happened in 9 C.E., when a group of Germanic guerrilla fighters maneuvered the highly-trained, much better-armed Seventeenth, Eighteenth, and Nineteenth Roman Legions into a trap and slaughtered them, almost to the last man.  There were twenty thousand casualties on the Roman side -- amounting to half their total military forces at the time -- and only about five hundred on the Germans'.

The loss stopped Rome in its tracks, and they never again made any serious attempts to conquer lands east of the Rhine.  There's some evidence that the defeat was so profoundly demoralizing to the Emperor Augustus that it contributed to his mental decline and death five years later.  This battle -- the site of which was recently discovered and excavated by archaeologists -- is the subject of the fantastic book The Battle That Stopped Rome by Peter Wells, which looks at the evidence collected at the location, near the village of Kalkriese, as well as the historical documents describing the massacre.  This is not just a book for history buffs, though; it gives a vivid look at what life was like at the time, and paints a fascinating if grisly picture of one of the most striking David-vs.-Goliath battles ever fought.
