Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.

Wednesday, April 3, 2024

Marching into the uncanny valley

"Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should."

That quote from the film adaptation of Michael Crichton's Jurassic Park kept going through my head as I read about the latest in robotics from Columbia University -- a robot that can recognize a human facial expression, then mimic it so fast that it looks like it's responding to emotion the way a real human would.

One of the major technical problems with trying to get robots to emulate human emotions is that up until now, they haven't been able to respond quickly enough to make it look natural.  A delayed smile, for example, comes across as forced; on a mechanical face it drops right into the uncanny valley -- the phenomenon described by Japanese roboticist Masahiro Mori in 1970, in which an expression or gesture that is almost human, but not quite, triggers unease rather than empathy.  Take, for example, "Sophia," the interactive robot unveiled back in 2016 that was able to mimic human expressions, but for most people generated an "Oh, hell no" response rather than the warm-and-trusting-confidant response the roboticists were presumably shooting for.  The timing of her expressions and comments was subtly off, and the result was that very few of us would have trusted Sophia with the kitchen knives when our backs were turned.

This new creation, though -- a robot called "Emo" -- is able to pick up on the human microexpressions that signal a smile or a frown or whatnot is coming, and respond in kind so fast that it looks like true empathy.  The researchers trained it on hours of videos of people interacting, until the software controlling its face could detect the tiny muscle movements that precede a change of expression, allowing it to anticipate and mirror the emotional response it was watching.
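To make the underlying idea concrete, here's a toy sketch -- emphatically not the Emo system, whose actual architecture isn't described here -- of the general principle: the earliest frames of a facial movement already carry enough signal to predict the full expression before it arrives.  Everything below (the simulated "muscle activation" clips, the five-frame onset window, the nearest-centroid classifier) is invented purely for illustration.

```python
# Toy illustration of "predict the expression from its onset."
# Each simulated clip ramps one facial-muscle channel toward a peak;
# the classifier sees only the first few frames (the "microexpression"
# window) yet recovers which expression is coming.
import numpy as np

rng = np.random.default_rng(0)

def make_clip(label, frames=30, features=5):
    """Simulate a clip: muscle activations ramping toward one peak channel."""
    target = np.zeros(features)
    target[label] = 1.0                       # each expression peaks a different channel
    ramp = np.linspace(0, 1, frames)[:, None]
    return ramp * target + rng.normal(0, 0.02, (frames, features))

def onset_features(clip, onset_frames=5):
    """Summarize only the earliest frames -- the pre-expression signal."""
    return clip[:onset_frames].mean(axis=0)

# "Train": average the onset signature of many clips per expression label.
labels = [0, 1, 2]                            # e.g. smile, frown, surprise
centroids = {
    lab: np.mean([onset_features(make_clip(lab)) for _ in range(50)], axis=0)
    for lab in labels
}

def predict(clip):
    """Classify a clip from its onset alone, by nearest centroid."""
    f = onset_features(clip)
    return min(centroids, key=lambda lab: np.linalg.norm(f - centroids[lab]))

# Evaluate: the classifier never sees the full expression, only the onset.
correct = sum(predict(make_clip(lab)) == lab for lab in labels for _ in range(20))
print(correct, "of 60 onsets classified correctly")
```

The point of the toy isn't the classifier (Emo reportedly uses learned models far more sophisticated than a nearest-centroid rule); it's the timing: if the onset alone is predictive, the response can begin before the human's expression has fully formed, which is what makes the mimicry feel instantaneous.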

Researcher Yuhang Hu interacting with Emo  [Image credit: Creative Machines Lab, Columbia University]

"I think predicting human facial expressions accurately is a revolution in HRI [human-robot interaction]," Hu said.  "Traditionally, robots have not been designed to consider humans' expressions during interactions. Now, the robot can integrate human facial expressions as feedback.  When a robot makes co-expressions with people in real-time, it not only improves the interaction quality but also helps in building trust between humans and robots.  In the future, when interacting with a robot, it will observe and interpret your facial expressions, just like a real person."

Hod Lipson, professor of robotics and artificial intelligence research at Columbia, at least gave a nod toward the potential issues with this, but quickly lapsed into superlatives about how wonderful it would be.  "Although this capability heralds a plethora of positive applications, ranging from home assistants to educational aids, it is incumbent upon developers and users to exercise prudence and ethical considerations," Lipson said.  "But it’s also very exciting -- by advancing robots that can interpret and mimic human expressions accurately, we're moving closer to a future where robots can seamlessly integrate into our daily lives, offering companionship, assistance, and even empathy.  Imagine a world where interacting with a robot feels as natural and comfortable as talking to a friend."

Yeah, I'm imagining it, but not with the pleased smile Lipson probably wants.  I suspect I'm not alone in thinking, "What in the hell are we doing?"  We're already at the point where generative AI is not only flooding the arts -- resulting in actual creative human beings finding it hard to make a living -- but deepfake AI photographs, audio, and video are becoming so close to the real thing that you simply can't trust what you see or hear anymore.  Many psychologists think the uncanny valley response is universal because our brains long ago evolved to detect a subtle "wrongness" in someone's expression -- a warning that something in our environment was dangerously off.

But what happens when the fake becomes so good, so millimeter-and-millisecond accurate, that our detection systems stop working?

I don't tend to be an alarmist, but the potential for misusing this technology is, not to put too fine a point on it, fucking enormous.  We don't need another proxy for human connection; we need more opportunities for actual human connection.  We don't need another way for corporations with their own agendas (almost always revolving around making more money) to manipulate us using machines that can trick us into thinking we're talking with a human.

And for cryin' in the sink, we don't need more ways in which we can be lied to.

I'm usually very much rah-rah about scientific advances, and it's always seemed to me an impossibly thorny ethical conundrum to determine whether there are things humans simply shouldn't investigate.  Who sets those limits, and based upon what rules?  Here, though, we're accelerating the capacity for the unscrupulous to take advantage -- no longer just of the gullible, but of everyone -- because we're rapidly getting to the point that even the smart humans won't be able to tell the difference between what's real and what's not.

And that's a flat-out dangerous situation.

So a qualified congratulations to Hu and Lipson and their team.  What they've done is, honestly, pretty amazing.  But that said, they need to stop, and so do the AI techbros who are saying "damn the torpedoes, full speed ahead" and inundating the internet with generative AI everything. 

And for the love of all that's good and holy, all of us internet users need to STOP SHARING AI IMAGES.  Completely.  Not only does it often mean passing off a faked image as real -- worse, the software is trained using art and photography without permission from, compensation to, or even the knowledge of the actual human artists and photographers.  I.e. -- it's stolen.  I don't care how "beautiful" or "cute" or "precious" you think it is.  If you don't know the source of an image, and can't be bothered to find out, don't share it.  It's that simple.

We need to put the brakes on, hard, at least until lawmakers consider -- in a sober and intelligent fashion -- the potential dangers, and set some guidelines for how this technology can be fairly and safely used.

Otherwise, we're marching right into the valley of the shadow of uncanniness, absurdly confident we'll be fine despite all the warning signs.

