Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.

Thursday, March 14, 2013

Digital fingerprints

I've always been fascinated with patterns.  It started with a love for geometric patterns when I was a kid.  I remember finding out about the Fibonacci sequence, and then its connection to the Golden Ratio, in 8th grade -- and feeling like I'd touched something magical, some fundamental superstructure of the universe.  Then I discovered tessellations, and thought that was the coolest thing I'd ever seen.  Then on to M. C. Escher, Penrose tiles, fractals, the Mandelbrot set...
We're all pattern-finders, really.  That's how the human brain works.  It's just that some of us are a little more obsessed than others.
Patterns exist all over nature, however chaotic it may appear, and those patterns apply to our behavior, as well.  We may think we're spontaneous and unpredictable, but our actions leave traces -- and those traces form patterns.  And if you analyze enough of the traces, you can make some pretty shrewd guesses about who left them.  This is the basis of a lot of forensic work, and is the fundamental idea behind some fascinating new research out of Cambridge.  [Source]
Researchers at the Cambridge Psychometrics Centre developed software that can be used to analyze digital traces left by users -- in this case, Facebook "likes."  Some 58,000 Facebook users agreed to be part of the study, and gave the researchers demographic profiles as well as access to their Facebook accounts.  After that, the software went to town, coming up with correlations between a variety of demographics and which pages users had "liked."
And here's where even the researchers got a surprise.
Just from the Facebook "likes," the software achieved:
  • 88% accuracy at telling gay men from straight men
  • 95% accuracy at telling African Americans from Caucasian Americans
  • 85% accuracy at telling Republicans from Democrats
  • 82% accuracy at telling Christians from Muslims
  • between 65% and 73% accuracy at determining relationship status
  • between 65% and 73% accuracy at determining whether the user engaged in substance abuse
  • 60% accuracy in determining if the user's parents were divorced
  • "high" (but unstated, in the sources I read) accuracy at detecting such traits as extroversion, emotional stability, and openness
  • a correlation between liking "Curly Fries" and high IQ (no, I didn't make that up)
Pretty stunning, eh?
The researchers made a point of checking to see if there were any "red flag" sorts of "likes," but it turned out that for the most part, there weren't.  The software was quite good at determining sexual orientation -- and yet, according to the study, fewer than 5% of homosexual users had "liked" such pages as "Gay Marriage."  (And, it's to be hoped, a good many progressive heterosexuals had "liked" that page as well.)  It was the aggregate of all of the person's "likes" that counted, not one or two specific ones.  It was the overall pattern that allowed the software to be so eerily accurate.
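For the programmers in the audience, here is a rough sketch of how "the aggregate counts, not any single like" might work in practice.  To be clear, this is not the Cambridge team's software, and I'm not claiming it resembles their method; it's a simplified naive-Bayes-style tally I put together for illustration.  The page names ("A" through "F"), the toy users, and the function names are all made up.  Each page gets a small weight depending on which group tends to "like" it, and a user's score is just the sum of the weights of everything they've "liked."

```python
import math
from collections import Counter

# Entirely made-up toy data: (set of liked pages, group label 0 or 1).
TOY_USERS = [
    ({"A", "B"}, 1), ({"B", "C"}, 1), ({"A", "C"}, 1), ({"A", "D"}, 1),
    ({"D", "E"}, 0), ({"E", "F"}, 0), ({"D", "F"}, 0), ({"C", "E"}, 0),
]

def train(users):
    """Return (weights, prior): a log-odds weight per page, plus the
    log-odds of group 1 versus group 0 before any likes are seen."""
    n1 = sum(1 for _, label in users if label == 1)
    n0 = len(users) - n1
    liked_in_1, liked_in_0 = Counter(), Counter()
    for likes, label in users:
        (liked_in_1 if label == 1 else liked_in_0).update(likes)
    weights = {}
    for page in set(liked_in_1) | set(liked_in_0):
        # Add-one smoothing so a page seen in only one group
        # does not produce an infinite weight.
        p1 = (liked_in_1[page] + 1) / (n1 + 2)
        p0 = (liked_in_0[page] + 1) / (n0 + 2)
        weights[page] = math.log(p1 / p0)
    prior = math.log((n1 + 1) / (n0 + 1))
    return weights, prior

def score(likes, weights, prior):
    """Sum the evidence from every like; unknown pages contribute nothing."""
    return prior + sum(weights.get(page, 0.0) for page in likes)

def predict(likes, weights, prior):
    """Guess group 1 if the summed evidence is positive, else group 0."""
    return 1 if score(likes, weights, prior) > 0 else 0

weights, prior = train(TOY_USERS)
# A user whose likes point in both directions: the total decides.
print(predict({"A", "C", "D"}, weights, prior))
```

In this toy data, page "D" leans toward group 0, but a user who also "likes" "A" and "C" still gets sorted into group 1, because the other likes outweigh it.  No single page settles the question, which is the point the researchers were making.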
Of course, this opens up new avenues for data mining -- for good reasons and bad ones.  Expect targeted advertisement software to get a lot more sophisticated soon.  There could be more dire results, too.  "Similar predictions could be made from all manner of digital data, with this kind of secondary ‘inference’ made with remarkable accuracy -- statistically predicting sensitive information people might not want revealed," said Michal Kosinski, director of the study team.  "Given the variety of digital traces people leave behind, it’s becoming increasingly difficult for individuals to control...  I am a great fan and active user of new amazing technologies, including Facebook.  I appreciate automated book recommendations, or Facebook selecting the most relevant stories for my newsfeed.  However, I can imagine situations in which the same data and technology is used to predict political views or sexual orientation, posing threats to freedom or even life."
So, naturally, I had to go check out some of the things I'd "liked" on Facebook.  And no, unfortunately, "Curly Fries" wasn't one of them.  Here are a few of mine:
  • Beck
  • J. S. Bach
  • Fun
  • Angélique Kidjo
  • Fiona Apple (okay, I have pretty eclectic musical tastes)
  • Foucault's Pendulum
  • Richard Dawkins
  • Terry Pratchett
  • Lord of the Rings
  • Watership Down
  • The Usual Suspects
  • Vanilla Sky
  • The Matrix
  • Ruthless People
  • O Brother, Where Art Thou?
  • I "Heart" Huckabees
  • Dogma
  • Memento
  • Scotland, PA
  • The X Files
  • Arrested Development
  • Seinfeld
  • Northern Exposure
  • Scuba Diving
  • Wine Tasting
  • Travel
  • Writing
  • Music Performance
  • Kolibri Birdwatching Tours
  • This American Life
  • George Rodrigue (an artist I really like)
  • Cthulhu
  • The Tattoo Page
  • Americans Against Protestors at Military Funerals
So, okay.  I'm not seeing a pattern here.  I guess that's not surprising, really.  The software weighs each user's "likes" against the entire sample and comes up with a best guess, and however good the human brain is at spotting patterns, that kind of subtlety really requires a computer.  So other than a few obvious ones (anyone who makes a point of "liking" Richard Dawkins is pretty certain to be an atheist), it's no wonder that I don't see anything particularly pattern-like in my group of "likes."
Also, of course, the problem may just be that I don't "like" "Curly Fries."
