Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.
Showing posts with label Michal Kosinski. Show all posts

Tuesday, January 31, 2017

Tell me what you like

I always wince a little when I see those silly things pop up on Facebook that say things like, "Can you see the number in the pattern?  Only geniuses can!  Click 'like' if you see it, then share."  And, "Are you one of the 5% of people who can think of a city starting with the letter E?  Reply with your answers!"

I'm certainly no expert in online data analysis, but those seem to me to be obvious attempts to get people to click or respond for some purpose other than the (stupid) stated one.  People still share these things all over the place, much to my perplexity.

What I didn't realize is how deep this particular rabbit hole can go.  That is, until I read an article that came out last week in Motherboard called "The Data That Turned the World Upside Down," by Hannes Grassegger and Mikael Krogerus, which illustrates a far darker reason for worry regarding where we place our online clicks.

The article describes the science of psychometrics -- using patterns of responses to predict personalities, behaviors, even things like religious affiliation and membership in political parties.  Psychometric analysis used to rely on test subjects filling out lengthy questionnaires, and even then it wasn't very accurate.

But a psychologist named Michal Kosinski found a better way to do it, using data we didn't even know we were providing -- using patterns of "likes" and "shares" on Facebook.


Kosinski had discovered something groundbreaking -- that although one person's "likes" on Facebook don't tell you very much, when you look at aggregate data from millions of people, you can use what people click "like" on to make startlingly accurate predictions about who they are and what they do.   Grassegger and Krogerus write:
Remarkably reliable deductions could be drawn from simple online actions. For example, men who “liked” the cosmetics brand MAC were slightly more likely to be gay; one of the best indicators for heterosexuality was “liking” Wu-Tang Clan.  Followers of Lady Gaga were most probably extroverts, while those who “liked” philosophy tended to be introverts.  While each piece of such information is too weak to produce a reliable prediction, when tens, hundreds, or thousands of individual data points are combined, the resulting predictions become really accurate.
By 2012, Kosinski and his team had refined their model so well that it could predict race (95% accuracy), sexual orientation (88% accuracy), political party (85% accuracy), and hundreds of other metrics, up to and including whether or not your parents were divorced.  (I wrote about some of Kosinski's early results in a post back in 2013.)

The precision was frightening, and the more data they had access to, the better it got.  A study of Kosinski's algorithm showed that ten "likes" were sufficient to allow the model to know a person better than an average work colleague; seventy, and it exceeded what a person's friends knew; 150, what their parents knew; and 300, what their partner knew.  Studies showed that targeting advertisements on Facebook based on psychometric data resulted in 63% more clicks than did non-targeted ads.
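
To see why piling up weak clues can work so well, here's a toy calculation.  It is not Kosinski's actual model, which the article doesn't spell out; it's only a sketch under two loud assumptions: every "like" is an independent clue, and each clue points the right way with the same probability (the function name and the 55% figure are made up for illustration).  The question is how often a simple majority vote of those clues gets the answer right.

```python
from math import comb


def majority_vote_accuracy(n_signals: int, p_single: float) -> float:
    """Chance that most of n_signals independent clues point the right way.

    Assumes (hypothetically) that each clue is right with probability
    p_single and that clues are independent of each other.
    Use an odd n_signals so there are no ties.
    """
    needed = n_signals // 2 + 1  # smallest number of correct clues that wins
    return sum(
        comb(n_signals, k) * p_single**k * (1 - p_single) ** (n_signals - k)
        for k in range(needed, n_signals + 1)
    )


if __name__ == "__main__":
    # Each clue alone is barely better than a coin flip (55%, an invented figure).
    for n in (1, 11, 101, 301):
        print(n, round(majority_vote_accuracy(n, 0.55), 3))
```

Under those assumptions the accuracy climbs steadily as the number of clues grows, which is the same qualitative pattern the article describes.  Real "likes" are certainly not independent of one another, so the actual gains would differ.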

So it was only a matter of time before the politicians got wind of this.  Because not only can your data be used to predict your personality, the overall data can be used to identify people with a particular set of traits -- such as undecided voters.

Enter Alexander Nix, CEO of Cambridge Analytica, an online data analysis firm, and one of the big guns with respect to both the recent U.S. election and the Brexit vote.  Because Nix started using Kosinski's algorithm to target individuals for political advertising.

"Only 18 months ago, Senator Cruz was one of the less popular candidates," Nix said in a speech to political analysts in June 2016.  "Less than 40 percent of the population had heard of him...  So how did he do this?  A really ridiculous idea.  The idea that all women should receive the same message because of their gender—or all African Americans because of their race."

Nix went on to explain that through psychometrics, political candidates can create laser-focused appeals to specific people.  The approach became "different messages for different voters," and Donald Trump's team embraced the model with enthusiasm.  Grassegger and Krogerus write:
On the day of the third presidential debate between Trump and Clinton, Trump’s team tested 175,000 different ad variations for his arguments, in order to find the right versions above all via Facebook.  The messages differed for the most part only in microscopic details, in order to target the recipients in the optimal psychological way: different headings, colors, captions, with a photo or video...  In the Miami district of Little Haiti, for instance, Trump’s campaign provided inhabitants with news about the failure of the Clinton Foundation following the earthquake in Haiti, in order to keep them from voting for Hillary Clinton.  This was one of the goals: to keep potential Clinton voters (which include wavering left-wingers, African-Americans, and young women) away from the ballot box, to “suppress” their vote, as one senior campaign official told Bloomberg in the weeks before the election.  These “dark posts”—sponsored news-feed-style ads in Facebook timelines that can only be seen by users with specific profiles—included videos aimed at African-Americans in which Hillary Clinton refers to black men as predators, for example.
All in all, the Trump campaign paid between $5 and $15 million to Cambridge Analytica for their services -- the total amount is disputed.

Of course, it's impossible to know how much this swayed the results of the election, but given the amount of money Trump and others have spent to use this algorithm, it's hard to imagine that it had no effect.

All of which is not to say that you shouldn't "like" anything on Facebook.  Honestly, I'm unconcerned about what Alexander Nix might make of the fact that I like Linkin Park, H. P. Lovecraft, and various pages about running, scuba diving, and birdwatching.  It's more that we should be aware that the ads we're seeing -- especially about important things like political races -- are almost certainly not random any more.  They are crafted to appeal to our personalities, interests, and biases, using the data we've inadvertently provided, meaning that if we're not cognizant of how to view them, we're very likely to fall for their manipulation.

Wednesday, January 18, 2017

Honest vulgarity

*Note to the more sensitive members of the studio audience: as the subject of this post is profanity, there's gonna be some profane language herein.  Be thou forewarned.*

My dad had a rather ripe vocabulary, probably largely due to the 29 years he spent in the Marine Corps.  My mother, on the other hand, was strait-laced to the point that even saying the word "sex" in her presence resulted in a raised eyebrow and the Fear-Inducing Stare of Disapproval.  My dad solved this problem by inventing new swear words (such as "crudbug") or repurposing actual words for swearing (such as "fop").  When my mom would get on my dad's case about it, he would respond, completely deadpan, "Those aren't vulgar words, Marguerite," which was true in detail if not in spirit.

It's probably obvious by this juncture that I take after my dad a lot more than my mom.  I tend to have a pretty bad mouth, a habit I have to be careful about because my job involves guiding Tender Young Minds (although I think I could make a pretty good case that most of those Tender Young Minds have a worse vocabulary than I do).  But by this point in my life, my mom's litany of "the only people who need to use vulgar language are the ones who don't have any better words in their vocabulary to say" is ringing pretty hollow.  I may have a lot of faults, but I'm damn sure that a poor vocabulary is not amongst them.

[image courtesy of the Wikimedia Commons]

I tend to use swear words on two occasions -- for the humor value, and when I'm mad.  And to me, those are two very valid instances in which to let fly.  I still recall the great jubilation I felt when as a graduate student I first ran across John J. McCarthy's seminal paper on the linguistics of swearing, "Prosodic Structure and Expletive Infixation," in which we find out the rules governing inserting the word "fucking" into another word, and thus why it's okay to say "abso-fucking-lutely" but no one says "ab-fucking-solutely."

Even more cheering was the paper I just read yesterday by Gilad Feldman, Huiwen Lian, Michal Kosinski, and David Stillwell called "Frankly, We Do Give a Damn: The Relationship Between Profanity and Honesty" in which we find out that habitual swearers tend to be more honest, and which also should be the winner of the 2017 Clever Academic Paper Title Award.  The authors write:
There are two conflicting perspectives regarding the relationship between profanity and dishonesty.  These two forms of norm-violating behavior share common causes and are often considered to be positively related.  On the other hand, however, profanity is often used to express one’s genuine feelings and could therefore be negatively related to dishonesty.  In three studies, we explored the relationship between profanity and honesty. We examined profanity and honesty first with profanity behavior and lying on a scale in the lab, then with a linguistic analysis of real-life social interactions on Facebook, and finally with profanity and integrity indexes for the aggregate level of U.S. states.  We found a consistent positive relationship between profanity and honesty; profanity was associated with less lying and deception at the individual level and with higher integrity at the society level.
Besides the general finding that profanity is positively correlated with honesty, I thought the variation in profanity use state-by-state was absolutely fascinating.  Connecticut had the highest levels of swearing, followed by Delaware, New Jersey, Nevada, and New York (not too goddamn shabby, fellow New Yorkers, and I'm proud to have done my part in our state's fifth-place finish).  Utah came in dead last, followed by Arkansas, Idaho, South Carolina, and Tennessee.  One has to wonder if religiosity has something to do with this, given the Bible Belt status of most of the states at the bottom of the pile, but establishing any sort of causation was beyond the scope of this study.

Okay, so I'm coming across as self-congratulatory here, but I still think this research is awesome.  Given the amount of grief I got from my mom about my inappropriate vocabulary when I was a teenager, I think I'm to be allowed a moment of unalloyed pleasure at finding out that I and other habitual swearers are more likely to be honest.  So while I'll still have to watch my mouth at school, it's nice to know that my turning the air blue at home when I wallop my shin on the coffee table is just my way of honestly expressing that bone bruises hurt like a motherfucker.