Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.
Showing posts with label demographics. Show all posts

Monday, May 18, 2015

Knowing the score

One of the beautiful things about science is that it self-corrects.

If the data don't support the prevailing theory, you double-check the data to make sure you're not misinterpreting it, re-run the experiments to make sure they're well controlled, and then if you get the same results...

... you alter your model.

You have no other option.  Science is a method for understanding the world based upon logic and evidence.  If you accept that as a protocol for knowing the universe, you are dedicating yourself to following where it leads, even if you don't like the conclusions sometimes.

A pity, isn't it, that we can't introduce this approach into other fields?

Like, for example, education.  I'm a hard-core linguistics geek; in fact, my master's degree is in Scandinavian linguistics.  (Yes, I know, I teach biology.  It's a long story.)  So one of the things that galls me about education is the fact that we've known for decades that children learn languages better and more easily, and become more fluent, the earlier you start them.  Kids put into language immersion classes in preschool learn a second language nearly as easily as the first, without tedious memorization of vocabulary lists and conjugations.

And when do most school districts begin language classes?  Middle school.  Right around the time kids' brains start getting bad at learning languages.

We don't need no scholarly research, educational leaders seem to be saying.  We've done it this way for years, and it's working just fine.

So with that scientific-method approach to analysis in mind, let's look at a piece of research regarding the current fad -- high-stakes standardized tests, now being used to evaluate not only students, but teachers, schools, and entire school districts.


Christopher Tienken, associate professor of Education Administration at Seton Hall University, has just published an interesting, and disturbing, paper in the Journal of Scholarship and Practice.  In it, he shows that he can predict how a school population will perform on standardized tests, using only U.S. Census data.

You read that right.  Give Tienken information on demographics, ethnic makeup, socioeconomic status, community size, and so on, and he can tell you how the school will do on standardized tests before the students actually take them.  Tienken writes:
"In all, our regression models begin with about 18-21 different indicators.  We clean the models and usually end up with 2-4 indicators that demonstrate the greatest predictive power.  Then we enter those indicators into an algorithm that most fourth-graders, with an understanding of order of operations, could construct and calculate. Not complicated stuff.

"Our initial work at the 3rd-8th and 11th grade levels in NJ, and grades 3-8 in CT and Iowa have proven fairly accurate.  Our prediction accuracy ranges from 62% to over 80% of districts in a state, depending on the grade level and subject tested."

I hope you recognize how devastating this is to the claim that standardized tests tell you anything worth knowing about teacher competence.  If census data alone predict student performance, then how are "underperforming" teachers supposed to improve their scores?  Tienken's research implies that poor teachers will suddenly become more competent... if they move to a different district.
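
To give a sense of what a fourth-grade-arithmetic "algorithm" of this sort might look like, here is a minimal Python sketch.  Everything in it is made up for illustration: the indicator names, the weights, the intercept, the district figures, and the five-point margin are hypothetical and do not come from Tienken's paper.  It simply takes a weighted sum of a few community indicators and then counts what share of districts landed within the margin of the prediction.

```python
# Hypothetical illustration only: the indicator names, weights, intercept,
# and district figures are invented and are NOT taken from Tienken's paper.

def predict_percent_proficient(indicators, weights, intercept):
    """Return the intercept plus a weighted sum of community indicators."""
    return intercept + sum(weights[name] * indicators[name] for name in weights)


def share_within_margin(districts, weights, intercept, margin):
    """Return the fraction of districts whose actual score is within
    `margin` points of the predicted score."""
    hits = 0
    for district in districts:
        predicted = predict_percent_proficient(
            district["indicators"], weights, intercept
        )
        if abs(predicted - district["actual"]) <= margin:
            hits += 1
    return hits / len(districts)


# Invented weights for three invented census-style indicators.
WEIGHTS = {
    "pct_families_over_200k": 0.5,
    "pct_in_poverty": -1.0,
    "pct_with_bachelors": 0.25,
}
INTERCEPT = 60.0

# Invented districts: census-style indicators plus the "actual" percent proficient.
DISTRICTS = [
    {"indicators": {"pct_families_over_200k": 20, "pct_in_poverty": 5,
                    "pct_with_bachelors": 40}, "actual": 78},
    {"indicators": {"pct_families_over_200k": 4, "pct_in_poverty": 20,
                    "pct_with_bachelors": 16}, "actual": 44},
    {"indicators": {"pct_families_over_200k": 10, "pct_in_poverty": 10,
                    "pct_with_bachelors": 20}, "actual": 72},
    {"indicators": {"pct_families_over_200k": 0, "pct_in_poverty": 30,
                    "pct_with_bachelors": 8}, "actual": 35},
]

if __name__ == "__main__":
    print(share_within_margin(DISTRICTS, WEIGHTS, INTERCEPT, margin=5))
```

With these invented numbers, three of the four districts fall within five points of the prediction, so the script prints 0.75.  In a real study the weights would come from fitting a regression to actual census and test data, not from someone picking round numbers.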

Tienken doesn't mince words about the implications of his study:
"The findings from these and other studies raise some serious questions about using results from state standardized tests to rank schools or compare them to other schools in terms of standardized test performance.  Our forthcoming results from a series of school level studies at the middle school level produced similar results and raise questions about the appropriateness of using state test results to rank or evaluate teachers or make any potentially life-impacting decisions about educators or children."

Now, Tienken isn't saying that teachers make no difference; we all, I think, can attest to the power of a truly skilled teacher in making a difference to a child's life.  I had three teachers who stand out as having turned the course of my life in some way -- my high school biology and creative writing teachers, and my first-year college calculus teacher.  Each of them engendered in me a passion for learning and a fascination with the topic, such that I looked forward to each and every class and wanted more when I was done.

But the point is, this sort of thing is not measurable with a standardized test.  The real value of truly gifted teachers is their capacity for making learning relevant and engaging, making dry academic subjects come to life.  And whatever standardized tests are measuring -- a point no one, even the policy wonks at the state and federal Departments of Education, seems to be entirely clear on -- they certainly don't measure that.

So teacher evaluation, astonishingly enough, is best done by a competent administrator, who knows the teacher, the subject, and the students, not by some paper-and-pencil exam.  Who'd'a thought.

And it'd be nice if the people in charge would look at Tienken's research, and do a forehead smack, and say, "Wow!  We were wrong!  Better reconsider how we're applying standardized test scores!"  But given that scientific rules of validity and analysis don't seem to apply to education, I have the feeling that the result of Tienken's study will be: nothing.  We will almost certainly keep moving down the same road, letting test scores drive more and more decision-making, up to and including teacher salaries and retention.

Can't let a little thing like facts get in the way of educational reform, after all.

Saturday, October 5, 2013

Self-revelation through social media

There's a sense of being anonymous, or at least one step removed, on electronic media.  We post statuses, comments, and "tweets," and it feels very much like talking in an empty room -- as if there were little chance that anyone would hear what we're saying, that they'd pay attention if they did, or (especially) that we'd reveal something we didn't intend to.

Two recently released studies have shown that we are as transparent to others while online as we are in person -- perhaps more.


The first, done by H. Andrew Schwartz et al. of the University of Pennsylvania, is called "Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach."  In it, the researchers used computer software to analyze 700 million words, phrases, and topic instances from the Facebook status updates of 75,000 volunteers, who also agreed to take a battery of personality tests.  The software calculated word frequencies for all of the words in the statuses, and then matched up word frequencies with personality markers.  Here's a piece of what they concluded:
"Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’)."

Which thus far is interesting but not particularly alarming.  What I found more curious, and perhaps troubling, came up when I saw how many times word frequency could be related to other factors -- age, gender, degree of extroversion, and so on.   The age breakdown, I thought, was particularly interesting.  The 13 to 18 crowd unsurprisingly had "school," "homework," and "tomorrow" as their most common words; the clear winners from 19 to 22 were the various tenses and forms of the word "fuck"; by 23 to 28, there was a shift to "work," "office," "wedding," and "beer."  (The absence of swear words in this age bracket is likely to reflect an awareness of how public Facebook is, and not wanting to get fired for posting something inadvisable online.)
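
For the curious, the basic idea of matching word frequencies to a personality score can be sketched in a few lines of Python.  This is a deliberately simplified stand-in and not the method from the Schwartz et al. paper: it uses a plain Pearson correlation, and the users, statuses, and "trait_score" values are all invented for the example.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(text):
    """Map each word in `text` to its share of the total word count."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def word_trait_correlation(users, word):
    """Correlate each user's relative frequency of `word` with their trait score."""
    freqs = [relative_frequencies(u["statuses"]).get(word, 0.0) for u in users]
    scores = [u["trait_score"] for u in users]
    return pearson(freqs, scores)


# Invented users: made-up statuses and a made-up score on some hypothetical trait.
USERS = [
    {"statuses": "so tired of homework tired", "trait_score": 0.9},
    {"statuses": "tired again today", "trait_score": 0.7},
    {"statuses": "went hiking today", "trait_score": 0.2},
    {"statuses": "hiking and more hiking", "trait_score": 0.1},
]

if __name__ == "__main__":
    for word in ("tired", "hiking"):
        print(word, round(word_trait_correlation(USERS, word), 2))
```

In this toy data set, "tired" comes out positively correlated with the invented trait score and "hiking" negatively.  Scale that up to tens of thousands of people and every word they use, and you have the general flavor of the approach.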

Males of all ages have a great many macho words in their statuses, involving video games, movies, and sports.  Unsurprisingly, "fuck" makes a reappearance in the male statuses.  Women's statuses were almost stereotypically girly -- "shopping," "boyfriend," "love," "yummy," and "my hair" being some of the most common words.

While none of these were particularly surprising, I think this raises two questions -- one of them more serious than the other.  The less serious one is whether our online presence is more revealing of who we'd like to be seen as than of who we actually are -- after all, we create these statuses, so the macho masculine statuses and girly feminine ones are just projections, ghosts of real people that we've built and then put on public display.

A more serious concern is how this sort of thing could be turned against us.  Now, please don't think that I've suddenly turned conspiracy theorist; I'm not particularly worried that the government is going to start data mining my Facebook looking for some reason to lock me away.  But think of the usefulness of this to marketing firms, who are always looking for ways to hook into demographic information so that they can focus their ads better.

If we reveal who we are even by the word choice in our status updates, that certainly is going to be something that advertisers are going to use.

The second study used Twitter, and the author, Burr Settles, came up with an algorithm (again based on word use) to sort out "geek" tweets from "nerd" tweets.  As Settles sees it:
In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:
  • geek - An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.
  • nerd - A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.
Similar to the study by Schwartz et al., Settles tried to group words together that seemed to indicate something about the demographic that produced them -- resulting in a graph (you can take a look at it on the link posted above) that sorts our tweets out by character.

I see this one as a bit more lighthearted than the first study, but still, it says something very interesting; that we reveal ourselves online every time we post anything, whether we want to or not.

Of course, all of this made me go back and check my own status updates and tweets, just to see what I'd inadvertently told the world about myself.  Ignoring what most of my social media activity is about -- posting links to cool stuff -- I found, in the last couple of weeks, a status mourning the death of my 16-year-old cat, Puck; a status that described my elation at finding out that my high school creative writing teacher (who, amazingly, hasn't retired yet!) is teaching one of my novels in her English class this year; and a status about how much I enjoyed getting to see Laurie Anderson in concert.  As for tweets, I had to go a lot further back to hit one that wasn't just posting a link to something, but I did find one expressing frustration about the teacher evaluation scheme in New York State, and another one about the last day of school that used up a good many of the 140 characters allowed with the word "YIPPEEEEEEEEEEE!!!!"

My guess is that this trend of figuring out our demographic information from our electronic presence is only going to get more sophisticated.  Should we be worried?  My sense is probably not; just as with the conspiracy theorist's concern that the government is monitoring his whereabouts (and text messages and phone conversations), if they were doing this for everyone it would be such a mammoth amount of data that it would be impossible to manage.  At least for now, I think this sort of thing will only be of interest to marketing firms.

And it's not like they haven't already been doing this for years, starting out with targeting advertisements to particular demographics on television (compare the ads on daytime soap operas and the Syfy channel, if you want a particularly good example).  If you have a Facebook, check the ads along the sidebar -- no surprise that mine frequently have to do with travel, scuba diving, wine, and pets, is it?

So as long as we think before we post (which we should already be doing), this sort of thing may not make much difference, except in what sorts of things we're encouraged to purchase.  At least I hope so.  Last thing I want is the government keeping track of what concerts I go to.