Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.

Saturday, April 27, 2013

Breaking the lockstep of standardized tests

I've been an educator for 26 years.  During that time, I've handed out, proctored, and graded more quizzes and exams than I would even try to estimate.  Through it all, when asked why I give conventional tests (at times when I have a choice), my answer has been that they act as formative assessments -- allowing students and teachers to see how far their understanding of the subject at hand has progressed, and (importantly) to give them feedback on where the "holes" in their knowledge lie.

Standardized tests are defended with some of the same arguments, with the added one that (given that everyone is taking the same exam at the same time) it also allows administrators to judge how the school as a whole is performing.  In other words, it gives a basis for evaluating the entire system.

The Educational Testing Service, which is responsible for a large percentage of the standardized tests given in the United States, defends standardized testing in schools as having the following purposes:
  • Placement: Determine which courses or level of course a student should take.
  • Curriculum-based End of Course Testing: Determine whether students have mastered the objectives of the course taken.
  • Exit testing: Find out whether students have learned the amount necessary to graduate from a level of education.
  • Policy tools: Provide data to policymakers that helps them make decisions regarding funding, class size, curriculum adjustments, teacher development and more.
  • Course credit: Indicate whether a student should receive credit for a course he or she didn't take through demonstration of course content knowledge.
  • Accountability: Hold various levels of the education system responsible for test results that indicate if students learn what they should have learned.
All of these justifications, however, rest on a pair of assumptions.  These assumptions are rarely questioned, but if either one of them is false, that would be enough to call into serious question our increasing reliance on test scores.  These assumptions are:

 (1)  Test scores are an accurate measure of student understanding;

and (2) How well students do on tests is solely due to how well they're taught.

I have come to believe that both of these statements are wrong.

The flaw in Assumption #1 comes from the definition of the word "understanding."  What does it mean to "understand" something?  Does it mean that you can recall, and use correctly, the relevant vocabulary?  Does it mean that you can apply your knowledge in some practical way?  Does it mean that you can draw connections between that knowledge and your knowledge of other fields?  I would argue that traditional tests -- even well-designed ones -- measure vocabulary-related knowledge fairly well, but almost never measure practical application or creative divergent thinking.  To measure those would take a great deal of time -- far more time than teachers or students are ever given for testing purposes.

It brings up the question, too, of "how does understanding happen, and how do tests contribute to that understanding?"  In my experience, understanding is unpredictable and sudden, and it frequently comes out of collaborative problem solving; tests, as they're usually administered, almost never improve understanding in any way.  More often than not, test scores are looked upon as an end in themselves, not as a benchmark for growth or an opportunity to remediate.

A recent experiment by Peter Nonacs, a professor of behavioral ecology at UCLA, turned the whole exam model on its head by creating a novel testing environment.  Students were told a week ahead of time that they'd be allowed to "cheat" on a major exam.  They could do whatever they wanted during the exam, short of anything illegal.  They could bring in books, bring in notes, bring in a knowledgeable friend.  They could talk to each other, talk to students who'd taken the class before, call someone on their cellphone, leave the room to go consult a reference they'd forgotten.  They could ask the professor for hints (whether he provided them would be his decision).  They could work alone, work in groups, or have the whole class take part and turn in identical answers.  In short: it was a free-for-all.

The result?  Most of the students chose to collaborate.  They divided up the class into teams, and gave each team a piece of the test -- but the individual groups had to present their answers to everyone to make sure they were good enough.  They argued points, proposing solutions that were ranked for plausibility and eliminating weak arguments.

In the end, they turned in a strong, well-reasoned examination, and I would argue that they learned far more from that experience than they would have learned by studying, and testing, alone.  Nonacs writes, "In the end, the students learned what social insects like ants and termites have known for hundreds of millions of years. To win at some games, cooperation is better than competition. Unity that arises through a diversity of opinion is stronger than any solitary competitor."


Then there is Assumption #2: that somehow, test scores are well-correlated with what the classroom teacher is doing, and that teachers (and, by extension, the curriculum) are accurately assessed by how well students do on examinations.  If this were true, shouldn't there be far greater uniformity in the scores of students taught by the same teacher using the same curriculum?  Of course, the flaw in this idea is glaringly obvious to everyone who has spent any time teaching: students are not little empty vessels that we can fill with knowledge, and measure by opening up their brains at the end of the school year and seeing how much is still there.  They come with differences in their mental hardwiring, differences in attitude, differences in their emotional and physical maturity.  They have different home lives, different amounts of parental support, differences in the demands they deal with outside of school.  Some use drugs and alcohol.  Some are mentally ill or developmentally disabled.

And we pretend, for some reason, that a sufficiently trained and motivated teacher, using an excellent curriculum, can get all of these children to the same place at the same time.

Get real.

The problem is, oversight agencies haven't admitted that reality yet, so that is exactly what they do pretend.  The pressures to "succeed" in that impossible task (whatever form "success" would actually take) are incredible, and the penalties for failing are harsh.  More and more there is a push to tie teacher salaries and job retention to test scores, and to link educational funding for school districts to the pooled results on standardized examinations.

The result has been panic on the part of a lot of school administrators, and some of the solutions they have come up with have been byzantine, not to mention disheartening.  Just this week, the Broward County (Florida) School District proposed that the minimum grade for students be raised from 0 to 50.  Students would receive the same grade -- a 50 -- for doing half of the required work as they would for sleeping through class, every single day, for 180 days straight.

The argument by the school board is that it creates a safety net.  "It eliminates situations a child cannot possibly recover from, thus allowing them an opportunity," said Cynthia Park, the district's director of college and career readiness.  "Once they become hopeless, it's like why should I try?"

I would like to ask Ms. Park, however, if the real message here isn't that grades simply don't mean what educators have claimed that they mean, and that we need to reconsider our reliance on them.

But how can we change things?  Altering this model would take a complete overhaul of how we approach education, and it would be costly.  It would require administrators to let go of their demand that everything in student and teacher performance be turned into numbers.  It would require us to redefine what we mean by "learning," to include the kind of creative, collaborative problem solving that Professor Nonacs saw in his class.

But it might, perhaps, change the face of education, and pull us out of the downward spiral in which schools have been locked for decades, and create an environment where all children get the opportunity to learn the knowledge they need, and progress as fast as they are able.  It might free us from the lockstep march toward uniformity that insists on throwing away talent that it cannot, or will not, foster.

Is this utopian?  Why?  If our commitment is, as it should be, to create smart, versatile, creative individuals, we had better rethink what we're doing -- because the system, as it is, is not working.
