Thursday, October 14, 2010

Response to LA Times Article

This is our response to a recent Los Angeles Times article regarding school test scores, published on October 1, 2010.

Collateral Damage?: The Problems of Teacher Assessment
By Phillip Harris, Bruce Smith, & Joan Harris

"We've got to be able to identify teachers who are doing well [and] teachers who are President Obama said on September 27 in an interview with Matt Lauer on NBC's Today show. "And, ultimately, if some teachers aren't doing a good job, they've got to go."

Don't think too hard about it, and everything about education reform seems so simple, doesn't it? Find out who the ineffective teachers are, try to help them improve, and, if that fails, fire them. What could we possibly be overlooking?

For starters, let's look at the President's first point: distinguishing between teachers who are doing well and teachers who aren't. That should be easy enough. That's what the various value-added systems of evaluation seek to do: compare students' test scores early in the year with the same students' scores late in the year and, after some statistical legerdemain, voilà!: a measure of growth to judge a teacher's effectiveness. How simple it all seems to politicians and policy makers!
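To see just how bare the basic idea is, here is a toy sketch of the simplest possible "growth" calculation, our own illustration with made-up names and scores, not any district's actual model. Real value-added systems layer statistical adjustments on top of this; the toy version simply averages each class's fall-to-spring gain.

    # Toy illustration of the simplest "value-added" idea: average growth
    # (spring score minus fall score) for each teacher's class.
    # Hypothetical data; real models add many statistical controls.
    from statistics import mean

    # (teacher, fall score, spring score) for each student
    scores = [
        ("Teacher A", 410, 455),
        ("Teacher A", 380, 430),
        ("Teacher A", 500, 505),
        ("Teacher B", 420, 440),
        ("Teacher B", 390, 395),
        ("Teacher B", 470, 500),
    ]

    def average_gain(records, teacher):
        """Mean spring-minus-fall gain for one teacher's students."""
        return mean(spring - fall for t, fall, spring in records if t == teacher)

    for teacher in ["Teacher A", "Teacher B"]:
        print(teacher, round(average_gain(scores, teacher), 1))

Everything that makes real value-added modeling hard (who gets assigned to which class, missing scores, measurement error) is hidden inside the data this sketch takes for granted.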


But like most of what passes for reform in public education, this idea looks less and less promising the more you know about it. Like charter schools and merit pay (which also depends on finding a sound way to judge teaching performance), using value-added methods to improve the teaching force has a surface appeal that just doesn't stand up to close scrutiny. And anything that will affect the lives of so many teachers and so many of our children deserves at least that much scrutiny.


We've argued in our new book, The Myths of Standardized Tests, that the tests aren't very good at measuring real student achievement, or predicting future success, or motivating improvement, or even being objective. So using these flawed measures for value-added assessment, a purpose they weren't designed for, just seems way off base. But we're not assessment experts, so maybe we're missing something.


Here's what those who know best say. Five years ago, Henry Braun, then at ETS and now at Boston College, argued that value-added assessment wasn't yet ready for prime time -- and might never be the panacea some of its proponents hoped for. Now, just three weeks before the President sat down with Matt Lauer, Eva Baker of the National Center for Research on Evaluation, Standards, and Student Testing at UCLA and a list of co-authors that constitutes a who's who of assessment issued a report titled "Problems with the Use of Student Test Scores to Evaluate Teachers" (Economic Policy Institute, 2010). In that report, the co-authors cited the non-random assignment of students and teachers, the failure to distinguish the contributions of multiple teachers over time, and the instability of ratings from year to year for the same teacher as problems that made value-added methods an unwise choice, at least for the time being. We think that if the instability problem can't be resolved, the whole effort becomes a crapshoot.
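The instability point is easy to check, at least in principle. Here is a hypothetical sketch, with invented numbers and not the EPI authors' actual analysis, of the kind of quick test a district could run: correlate the same teachers' value-added ratings in two consecutive years. When that correlation is low, this year's "effective" teacher too often looks mediocre or worse next year.

    # Toy check of year-to-year stability: correlate the same teachers'
    # value-added ratings in two consecutive years. Numbers are invented.
    from statistics import correlation  # requires Python 3.10+

    ratings_2009 = [0.8, -0.3, 0.1, 0.5, -0.6, 0.2]   # one rating per teacher
    ratings_2010 = [0.1, 0.4, -0.5, 0.6, -0.2, -0.1]  # same teachers, next year

    r = correlation(ratings_2009, ratings_2010)
    print(f"Year-to-year correlation: {r:.2f}")  # a value near zero signals unstable ratings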


But using a complex assessment mechanism for purposes it was never designed to serve is always fraught with problems and unintended consequences. Already the blowback has begun. With the LA Times' recent publication of the test scores of students linked to individual teachers and schools, we now have the apparent suicide of Rigoberto Ruelas, Jr., a fifth-grade teacher who was upset that his scores were not higher. Described by former students as someone who "took the worst students, and tried to change their lives," Ruelas has now lost his own. Collateral damage?


Phillip Harris is Executive Director of the Association for Educational Communications & Technology. He is the former Director of the Center for Professional Development at Phi Delta Kappa International and was a member of the faculty of Indiana University for 22 years, serving in both the Psychology Department and the School of Education. He is the author of The Myths of Standardized Tests: Why They Don't Tell You What You Think They Do (December 2010), with co-authors Bruce Smith and Joan Harris.

Response to the New York Times Report on School Performance

This is our response to the article titled "On New York School Tests, Warning Signs Ignored," which was published on October 10, 2010, in the New York Times.

October 13, 2010
Heads Up, Mr. Mayor!
By Phillip Harris and Bruce Smith

No doubt it comes as a shock to Glenn Beck fans and to many politicians and policy makers, including those who run our nation's school systems, but experts really do know a thing or two about their areas of expertise. Could it hurt to pay some attention to them?


One obvious example came to the fore this past week when the New York Times ran a longer-than-usual story on the release of test scores for New York City schools. Headlined "On New York School Tests, Warning Signs Ignored," the story by Jennifer Medina makes a real effort to tell a complicated tale of numbers-based accountability gone rogue, with warnings from experts ignored and another unholy marriage of political ambition and good intentions. "The mayor uses data and metrics to determine whether policies are failing or succeeding," says Howard Wolfson, deputy mayor for government affairs and communications. Sounds like a good idea -- until you come down from the rhetorical clouds and see how that's worked out for the schools.

We have argued that a technology as limited as standardized testing could never be expected to give a fair and complete picture of student learning, teacher performance, or the success of any school system, much less one as large and complex as New York City's (see The Myths of Standardized Tests). But even if you disagree with us, when you reward or punish schools and educators on the basis of test scores, you should expect the scores to rise -- and rapidly. To everyone's consternation but no one's surprise, that's just what happened in the Big Apple.


The downside, well-known to social scientists, is that when you tie important consequences to a quantitative measure, you will "corrupt" the measure. In this case, the test scores will be artificially "inflated." That doesn't necessarily mean that anyone broke any laws or did anything morally questionable. When any district uses a similar form of a test for a number of years in a row, the scores of its students will rise. That's score "inflation," it's predictable, and it means that the indicator -- i.e., the test -- is "corrupted." It no longer gives a valid measure of the student skills it was supposed to measure.


How does this come about if no one is blatantly cheating? Almost all teachers really do care about their students. So when teachers are familiar with the form of a test that's used year after year, they may adapt some of their teaching, consciously or not, to help their students perform better. If schools, teachers, or students are rewarded or punished according to the scores -- that is, if the stakes are high -- then teachers will try even harder to prepare their students for the tests. Throw in the national industry that creates and markets test-preparation materials that many schools use, and you have the perfect incubator for score inflation: familiar tests, high stakes, and organized preparation efforts.


Now a confusing mishmash of misunderstanding has the mayor and the school chancellor defending scores that are so high that ordinary observers -- never mind testing experts -- suspect that they must be "inflated." Does it seem likely that 82% of the city's students were proficient in math in 2009? What were your city's scores? Now when a bona fide testing expert, Harvard's Daniel Koretz, proposes a plan to "audit" the tests and get a sound measure of the score inflation, so as not to deceive the public, he and his colleagues are turned down, more than once. Meanwhile, Hizzoner goes on claiming a record of success in running the city's schools. Maybe so, maybe no. Maybe the mayor should allow a disinterested look at his chosen measuring stick.


Phillip Harris is Executive Director of the Association for Educational Communications & Technology. He is the former Director of the Center for Professional Development at Phi Delta Kappa International and was a member of the faculty of Indiana University for 22 years, serving in both the Psychology Department and the School of Education. Bruce Smith was a member of the editorial staff of the Phi Delta Kappan, the flagship publication of Phi Delta Kappa International, the association for professional educators. They are co-authors with Joan Harris of The Myths of Standardized Tests: Why They Don't Tell You What You Think They Do.