Grading the WASL -- Part 2
Aired Friday, May 26, 2006
By Liam Moriarty
Anchor Lead:
Yesterday, on The Learning Curve, KPLU's Liam Moriarty spoke with former employees of the company that scores
standardized tests for more than 20 states, including Washington. Today, he looks at whether the problems those
scorers encountered could happen with the WASL.
Full Story Text:
It was The Test Question from Hell.
KOUDELA: I felt that what I saw at the test scoring center was devastating.
In the spring of 2003, Burien resident John Koudela took a temp job in Auburn, scoring
standardized tests for the nation's top scoring company, NCS Pearson. He and his coworkers graded the Ohio
Proficiency Test, Ohio's equivalent of the WASL. Almost immediately, one essay question on the 6th grade
science test started going badly wrong.
KOUDELA: It started creating a stress level that got to be really acute.
The variety of students' responses to the complex and wide-ranging question made
it necessary to add page after page of revisions to the scoring key. Confused scorers were constantly
turning to their supervisors for guidance. Eventually there were 27 handwritten pages of notes that
workers had to wade through to score the question. Koudela is convinced many students did not get
accurate scores.
KOUDELA: The vast majority of test scorers cannot maintain a consistency level
long enough to give every single student a fair score.
The experience soured him on standardized tests and when the job was over, he
declined an offer to stay on to score the next project, the WASL. Koudela approached KPLU because
he was concerned that what he saw as a scorer on the Ohio test might be happening on the WASL,
as well. So, we went straight to the source.
TAYLOR: I've been involved with the WASL pretty much since the beginning.
We visited Professor Cathy Taylor in her office at the University of Washington.
She's an expert in educational testing and has had a big hand in creating the WASL.
TAYLOR: From what are the test questions going to look like to how are we going to
structure the test to what kind of scoring is going to be used.
We showed her the Ohio science test item that caused John Koudela and his co-workers
so much grief. And we laid it side-by-side with a comparable WASL question. Then we asked, what's to keep
Washington students taking the WASL from having to face the same kind of mess the Ohio students did?
TAYLOR: This wouldn't happen. Because we wouldn't have such an open-ended question
that could be interpreted in so many ways by so many children. It's too vague, the question is too vague.
It's a bad question.
Taylor points out that WASL questions typically have diagrams, charts or other
graphic elements that focus the question and make it easier for students to understand.
TAYLOR: You have to have questions that tell students explicitly what they're
being asked to do, and so that they will give you what you want to see.
By developing the questions well, Taylor says, Washington avoids a lot of problems
later during scoring. That makes for a long trip between coming up with the idea for a test question
and that question finally showing up on an actual test.
WILHOFT: Typically about 18 months of item development before an item actually
appears on the WASL and is presented to youngsters and is counted for points.
That's Joe Wilhoft. He's in charge of public school testing for the state of
Washington. Wilhoft says first, a potential test question is written by a committee of Washington
teachers and other educators. Then a Content Review Committee looks it over. After another review for
bias, our prospective question is ready for a test drive. That means putting it in an actual WASL, but
not counting it toward the students' scores. Wilhoft says it's surprising how often a question can pass
through all this adult review but still run into problems once it's put before youngsters.
WILHOFT: Maybe it's confusing to them in ways we weren't aware of, or maybe
there are alternate responses or alternate solutions that we had not anticipated.
Then comes a process called "range finding," which establishes the range of
correct answers and what scores they'll be given. Now, the new test question is ready for prime time.
Wilhoft says the test scoring contractor has multiple checks and balances to catch and correct any
issues that arise during actual scoring. That might sound pretty airtight. But Jim Popham has a
word of caution. He's written a number of well-regarded books on educational testing.
POPHAM: Educational testing is far less exact than most people think it is. Many
more mistakes are made in gauging whether students acquire certain skills. And once they recognize that,
then they should be more reluctant to use it for the kinds of high-stakes decisions that they're often
employing test results for.
Popham is professor emeritus at UCLA. He gives Washington credit for heading off
the kind of mess seen on the Ohio science test. But, he says, Washington's test is not universally admired.
POPHAM: I spend a lot of time in many states around the country. Never is the
WASL held up as an example of what a first-rate test ought to be. You never see people saying, "Oh, we
want to have our test more WASL-like."
Joe Wilhoft has an answer for anyone with questions about the WASL. He urges them to
take a look at the test questions the state has released on the internet, and make up their own minds.
Liam Moriarty, KPLU News.