The Learning Curve



 
KPLU 88.5

Grading the WASL -- Part 2

Anchor Lead: Yesterday, on The Learning Curve, KPLU's Liam Moriarty spoke with former employees of the company that scores standardized tests for more than 20 states, including Washington. Today, he looks at whether the problems those scorers encountered could happen with the WASL.

Full Story Text:

It was The Test Question from Hell.

KOUDELA: I felt that what I saw at the test scoring center was devastating.

In the spring of 2003, Burien resident John Koudela took a temp job in Auburn, scoring standardized tests for the nation's top scoring company, NCS Pearson. He and his coworkers graded the Ohio Proficiency Test, Ohio's equivalent of the WASL. Almost immediately, one essay question on the 6th grade science test started going badly wrong.

KOUDELA: It started creating a stress level that got to be really acute.

The variety of students' responses to the complex and wide-ranging question made it necessary to add page after page of revisions to the scoring key. Confused scorers were constantly turning to their supervisors for guidance. Eventually there were 27 handwritten pages of notes that workers had to wade through to score the question. Koudela is convinced many students did not get accurate scores.

KOUDELA: The vast majority of test scorers cannot maintain a consistency level long enough to give every single student a fair score.

The experience soured him on standardized tests, and when the job was over he declined an offer to stay on to score the next project, the WASL. Koudela approached KPLU because he was concerned that what he saw as a scorer on the Ohio test might also be happening on the WASL. So we went straight to the source.

TAYLOR: I've been involved with the WASL pretty much since the beginning.

We visited Professor Cathy Taylor in her office at the University of Washington. She's an expert in educational testing and has had a big hand in creating the WASL.

TAYLOR: From what are the test questions going to look like to how are we going to structure the test to what kind of scoring is going to be used.

We showed her the Ohio science test item that caused John Koudela and his co-workers so much grief, and we laid it side by side with a comparable WASL question. Then we asked: what's to keep Washington students taking the WASL from having to face the same kind of mess the Ohio students did?

TAYLOR: This wouldn't happen. Because we wouldn't have such an open-ended question that could be interpreted in so many ways by so many children. It's too vague, the question is too vague. It's a bad question.

Taylor points out that WASL questions typically have diagrams, charts or other graphic elements that focus the question and make it easier for students to understand.

TAYLOR: You have to have questions that tell students explicitly what they're being asked to do, and so that they will give you what you want to see.

By developing the questions well, Taylor says, Washington avoids a lot of problems later during scoring. That makes for a long trip between coming up with the idea for a test question and that question finally showing up on an actual test.

WILHOFT: Typically about 18 months of item development before an item actually appears on the WASL and is presented to youngsters and is counted for points.

That's Joe Wilhoft. He's in charge of public school testing for the state of Washington. Wilhoft says first, a potential test question is written by a committee of Washington teachers and other educators. Then a Content Review Committee looks it over. After another review for bias, our prospective question is ready for a test drive. That means putting it in an actual WASL, but not counting it toward the students' scores. Wilhoft says it's surprising how often a question can pass through all this adult review but still run into problems once it's put before youngsters.

WILHOFT: Maybe it's confusing to them in ways we weren't aware of, or maybe there are alternate responses or alternate solutions that we had not anticipated.

Then comes a process called "range finding," which establishes the range of correct answers and the scores they'll be given. Now the new test question is ready for prime time. Wilhoft says the test scoring contractor has multiple checks and balances to catch and correct any issues that arise during actual scoring. That might sound pretty airtight. But Jim Popham has a word of caution. He's written a number of well-regarded books on educational testing.

POPHAM: Educational testing is far less exact than most people think it is. Many more mistakes are made in gauging whether students acquire certain skills. And once they recognize that, then they should be more reluctant to use it for the kinds of high-stakes decisions that they're often employing test results for.

Popham is professor emeritus at UCLA. He gives Washington credit for heading off the kind of mess seen on the Ohio science test. But, he says, Washington's test is not universally admired.

POPHAM: I spend a lot of time in many states around the country. Never is the WASL held up as an example of what a first-rate test ought to be. You never see people saying, "Oh, we want to have our test more WASL-like."

Joe Wilhoft has an answer for anyone with questions about the WASL. He urges them to take a look at the test questions the state has released on the internet and make up their own minds.

Liam Moriarty, KPLU News.


