School Test Givers Face Their Own Test

By Helen Zelon.

Published May 25, 2010

Earlier this month students took the New York State Math and English Language Arts (ELA) tests. More than a million kids in grades 3 through 8 spent parts of two or three days answering multiple-choice items and questions that ask them to provide a written response or solve a math problem.

Students are classified by how many points they get. Answer only a few, and Johnny has serious academic problems (Level 1); correctly answer one or two more, and Jenny partially meets state learning standards (Level 2); pass the Level 3 cutoff score, and Jesse is deemed to meet them.

Level 3—considered a criterion of proficiency—has become the goal line. Districts and schools are evaluated and administrators are judged on the percentage of students crossing it. Amid the federal No Child Left Behind Act’s insatiable demand for progress, more New York State students are reaching Level 3 each year.

But Regents Chancellor Merryl Tisch has tried to distance the educational establishment from 2009’s record-setting achievement levels—77.4 percent of students scored at or above Level 3 on English language arts and 86.4 percent did so on math. Her admonitions about the need for honest data and more rigorous testing are an implicit admission of testing deficiencies and how ludicrous the results have become.

Faced with criticism that the exams have gotten easier over the past eight years, the State Education Department (SED) says the 2010 version of the tests will be tougher to pass (broader and less predictable), raising the achievement bar for students.

Before we accept promises of reform, however, a formal inquiry into key issues is needed or SED and its perennial test vendor, CTB/McGraw-Hill, will spin us through another cycle of preposterous testing and damaging results. Truth about the past must precede trust in the future.

Let’s start with the cutoff points—the raw scores that partition students into categories of achievement. Who set the final cut scores on the 2009 tests, and were they established before or after the tests had been given?

CTB’s technical reports describe a process in which the State Education Department makes the final recommendation, with the state taking “various educational policies” into consideration during the final adjustment process. This language suggests Albany has had discretion in setting cut points and that its decisions could reflect political and budgetary calculations, and not just educational principles.

Placement of the cut scores is pivotal to every test purpose. In New York City, the ELA and math results are used to award bonuses to principals and teachers whose classes do well; to grade schools and identify “failing” ones that are subject to closure for poor performance; and to hold back students who don’t reach Level 2. Test data are also being incorporated into formulas to appraise teacher effectiveness.

While the setting of Level 3 cut scores—which ostensibly indicate “grade-level” proficiency—warrants study, it’s imperative that the cut points dividing Level 1 from Level 2 also become the focus of investigation.

Across all tests for all grades, there was not a single instance in which a higher ELA or math score was needed to attain Level 2 in 2009 than in 2008.

Students could reach Level 2 on many 2009 exams by guessing right on a quarter of the multiple-choice items and (as CTB’s item analysis data show) receiving an automatic point on each essay question. How could SED identify who had serious academic problems on this flimsy basis?

In 2008, 6th graders needed an ELA score of 11 out of 39 points to make Level 2; in 2009 they needed only 7. That is a drop of 4 raw points, or 10.3 percent of the 39 points available, and it lowered the cutoff itself by more than a third.
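The 10.3 percent figure can be read two ways, so the arithmetic is worth spelling out. The short calculation below is only an illustrative sketch: the function name is hypothetical, and the only inputs are the figures quoted above (cut scores of 11 and 7 on a 39-point test).

```python
def cut_score_drop(old_cut, new_cut, max_points):
    """Describe a cut-score change three ways (hypothetical helper)."""
    points = old_cut - new_cut                   # raw points removed
    share_of_scale = 100 * points / max_points   # as a share of all possible points
    relative = 100 * points / old_cut            # as a share of the old cutoff
    return points, share_of_scale, relative


# Grade 6 ELA figures quoted in the article: 11 (2008) vs. 7 (2009) out of 39.
points, share_of_scale, relative = cut_score_drop(11, 7, 39)
print(f"{points} raw points")
print(f"{share_of_scale:.1f} percent of the 39-point scale")
print(f"{relative:.1f} percent of the 2008 cutoff")
```

Measured against the whole 39-point scale, the drop is 10.3 percent; measured against the 2008 cutoff of 11, it is 36.4 percent.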

Statewide from 2008 to 2009, 6th grade ELA Level 1s fell from 3,472 to 271 out of 198,000 students. In New York City, Level 1s shrank from 1,941 to 146 out of 70,000. The design of the 6th grade ELA test and its cut score guaranteed that students would be given a Get-Out-of-Level 1 pass.
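To put those counts in proportion, the same numbers can be converted to rates. This is a rough sketch, not an official calculation: it assumes the rounded enrollment figures quoted above (198,000 statewide, 70,000 in New York City) apply to both years, and the helper name is hypothetical.

```python
def level1_rate(level1_count, tested):
    """Percent of tested students classified as Level 1 (hypothetical helper)."""
    return 100 * level1_count / tested


# (Level 1s in 2008, Level 1s in 2009, approximate students tested),
# all taken from the figures quoted in the article.
figures = {
    "Statewide": (3472, 271, 198_000),
    "New York City": (1941, 146, 70_000),
}

rates = {}
for region, (n2008, n2009, tested) in figures.items():
    rate_2008 = level1_rate(n2008, tested)
    rate_2009 = level1_rate(n2009, tested)
    decline = 100 * (n2008 - n2009) / n2008  # percent fewer Level 1s
    rates[region] = (rate_2008, rate_2009, decline)
    print(f"{region}: {rate_2008:.2f}% -> {rate_2009:.2f}% "
          f"({decline:.1f}% fewer Level 1s)")
```

On those assumptions, the Level 1 rate goes from about 1.75 percent to 0.14 percent statewide and from about 2.77 percent to 0.21 percent in the city, a decline of roughly 92 percent in each case.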

No well-developed test can set forth decision points that are passable by chance alone. Yet that’s what happened here to the detriment of students most in need of help and services. This is the sign of a defective instrument or an injurious, fraudulent testing process, or both.

If the state and its test-maker lowered the Level 2 cutoffs before the 2009 tests were given, how could they justify the predictable outcome—that in several grades, students with severe reading/writing deficits would be denied adequate support because the possibility of landing in Level 1 had been all but eliminated?

If cut points were set after the operational test was given, that’s disturbing too. By then, the score distributions were known to the state and publisher, telling them the exact number of students who’d be classified as Level 1 and Level 2, depending on where the cuts were established. This renders cut point adjustments subject to the whims of expedience.

The largest decrease in the ELA Level 2 cut score occurred in Grade 8. It dropped from 19 to 13. Had it remained 19, more than 7,600 students would have been classified as Level 1s and should have been given programs and resources to meet their educational needs.

SED spokesmen and test people maintain that cut scores were lowered because the items got harder. Where is their evidence? The facts will remain elusive until definitive information is submitted to or compelled by an independent agency empowered to examine all aspects of the testing program.

I ask Chancellor Tisch to address what took place in 2009. And right now, in this new age of integrity, there’s no need for the 2010 cut scores to remain a mystery. I propose a simple step that will tell us whether testing is undergoing change we can believe in: Chancellor, in the next few days, if the 2010 Level 2 and Level 3 cutoff points have been set, release them to the public before the tests have been scored. Nothing is compromised by disclosing the information. It’s the first step toward more honest and open testing.

The public, parents and educators deserve truthful answers from the same Albany and CTB partners who have produced the state testing program since the millennium began. Without being held accountable for their actions, they are boldly moving into another decade of finding ingenious ways to leave no shortcut behind.