The time has come to scrub New York’s field testing program before launch next week. The State Education Department is reverting to stand-alone field tests, an approach it acknowledged to be fundamentally flawed in 2009.

Stand-alone field tests are self-contained. Students take them so the publisher can try out new items for future use. Results, good or bad, have no direct consequences for the students.

SED wants to try out new questions on 488,000 students in grades three through eight at 4,078 schools statewide, including 1,029 New York City public schools. This non-mandated testing is in addition to the embedded field test questions that were planted inside April’s statewide exams.

The aim is to create a pool of items the publisher, NCS Pearson, can draw upon to assemble next year’s English and math exams. The quality of next year’s exams therefore hinges on how well students in the field test sample represent the population for whom the next round of exams is being developed—not only in composition, but also in their motivation to do well on try-out questions.

And motivation is the rub.

Common sense tells us that students, having endured countless testing drills during the year capped off by April's grinding high-stakes exams, will not strive to perform well in June, still less on experimental tests they know won't count against them.

Recent history sounds the alarm. In 2009, when a different company was producing New York's exams, the results reached implausible heights. Regents Chancellor Merryl Tisch thought the scores were suspicious. "We have to stop lying to our kids," she said.

Test experts quietly observed that the underlying trouble was with separate field testing, exactly what's happening next week, because it provided inaccurate information. Students taking the field tests knew they weren't real and didn't give their all. Their indifferent effort and resulting poor performance on the field tests led SED and the test-maker to see the items as relatively difficult and to underestimate the level of success students would truly reach when try-out questions went onto the real (a.k.a. operational) exams.

Duped by the data, they lowered the score needed to pass. When students took the operational exams seriously, they performed better than projected, and extremely high percentages were deemed proficient.

That sequence moved SED to adopt the “embedded field testing” approach taken in April—in which try-out questions are interspersed in the same test booklets with questions that count. Students can’t tell the difference, so they should try hard on all of them.

An SED memo to superintendents said the benefits of embedding multiple-choice questions included “a better representation of the student population and more reliable field test data.”

But SED did not embed enough items. While the exams took considerably longer to administer, they didn’t yield sufficient material from which to construct future exams.

To overcome this deficiency, SED and Pearson have altered their five-year, $32 million agreement in order to impose many different tests next week, among them 12 that contain reading passages with multiple-choice items.

Meanwhile, SED and Pearson have kept parents and the public in the dark. This year's Field Tests School Administrators Manual tellingly omits a single sentence that appeared in last year's edition: "Parents should be informed of the dates and the purpose of the tests."

Why the secrecy surrounding these non-mandatory tests? Does SED fear a backlash of resistance?

Perhaps they should. A growing number of parents have begun to see the pernicious impact testing overkill has had on education. They’re tired of hearing that next year the program will get it right. Parents don’t want to keep putting their children through the testing wringer. They are saying “Enough!” and questioning the validity of such blunt instruments in judging students, teachers and schools.

The return to stand-alone field testing only sharpens their appropriate concerns. Surely both SED and Pearson know that stand-alone field testing takes New York back to a costly, unviable exercise that cannot produce solid information because student motivation will be lacking.

An independent investigation is long overdue. It’s finally time to forcefully correct the disastrous course we are on.