Earlier this month, students took the New York State Math and English Language Arts (ELA) tests. More than a million kids in grades 3 through 8 spent parts of two or three days answering multiple-choice items and questions that ask them to provide a written response or solve a math problem.

Students are classified by how many points they get. Answer only a few, and Johnny has serious academic problems (Level 1); correctly answer one or two more, and Jenny partially meets state learning standards (Level 2); pass the Level 3 cutoff score, and Jesse is deemed to meet them.

Level 3—considered a criterion of proficiency—has become the goal line. Districts and schools are evaluated and administrators are judged on the percentage of students crossing it. Amid the federal No Child Left Behind Act's insatiable demand for progress, more New York State students are reaching Level 3 each year.

But Regents Chancellor Merryl Tisch has tried to distance the educational establishment from 2009's record-setting achievement levels—77.4 percent of students scored at or above Level 3 on English language arts and 86.4 percent did so on math. Her admonitions about the need for honest data and more rigorous testing are an implicit admission of testing deficiencies and how ludicrous the results have become.

Faced with criticism that the exams have gotten easier over the past eight years, the State Education Department (SED) says the 2010 version of the tests will be tougher to pass (broader and less predictable), raising the achievement bar for students.

Before we accept promises of reform, however, a formal inquiry into key issues is needed, or SED and its perennial test vendor, CTB/McGraw-Hill, will spin us through another cycle of preposterous testing and damaging results. Truth about the past must precede trust in the future.

Let's start with the cutoff points—the raw scores that partition students into categories of achievement. Who set the final cut scores on the 2009 tests, and were they established before or after the tests had been given?

CTB's technical reports describe a process in which the State Education Department makes the final recommendation, with the state taking "various educational policies" into consideration during the final adjustment process. This language suggests Albany has had discretion in setting cut points and that its decisions could reflect political and budgetary calculations, and not just educational principles.

Placement of the cut scores is pivotal to every test purpose. In New York City, the ELA and math results are used to award bonuses to principals and teachers whose classes do well; to grade schools and identify "failing" ones that are subject to closure for poor performance; and to hold back students who don't reach Level 2. Test data are also being incorporated into formulas to appraise teacher effectiveness.

While the setting of Level 3 cut scores—which ostensibly indicate "grade-level" proficiency—warrants study, it's imperative that the cut points dividing Level 1 from Level 2 also become the focus of investigation.

Across all tests for all grades, there were no instances where a higher ELA or math score was needed to attain Level 2 in 2009 than in 2008.

Students could reach Level 2 on many 2009 exams by guessing right on a quarter of the multiple-choice items and (as CTB's item analysis data show) receiving an automatic point on each essay question. How could SED identify who had serious academic problems on this flimsy basis?