COE Professor Joni Lakin researches dangers of “high stakes” school testing

February 19, 2016

President Obama, a long-time advocate of school accountability with a heavy emphasis on testing, recently made a surprising announcement — American students are tested too much and we should re-examine our emphasis on testing. The United States Department of Education released a plan calling for “fewer and smarter assessments.”

To underscore his point, Obama said the great teachers who shaped his life didn’t emphasize testing, but “taught me to believe in myself, to be curious about the world, to take charge of my own learning so that I could reach my full potential…”

Dr. Joni Lakin conducts research into the use and misuse of testing.

Dr. Joni Lakin, an Assistant Professor in the College of Education’s Department of Educational Foundations, Leadership and Technology, took notice of this unexpected event. Lakin is a testing expert who conducts research into the use and misuse of testing. Before coming to Auburn, Lakin worked at Educational Testing Service.

“I am a validity researcher,” she explained. “I do foundational and fairness research. In other words, I evaluate tests in order to see if they are biased in some particular way, or perhaps whether the test is asking the right questions to the right audience. Professionals in the testing world think tests have a place and are valuable, but they can be misused.”

In her undergraduate work at Georgia Tech, Lakin worked in a psychology lab studying human ability and predicting outcomes, which included creativity, motivation, and intelligence. In her graduate work at Iowa, she helped write intelligence tests and focused on test development.

“When I was in graduate school, No Child Left Behind (NCLB) was the nation’s educational policy for public schools. We focused on how that bill took high-stakes testing to another level,” she said. “NCLB put a lot of pressure on teachers and tied school appropriations to the success of test results, which was a major way teachers and administrators were then evaluated.”

However, testing experts began noticing that instead of ensuring that no child was left behind, schools were focusing on a very narrow slice of the student population.

“The talk was all ‘bubble students,’” she said. “The idea was to move everyone up between categories of proficiency, so the kids who were on the bubble got all the attention to get them over the cut score. That’s good in one way, but if you were far above or far below the cut, you were basically irrelevant. Ultimately, the people who designed NCLB knew full well that not every school could hit 100 percent. So teachers had to tread water and teach to the test and wait and see when the law would change.”

At this point, all of the states have waivers to use the Obama Administration’s Race to the Top-type systems instead of NCLB. Instead of proficiency, the focus now is on whether students are “on track.”

“In the world of developing tests, we want to write a test that is worth teaching to,” Lakin said. “Great tests engage students, of course, but you also must create a test that cannot be gamed and that leads teachers to focus on real skills.”

For example, Lakin cites how teachers can “game” a question on the Pythagorean Theorem. Instead of teaching the theorem, teachers started teaching the 3-4-5 triangle. Students knew the answer was 5. In this case, students are not learning critical math skills, but narrow test-taking strategies.

“Why would a teacher do this?” Lakin asks rhetorically. “Because they are being evaluated on how many students answer the question correctly on the test! So there are problems all over the place with such high-stakes testing.”
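The 3-4-5 shortcut Lakin describes can be made concrete with a short Python sketch. It is an illustration only: the function names and the second triangle (legs 5 and 12) are assumptions chosen for the example, not items from any actual test.

```python
import math

def hypotenuse(a, b):
    """Apply the Pythagorean theorem: c = sqrt(a^2 + b^2)."""
    return math.hypot(a, b)

def memorized_answer(a, b):
    """Hypothetical 'gamed' strategy: always answer 5, as for the 3-4-5 triangle."""
    return 5.0

# The memorized answer matches the theorem only on the rehearsed item.
for legs in [(3, 4), (5, 12)]:
    print(legs, hypotenuse(*legs), memorized_answer(*legs))
```

A student relying on the memorized answer is right for legs 3 and 4 but wrong as soon as the legs change, which is the gap between a test-taking strategy and the underlying skill.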

Then there are pressures to change test answers, most notably in the Atlanta testing scandal. During so-called “clean-up parties,” teachers and principals actually changed answers so they would be rewarded for high test scores.

Obama’s statement deliberately vague

Lakin said it’s likely that testing companies are really working over Obama’s statement, in which he says students are spending too much time in test preparation. He proposes capping testing at 2 percent of classroom time, which equates to about 21 hours a year in Alabama.

“We’re not talking here about subject matter classroom tests, but the federal policy that requires state testing,” Lakin said. “Obama’s comments are intentionally vague. I support his big idea – that we want to change the motivation to be testing all the time. What he said is a policy statement, not a guideline. Obama is suggesting that we ought to have a law capping testing time. Nothing is really in place in terms of rigid policy right now.”

Lakin said test specialists have developed tests that might actually show real achievement.

“Because of the Common Core Assessment Consortia, we now have better tests that take longer to complete,” she said. “These are tests that may be worth teaching to. For example, an open-ended writing test can take a long time. The Common Core has long reading passages and takes four times as long as old-school multiple choice tests. Better tests take longer. I am definitely in favor of reviewing testing practices, but the devil is in the details.”

Congress must continually reauthorize the Elementary and Secondary Education Act. The 2001 reauthorization was NCLB. Now it is Obama’s Race to the Top. The recent reauthorization, which returns most oversight to the states, is the Every Student Succeeds Act. It is likely, according to Lakin, that the future Secretary of Education was involved in developing Obama’s policy statement on fewer and better tests.

Conspiracy theories abound

“One of the big issues out there in terms of testing is that public schools must test all of their students, whereas certain charter schools might find ways to exclude English learners, for example, or students with disabilities,” Lakin said.

“The big conspiracy theory behind NCLB, of course, was that it was designed to discredit public schools and push charters and other forms of school privatization,” she added. “Everyone knows that you can never reach 100 percent proficiency, especially in poorer school districts. But in an exclusive school you can reach 100 percent. And sometimes schools in bad neighborhoods get to the top by expelling problem students. Policies like Every Student Succeeds make it easy to give public schools a black eye. And who can argue with the name?”

But Lakin likes certain ideas in these rigid structures.

“Honestly, as much as the old system had problems, I liked NCLB’s move towards consistency across states in terms of assessments, standards, and educational accountability,” she said. “Common Core was an even better step in this direction. The best school systems in the world mostly have a national system where all students are taught based on the same standards. Consistency across states ensures all students are held to high standards, it makes comparison of different school systems more straightforward, and it doesn’t disadvantage kids who move between districts.”

On Wednesday, December 2, 2015, the U.S. House of Representatives approved a final bill, the Every Student Succeeds Act, that reduces the federal footprint in education and replaces the controversial No Child Left Behind Act of 2001.

“I wish the new legislation kept some of the cross-state consistency while also fixing some of the excessive focus on proficiency testing in the old system,” she added. “It is good that the new policy includes protections for subgroups of students. Some people were really worried a new system would not require accountability for racial and linguistic minorities.”

RAISE/PREP bill raises new concerns in Alabama

Even with all the federal activity around evaluation, a new bill, drafts of which are circulating in the Alabama Senate, may dramatically affect the future of high-stakes testing in the state. The Rewarding Advancement in Instruction and Student Excellence (RAISE) Act, now renamed the Preparing and Rewarding Educational Professionals (PREP) Act, would, at its root, change tenure laws and the way Alabama school teachers are compensated. In terms of testing, a great deal of a teacher’s evaluation, perhaps up to 50 percent, would come from student performance on standardized tests.

A petition against the bill has been signed by over 5,000 people, mostly teachers, who say things like this:

“It is ridiculous to base pay raises on student performance. That would be like having our legislators have their pay raises based on how well our country is functioning at the present level. When is the attack on teachers in Alabama going to stop? Enough is enough! We are not the enemy!”

Also of concern to higher education in the state, the bill, sponsored by state Sen. Del Marsh, would cease to compensate teachers for acquiring graduate degrees.

The education community, which seems united in its opposition to the bill, is voicing many other concerns, but the heavy reliance on high-stakes testing is a big part of that opposition.

“This act seems to be modelled on NCLB—like NCLB for the state,” Lakin said. “Breaking down the tenure system is definitely a problem. Furthermore, it seems the raises based on performance will come from local funds. So that means that many poor districts may not be able to pay out these promised funds, meaning they will have more trouble recruiting the best teachers and they will not be able to incentivize strong performance because there’s no money for the rewards.”

But Lakin’s biggest concern remains the heavy reliance on student test scores.

“All of those issues about gaming and cheating we talk about here are greatly increased when teacher and administrator pay is tied to test scores. This was seen to be a major motivator for the Atlanta cheating scandal.”

The evaluations would be in the form of what is called a Value Added Model (VAM).

“I’ve done some work on student growth models for accountability and they are highly problematic,” Lakin said. “Depending on the test, we sometimes see that the most able students don’t show any growth year-to-year because they are already scoring highly on the test. A Florida teacher recently made very compelling testimony about this. These VAM scores are also found to be highly unreliable. Multiple years of data would be needed to make reasonably reliable decisions. There’s just a lot that doesn’t work about VAM models.”
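The ceiling effect Lakin describes, where the strongest students show little or no measured growth, can be illustrated with a small Python sketch. The 100-point maximum, the 10-point true gain, and the sample ability levels are all hypothetical values chosen for illustration; this is not a value-added model, only a picture of one reason growth scores can mislead.

```python
MAX_SCORE = 100  # hypothetical test ceiling

def observed_score(true_ability):
    """A test cannot report more than its maximum score."""
    return min(true_ability, MAX_SCORE)

def observed_growth(ability_before, true_gain):
    """Measured year-to-year growth after the ceiling is applied."""
    return observed_score(ability_before + true_gain) - observed_score(ability_before)

# Every student truly improves by 10 points,
# but measured growth shrinks as students approach the ceiling.
for ability in [60, 95, 105]:
    print(ability, observed_growth(ability, true_gain=10))
```

A teacher whose class starts near the top of the scale would look less effective on such a measure even though the students learned just as much.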