This study came out in November of 2006, but I haven’t heard much reference to it.
Teacher Incentives in Developing Countries by Karthik Muralidharan and Venkatesh Sundararaman
Many studies in educational statistics don’t do any legwork; they just pick some district X that implemented some program Y to see what happens. This doesn’t have the precision of a close-up study, and external forces are horribly difficult to isolate. (When a district implements an initiative, it’s usually one out of a set of plans; they’re trying to fix their district, not do science.)
This study really does go the full course: there are two sets of schools, one set of “control” schools and one set of “test” schools where the researchers implement a merit pay system, so that individual teachers get bonuses based on their students’ test scores.
There’s also a third group. I’ll get to that in a moment.
The study takes care to address concerns like “does offering merit pay help with broader educational questions, or does it just make for more drill for higher test scores?” The authors of the study asked the test writers to include both “mechanical” and “conceptual” questions (see page 18 of the paper):
(4th Grade Math)
Concrete question: 34 x 5 = ___
Conceptual question: Put the correct number in the empty box.
8 + 8 + 8 + 8 + 8 + 8 = 8 x ____
The authors of the study also weren’t content to just collect test scores and leave it at that; they had people observe classrooms and track categories: Calls Student by Name, Address Questions to Students, Active Blackboard Usage, Provided Homework Guidance, Teacher Was in Control of the Class, etc. Obviously these ratings are subjective calls, but including them is better than just praying that test scores give the whole picture.
All well and good, but the curious bit comes with the third group. See, even though the control group schools didn’t get any merit pay, they were still being observed regularly to track all the data above. So the researchers included a third “pure” control group: a larger set of schools, each of which was observed only once, on a random day. Check these out (from page 47 of the study):
The percentages meander up and down, but the really major difference was between the schools that were regularly observed and those that were observed only once, at random.
Note that with those two groups no merit pay was involved. This suggests that the act of being observed, or at least being part of a larger scientific study, somehow has a stronger effect than offering money.
I’m not implying “all classrooms should now be staked out with video cameras” or such, only that when a policy change affects classroom performance, the cause may be something other than what you’d expect.