My wife, one son-in-law, and sundry others have emailed me David Brooks’s column, “The Harlem Miracle,” about Harlem Children’s Zone schools that are said to have erased the usual black-white difference in academic test scores, asking what I think. Answer: It will be wonderful if the results are as good as they sound, but hold the champagne.
I’m not being mindlessly pessimistic. The problem is that we have had 40 years of “Miracle in X” stories (the early Head Start results, the Milwaukee Project, Perry Preschool, the Abecedarian Project, Marva Collins’s schools, and the Infant Health and Development Program, to name some of the most widely known), and the history is depressingly consistent: an initial research report gets ecstatic attention in the press, then a couple of years later it turns out that the miracle is, at best, a marginal success that comes nowhere close to the initial claims.
I haven’t seen the study by Roland Fryer and Will Dobbie that was the basis for Brooks’s column, but if I’m going to be such a grinch I might as well lay out the kinds of things I will be looking for (these are generic issues, not things that I necessarily think are problems with this particular study) when I get hold of a copy:
1. Selection factors among the students. Did the program deal with a representative sample? Was random assignment used?
2. Comparison group. Who’s in it? Are they comparable to the students in the experimental group?
3. Attrition. What about the students who started the program but dropped out? How many were there? How were they doing when they dropped out?
4. Teaching to the test. After seven years of No Child Left Behind, everybody knows about this one. Worse, some school officials have rigged attendance on test day or simply faked the scores; that has been happening too with high-stakes testing.
5. Cherry-picking. Do the reported test scores include all of the tests that the students took, or just the ones that make the program look good?
6. The tests. Do they meet ordinary standards for statistical reliability, predictive validity, and so on?
7. Fade-out. Large short-term test score improvements have, without exception to date, faded to modest ones within a few years.
Lots of people are going to be examining these results (searching on Brooks, Fryer, and Harlem just a few hours after Brooks’s article appeared brought up this, among others), so if they’re solid, we’ll know—but not for a few years. I left fade-out until last, but that’s the killer even if everything else is done right.
All this doesn’t mean that the Harlem Children’s Zone project isn’t terrific. I like the sound of its approach. But I’m tired of people proclaiming miracles on national television or in national media, then not bothering to say “never mind” when the miracle turns out to be a mirage.

