The Trouble With Numbers

I recently read a scholarly article by the principal of a public high school in New York, which in part addressed “data dysentery”, the countless reams of data we educators collect for, well, for what? The collection of data for the sake of the collection of data?

I mentioned this at lunch to a friend, himself a middle school principal in a large eastern district. He told me of a reading program that was implemented in his district. “Teacher-Proof Education”. The district spent millions on lessons that were entirely scripted. There was only one program. To the best of my friend’s knowledge, everyone, teachers and principals without exception, felt the program deleterious to the point of being counterproductive. So no one implemented it. Not one school. Except when there was a survey, run by outsiders, measuring the program. On those days, everyone implemented it. At other times, analysis depended on self-reporting which, for obvious reasons, bordered on the self-congratulatory.

The program was deemed a success. “People were even promoted,” my friend added with a laugh. He became a bit more somber when he mentioned how a social scientist, one he deeply respected, based part of an article on this data.

First, the disclaimer. I am not a Luddite. Nor am I a data hater. Statistical surveys, and other data instruments, can provide deeply meaningful analyses, analyses that can alter an entire society. “The Kinsey Reports” irrevocably changed America by simply informing the nation of our sexual practices.

Contemporary statistical analyses, however, mostly yield noise. Perhaps the greatest disservice is done to serious social scientists, whose work gets lost amid all the stats-trash.

There is, in a word, imbalance. It is time to honor those who do serious data collection by reminding ourselves of some of the fundamental limits of data analysis.

 Data tends to create more data. And this data, based upon that data, can often act like a platonic removal, a shadow based upon a shadow of the original thing.
 Data often obscures its own prejudices. As just one example, every statistical analysis starts with the presumption that the problem at hand is amenable to, and benefited by, data analysis.
 Data has trouble with innovation. Why? Because tomorrow’s innovation is measured by yesterday’s instrument.
 Data often has trouble with social context. Early I. Q. tests were designed for, and administered to, relatively well-educated whites. Blacks did terribly, and were deemed to be intellectually inferior.
 Many large-scale studies depend heavily on self-reporting. But how does one report accurately, for example, in a school system? In a rigidly hierarchical system, like a school system, it profits one almost nothing to report accurately, to the next higher level, that which is negative. There are times when, as the report moves up the hierarchy, each level acts as a platonic removal, until the final report, let’s say a report to the state, is just a shadow of the classroom upon which it reports. As anecdotal evidence, I offer the fact that, of the thousands of reading programs implemented in public schools, I’ve not read of one that reported itself a failure.
 Perhaps most importantly, our lives are subject to the ineffable, the intangible, the unconscious. How does one measure awe? Faith? Hope? Beauty? More importantly, why would anyone want to? Is it not sufficient to say that the opening of Mozart’s “Haffner Symphony” is brilliant in its simplicity?

The best studies do address the problems listed above. Contemporary I. Q. tests, for instance, regularly compensate for social context. On the other hand, many studies are simply stats-trash. Let us examine, for a moment, educational psychology. Almost all educational psychology these days is cognitive/behavioral. Such an orientation lends itself to vast amounts of data collection. There is much that can be learned from such measurement, much that is of great benefit. On the other hand, as any psychoanalyst will tell us, most of what motivates us toward any behavior is unconscious. That too can be known and measured. But such knowing, such measuring, runs the risk of obscuring understanding. It rationalizes the non-rational.

There are times when it feels like the sum of a person is behavior and data, that no one has a mind anymore.

Let me be absolutely clear. Many studies are invaluable in their contributions and robust in their data. I value their work so much that it pains me to read other papers based upon numbers that are just stats-trash.

Filed under: John Samuel Tieman, Prose