Jim Manzi, in a thought-provoking article in City Journal this past fall, tackles What Social Science Does and Doesn’t Know. Jim, a founder and chairman of an applied artificial intelligence software company, argues that randomized controlled experiments are both the gold standard for measuring effect and still largely insufficient for complex social science questions. Given all the claims of extraordinary differences and gains from programs, today’s But If Not is a cautionary tale, more nuanced than the previous Lies, Damned Lies, and Statistics.
Over many decades, social science has groped toward the goal of applying the experimental method to evaluate its theories for social improvement. Recent developments have made this much more practical, and the experimental revolution is finally reaching social science. The most fundamental lesson that emerges from such experimentation to date is that our scientific ignorance of the human condition remains profound. Despite confidently asserted empirical analysis, persuasive rhetoric, and claims to expertise, very few social-program interventions can be shown in controlled experiments to create real improvement in outcomes of interest.
I read this article back in September and kept it with an eye toward sharing it with other marketers I work with. Then I came across an Alex Tabarrok post titled The Ethics of Random Clinical Trials and I knew I had today’s But If Not.
New York City is denying aid to randomly selected applicants as part of an experiment to determine the effectiveness of a housing aid program. Someone, who deserves our thanks, has dared to try to answer the question of whether the millions spent make a difference. You can imagine how this is working out.
“They should immediately stop this experiment,” said the Manhattan borough president, Scott M. Stringer. “The city shouldn’t be making guinea pigs out of its most vulnerable.”
What if it turns out that those denied aid end up, through whatever means, providing for their housing needs without assistance? What if they don’t? You can see that this is a sticky wicket – these are real people in a bind, and we have no idea whether the money we give them makes a difference. It surely does in the absolute near term, but what if things would have worked out otherwise? Without an experiment, one can’t even measure the effect. Let’s let Jim explain.
Another way of putting the problem is that we have no reliable way to measure counterfactuals—that is, to know what would have happened had we not executed some policy—because so many other factors influence the outcome. . .
The missing ingredient is controlled experimentation, which is what allows science positively to settle certain kinds of debates.
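Jim’s point about counterfactuals can be made concrete with a toy simulation (my illustration, not from his article – every name and number here is made up): when people self-select into a program, comparing participants to non-participants measures who they were, not what the program did, while random assignment balances the hidden traits out.

```python
import random

random.seed(0)

TRUE_EFFECT = 2.0  # the program really does add 2 points to the outcome

def outcome(drive, treated):
    # The outcome depends mostly on an unobserved trait ("drive"),
    # plus the program's true effect for those who receive it.
    return drive + (TRUE_EFFECT if treated else 0.0)

people = [random.gauss(10, 3) for _ in range(100_000)]

# Self-selection: high-drive people sign up on their own. Comparing
# participants to non-participants then measures drive, not the program.
joined = [outcome(d, True) for d in people if d > 10]
stayed = [outcome(d, False) for d in people if d <= 10]
naive_estimate = sum(joined) / len(joined) - sum(stayed) / len(stayed)

# Randomized assignment: a coin flip decides who gets the program, so the
# hidden trait is (on average) identical in both groups.
treat, control = [], []
for d in people:
    if random.random() < 0.5:
        treat.append(outcome(d, True))
    else:
        control.append(outcome(d, False))
rct_estimate = sum(treat) / len(treat) - sum(control) / len(control)

print(f"naive estimate: {naive_estimate:.2f}")  # badly inflated
print(f"RCT estimate:   {rct_estimate:.2f}")    # close to the true 2.0
```

The naive comparison badly inflates the apparent effect; the coin flip is what lets us see the counterfactual.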
Science figured this out a long time ago. Galileo is said to have dropped two unequally weighted balls to show that, once air resistance is taken out of the picture, all objects fall at the same rate.
Now the trend toward scientific marketing is in full swing. Effective marketers look for statistical associations among their customers and run controlled experiments.
“Not running a control group is one of the things that you can lose your job for at Harrah’s” – Harrah’s CEO in a Stanford Business School case study
It turns out that conducting controlled experiments leads to increased profitability. And where there is profit, business is responsive. Companies are increasingly adopting statistical techniques and the scientific method to drive growth.
By 2000, Capital One was reportedly running more than 60,000 tests per year. And by 2009, it had gone from an idea in a conference room to a public corporation worth $35 billion.
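What might one of those thousands of tests look like? Here is a minimal sketch – a two-arm mailer experiment with made-up numbers, checked with a standard two-proportion z-test. Nothing below is Capital One’s actual method; it is just the generic shape of such a test.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF, via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical numbers: control mailer vs. a new offer, 5,000 pieces each.
z, p = two_proportion_ztest(conv_a=120, n_a=5000, conv_b=165, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these invented numbers the lift clears conventional significance; the discipline is in running the control group at all, not in the arithmetic.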
But a word of caution: many marketers trumpet their findings and peddle certainty well beyond the ambiguity found in their results – because the uncertainty has been painted over.
Run enough tests, and you can find predictive rules that are sufficiently nuanced to be of practical use in the very complex environment of real-world human decision-making.
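That line cuts both ways, and it connects to the caution above: run enough tests and some will clear p < 0.05 by chance alone, even when the treatment does nothing at all. A hypothetical simulation (mine, not from the article) makes the point:

```python
import math
import random

random.seed(1)

def z_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def simulate_arm(n, rate):
    # Number of conversions out of n visitors, each converting with `rate`.
    return sum(random.random() < rate for _ in range(n))

# 1,000 A/B tests in which both arms have the SAME 5% conversion rate,
# i.e. every "significant winner" is a false positive.
false_positives = 0
for _ in range(1000):
    a = simulate_arm(500, 0.05)
    b = simulate_arm(500, 0.05)
    if z_pvalue(a, 500, b, 500) < 0.05:
        false_positives += 1

print(f"'significant' results out of 1000 null tests: {false_positives}")
```

Roughly one in twenty null tests comes up “significant” – which is why a shop running tests at scale has to replicate before it trumpets.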
But social science does not have the same clear production function as most commercial questions, and it suffers from extremely high causal density (a ball hit by a bat has one cause; the outcome of a crime-rehabilitation program has countless interacting causes). From these limitations Jim draws three conclusions, all of which – even if you don’t read his article – are worth remembering.
First, few programs can be shown to work in properly randomized and replicated trials.
So guard against letting a program’s early claims anchor your beliefs: start skeptical of all studies.
Second, within this universe of programs that are far more likely to fail than succeed, programs that try to change people are even more likely to fail than those that try to change incentives.
So understand which is being advocated by the program in question.
And third, there is no magic. Those rare programs that do work usually lead to improvements that are quite modest, compared with the size of the problems they are meant to address or the dreams of advocates.
To my business friends – “modest gains” will get you a promotion. To my social science friends – “modest gains” may get your funding cut in favor of someone else’s newly recycled dream. But I urge you on.