Bias in the World Bank

Author

Jason Collins

Published

July 25, 2016

Last year’s World Development Report 2015: Mind, Society and Behaviour from the World Bank documents many of what seem to be successful behavioural interventions. Many of the interventions are quite interesting and build a case that a behavioural approach can add something to development economics.

The report also rightly received some praise for including a chapter which explored the biases of development professionals. World Bank staff were shown to subjectively interpret data differently depending on the frame, to suffer from the sunk cost bias and to have little idea about the opinions of the poor people they might help. Interestingly, in the brief discussion about what can be done to counteract these biases, there is little discussion about whether it might be better to simply not conduct certain projects.

On a more critical front, Andreas Ortmann sent me a copy of his review of the report that was published in the Journal of Economic Psychology. Ortmann has already put a lot of my reaction into words, so here is an excerpt (a longer excerpt is here):

What the Report does not do, unfortunately, is the kind of red teaming that it advocates as “one way to overcome the natural limitations on judgement among development professionals … In red teaming, an outside group has the role of challenging the plans, procedures, capabilities, and assumptions of an operational design, with the goal of taking the perspective of potential partners or adversaries. …” …

Overall, and notwithstanding the occasional claim of systematic reviewing (p. 155 fn 6), the sampling of the evidence seems often haphazard and partisan. Take as another example, in chapter 7, the discussion of reference points and daily income targeting that was started by Camerer, Babcock, Loewenstein, and Thaler (1997) and brought about studies such as Fehr and Goette (2007). These studies suggested that taxi drivers and bike messengers in high-income settings have target earnings or target hours and do not intertemporally maximize allocation of labor and leisure. The problem with the argument is that several follow-up studies (prominently, the studies by Farber (2005, 2008)) questioned the earlier results. Here no mention is made of these critical studies. Instead the authors argue that the failure to maximize intertemporally can also be found in low-income settings. They cite an unpublished working paper investigating bicycle taxi drivers in Kenya and another unpublished working paper citing fishermen in India. Tellingly, the authors (and the scores of commentators who gave them feedback) did not come across a paper, now forthcoming in Journal of Labor Economics, that has been circulating for a couple of years (see Stafford, in press) and that shows, and shows with an unusually rich data set for Florida lobster fishermen, that both participation decisions and hours spent on sea are consistent with a neoclassical model of labor supply. …

There are dozens of other examples of review of the literature that I find troublingly deficient on the basis of articles I know. … But, as mentioned and as I have illustrated with examples above, there is little red teaming on display here. Not that that is a particularly new development. Behavioural Economics, not just in my view, has since the beginning been oversold and much of that over-selling was done by ignoring the considerable controversies that have swirled around it for decades (Gigerenzer, 1996; Kahneman & Tversky, 1996 anyone? …).

The troubling omission of contrarian evidence and critical voices on display in the Report is deplorable because there are important insights that have come out of these debates and the emerging policy implications would be based on less shifty ground if these insights would be taken into account in systematic ways. If you make the case for costly and policy interventions that might affect literally billions of people, you ought to make sure that the evidence on which you base your policy implications is robust.

In sum, it seems to me that the resources that went into the Report would have been better spent had there been adversarial collaborations (Mellers, Hertwig, & Kahneman, 2001) and/or had reviews gone through a standard review process which hopefully would have forced some clear-cut and documented review criteria. A long list of people that gave feedback is not a good substitute for institutional quality control.