Models without data


Jason Collins


August 16, 2012

A new paper in PNAS suggests that the similarity between European and Neanderthal genomes is due to population structure in Africa (500,000-odd years ago), not recent interbreeding (50,000-odd years ago). It has been getting a decent bashing, much of it before it was even released. The problem is that the model underlying the theory does not match recent data: the evidence has overtaken the model since the idea behind it was first conceived.

John Hawks writes:

Paleoanthropology is a field where data are rare and precious, and we do a lot of arguing about the validity of models. …

Genomics is not such a field. We have abundant data today to compare with Neandertal genomes. Yet puzzlingly, the idea of Neandertal ancestry has been challenged by several papers that haven’t performed any new empirical comparisons at all. I’m struggling to figure this out. We have an unparalleled ability to explore the genomes of humans and Neandertals, and we should believe a computer model with no empirical data?

Modeling is a lot of work. We’re trying to avoid putting a lot of investment into modeling that will be easily refuted by the next piece of genomic data. Data are flowing now so rapidly that we can afford to be naive empiricists. …

David Reich dismissed the new paper by Eriksson and Manica as “obsolete”. I agree. The paper describes a model without carrying out any new empirical comparisons, and so has fallen behind where the science has gone.

If we set up a continuum between the rare data of paleoanthropology and the abundant data of genomics, economics is closer to the genomics end. Yet papers with models and no reference to the empirical evidence abound, even where the data are plentiful. I suspect that this is at least partly due to the culture in economics. As I wrote earlier this year, a beautiful model in economics is often appreciated even where it is in direct conflict with empirical evidence. And the pile of economic models that have been discarded because they are inconsistent with empirical observation is very small.