The ENCODE (Encyclopedia of DNA Elements) project is an international collaboration that intends “to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.”
The project has made a splash in the last couple of days with the publication of thirty open access papers across Nature, Genome Research and Genome Biology describing some of the results. Much of the blogosphere has been hosing down the declarations of the accompanying press releases, so don’t expect any revolutions to come out of this work just yet. Similarly, the ENCODE project is not about to spur the genoeconomics revolution (the use of molecular genetics in economics). However, the project is a reminder that there is some very cool work going on (at least for those of us not already in the loop).
One important consideration for genoeconomics is how the ENCODE project might affect genome-wide association studies (GWAS). When ENCODE outputs were compared with earlier GWAS results for disease, they lent support to those earlier findings. As described on the Nature News site:
Since 2005, genome-wide association studies (GWAS) have spat out thousands of points on the genome in which a single-letter difference, or variant, seems to be associated with disease risk. But almost 90% of these variants fall outside protein-coding genes, so researchers have little clue as to how they might cause or influence disease.
The map created by ENCODE reveals that many of the disease-linked regions include enhancers or other functional sequences. And cell type is important. Kellis’s group looked at some of the variants that are strongly associated with systemic lupus erythematosus, a disease in which the immune system attacks the body’s own tissues. The team noticed that the variants identified in GWAS tended to be in regulatory regions of the genome that were active in an immune-cell line, but not necessarily in other types of cell. Kellis’s postdoc Lucas Ward has created a web portal called HaploReg, which allows researchers to screen variants identified in GWAS against ENCODE data in a systematic way. “We are now, thanks to ENCODE, able to attack much more complex diseases,” Kellis says.
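To make the idea of screening GWAS variants against regulatory annotations a little more concrete, here is a minimal Python sketch of the underlying operation: checking whether a variant's position falls inside an annotated region, and noting the cell type in which that region is active. This is not how HaploReg is implemented, and it uses no ENCODE data. The region coordinates, variant names, cell-type labels and function names are all invented for illustration, and the lookup assumes regions on a chromosome do not overlap.

```python
from bisect import bisect_right

# Hypothetical regulatory regions: (chromosome, start, end, active cell type).
# Coordinates are made up for illustration; they are NOT taken from ENCODE.
REGIONS = [
    ("chr1", 1000, 2000, "immune"),
    ("chr1", 5000, 6000, "liver"),
    ("chr2", 300, 900, "immune"),
]

# Hypothetical GWAS hits: (variant id, chromosome, position). Also made up.
VARIANTS = [
    ("var_a", "chr1", 1500),
    ("var_b", "chr1", 3000),
    ("var_c", "chr2", 900),
]


def build_index(regions):
    """Group regions by chromosome, sorted by start position.

    Assumes regions on the same chromosome do not overlap.
    """
    index = {}
    for chrom, start, end, cell_type in regions:
        index.setdefault(chrom, []).append((start, end, cell_type))
    for intervals in index.values():
        intervals.sort()
    return index


def screen(variants, regions):
    """Map each variant id to the cell type of the region containing it, or None.

    Intervals are treated as closed: start <= position <= end.
    """
    index = build_index(regions)
    result = {}
    for variant_id, chrom, pos in variants:
        intervals = index.get(chrom, [])
        starts = [start for start, _, _ in intervals]
        # Find the last region whose start is at or before the variant position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and intervals[i][0] <= pos <= intervals[i][1]:
            result[variant_id] = intervals[i][2]
        else:
            result[variant_id] = None
    return result


if __name__ == "__main__":
    for variant_id, cell_type in screen(VARIANTS, REGIONS).items():
        print(variant_id, cell_type)
```

With real data the interesting step comes afterwards: asking whether hits for a given disease cluster in regions active in one cell type more often than chance would suggest, which is the pattern described for lupus and the immune-cell line.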
The problem for the genoeconomics enterprise is that the existing GWAS on economic traits are often of questionable value. Any results that are not spurious are of such small effect that biochemical analysis is of little use. Further, tracing the path from genetic activity to outcomes such as time or risk preference is a much more difficult proposition than examining disease pathways.
So, for the moment, the genoeconomics enterprise is probably best left examining twin studies, GREML analysis or other techniques that don’t need a particular gene and trait to be nailed down. That said, despite being a long way from being able to control for genetic effects by examining someone’s genome, we are not short of information that we can use.
The more interesting part of the events of the last couple of days, as has been noted in many blogs, is the publication model adopted for this release of the ENCODE results. While not without problems (Daniel MacArthur’s mixed reaction is one example worth reading), the information available and the way it is presented are quite cool and hopefully another step towards more open access to data in the field. You can download an iPad app that contains the thirty open access papers, plus an interesting feature called “threads”, which allows exploration of issues across the papers. Much of it is heavy going for someone not in the field, and it helps to lean on the blogosphere to interpret the information, but there are worse ways to get up to speed with what is happening.