Monday, November 24, 2008

The Charles Darwin Genome Project

Now I am probably putting two and two together and making five, but having read in the last few days that:
  1. the mammoth genome has been sequenced using mammoth hair as a starting point
  2. Charles Darwin's beard hair has gone on display at the Darwin's Big Idea exhibition
I should like to launch the Charles Darwin Genome Project, by suggesting that DNA extracted from the hair be used to sequence the great man's genome. In the era of personal genomics, with four individual genomes already completed (including Jim Watson's and Craig Venter's), what better way to mark the bicentenary than getting started on this project! But beyond that, it might also shed light on any genetic predispositions that contributed to Darwin's long-standing ill-health (for example, genotypes associated with inflammatory bowel disease, which has been suggested as one cause of his illness).

Sequencing Darwin will not be quite as straightforward as sequencing Craig Venter or Jim Watson, but recent successes with the mammoth and with the Neanderthal genome provide grounds for optimism. I propose a three-pronged attack on the problem:

1. Studies on Darwin's hair. My understanding is that amplifying DNA from human hair, even recently shed hair, is considered difficult and the dogma was that this would work only on mitochondrial DNA. In fact, even getting mitochondrial DNA from Darwin's hair might be tricky--a recent attempt to obtain a mitochondrial hypervariable sequence from Isaac Newton's hair delivered several different sequences, so that none could be assigned with confidence to "the greatest Englishman before Darwin". But the mammoth genome sequence forces one to re-evaluate this assumption, and nothing ventured, nothing gained. Once the exhibition is over, I suggest that Darwin's hair be turned over to Svante Pääbo to see what he can do with it.

2. Studies on Darwin's descendants. There are now hundreds of Darwin descendants and between them they provide two easy targets: Darwin's Y chromosome and Darwin's mitochondrial genome. Darwin's Y chromosome (give or take a few mutations) is alive and well in his male-line descendants, so obtaining that sequence should be technically straightforward, in that one just needs to sequence the genome of one or more male Darwins. However, there is an ethical problem in that such a genome sequence might reveal predispositions to disease that the owner does not want to know about. Although this could perhaps be circumvented by masking the most obvious disease-associated loci (as Jim Watson did), it probably prevents this becoming an immediate reality, until a better framework for dealing with personal genome data is in place. However, using information from a male-line descendant would provide a means of assigning Charles Darwin to a relevant Y chromosome haplotype. I know of at least one Darwin descendant who might be willing to provide a sample for such an investigation. Watch this space!

Similar arguments could be made for determining the CD mitochondrial genome. Here one would need a matrilineal descendant of one of Darwin's matrilineal ancestors (mother, grandmother etc). From a brief look at the Darwin-Wedgwood family tree, I got quite excited that one such lineage led to the Vaughan Williams family (i.e. sisters of the composer Ralph), but it appears that the line died out in the twentieth century. If anyone can help find suitable descendants, please let us know!

A more ambitious project would be to sequence the genomes of all living Darwin descendants and attempt to reconstruct the entire CD nuclear genome. This would be quite an undertaking and perhaps not worth thinking about for a few more years. However, the technology for sequencing genomes is progressing very quickly and the 1000 Genomes Project is already underway. A study of all descendants of CD would provide an interesting contrast to current studies, which tend to focus on human diversity rather than an in-depth analysis of one family.

3. Studies on Darwin's remains. The most obvious source of Darwin's DNA is his bones, which lie under the floor of Westminster Abbey. Given that the Neanderthal genome sequencing project is progressing well using material from long bones, which in life house the DNA-rich bone marrow, an obvious place to start would be with Darwin's femurs. The current sense of propriety is likely to forestall any immediate moves in this direction. But given this recent headline about Copernicus, in which his >400-year-old bones have been subjected to DNA analysis, it is only a matter of time before Darwin's DNA becomes a substrate for genome-sequencing. I confidently predict that by Darwin's 500th birthday, in 2309, we will have a complete genome sequence of the great man! And anyone reading this in the twenty-fourth century, take note that the Charles Darwin genome project started here!


Karen James said...

*ahem* call me *ahem*

troubleDevolution said...

Great idea! I vote for sequencing the genomes of Darwin's descendants, since 1) that would be a fitting tribute to his work; 2) sequencing his genome would just be too simple by today's standards (given the DNA, that is), whereas you can make a case (read "funding") for studying disease inheritance (call it a pilot study); 3) when his remains are unearthed a few decades or centuries from now, we can see how close the reconstruction was.


James@cancer said...

This post got me thinking. The thousand genomes project is underway but perhaps could be just as successful and loads more interesting if we dug up loads of dead scientists and sequenced them instead of 1000 anonymised individuals. Just imagine: Darwin alongside Einstein, Bohr, Turing, Curie, Hawking (not dead). Seeing how the mammoth has been done, we might even consider pushing technologies and try to get Copernicus, Galileo, da Vinci on the list?

cpurrin1 said...

Any idea how many Darwin descendants are alive on earth today? Ballpark estimate???

Mark Pallen said...

We are now around generation number 5-6. If average number of reproducing offspring per generation per person is 2, that means 32 in generation 5 and 64 in generation 6, so I guess around a hundred. If average per generation is 3, then numbers will be 243 and 729. So I guess several hundred!
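For anyone who wants to play with the numbers, here is a rough sketch of that back-of-envelope sum. It assumes (my simplification, not a genealogical fact) that only generations 5 and 6 are alive, that every person has the same number of reproducing offspring, and that no cousins married each other. The function names are made up for illustration.

```python
def descendants_in_generation(offspring_per_person: int, generation: int) -> int:
    """Descendants in one generation, assuming a constant number of
    reproducing offspring per person and no intermarriage."""
    return offspring_per_person ** generation


def living_estimate(offspring_per_person: int, generations=(5, 6)) -> int:
    """Crude count of living descendants: sum over the generations
    assumed to be alive today (5 and 6 by default)."""
    return sum(
        descendants_in_generation(offspring_per_person, g) for g in generations
    )


if __name__ == "__main__":
    for k in (2, 3):
        per_generation = [descendants_in_generation(k, g) for g in (5, 6)]
        print(f"{k} offspring each: {per_generation}, total {living_estimate(k)}")
```

With 2 offspring each this gives 32 and 64 (96 in all); with 3 it gives 243 and 729 (972 in all). The real figure will depend on how uneven family sizes were and how many earlier generations are still living.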

The Wikipedia article here goes a small way towards answering the question, but is far from exhaustive. Some more details can be found if one looks up the Wikipedia article for each person listed therein, but I don't have the time to go through them all!

But this book is just out and I suspect Imelda does list them all. I have a copy on order for Christmas!