More Neanderthal Than You

A few months ago, I spotted that 23andme were having a sale. 23andme sell personal genomics services – you pay them your cash, and they post you a sample collection kit. You spit in it, send it back to California, and a month or two later they email you to say your results are available for viewing.

What exactly do they do? They extract your DNA, put it on a genotyping chip (a customized Illumina BeadChip) and report the bases at roughly 1 million positions in your DNA which are known to vary in the human population and do something interesting[1]. (The variable positions are called Single Nucleotide Polymorphisms, or SNPs.) The human genome has something like 3 billion bases, so it’s not exactly your whole genome on the chip[2], but it is a sufficient number of the variable positions to tell you a whole host of things about your ancestry, your health, and various other genetic traits.

Why did I get my genotype done? Because I’m a geneticist, and having my genotype is cool, and a lot of the results are entirely frivolous yet cool. It’s fun to know that I am 2.9% Neanderthal, which is probably a bit more Neanderthal than you (2.9% puts me in the 89th percentile), but it’s pretty meaningless otherwise. Similarly, the traits page tells me I am likely to have brown eyes and brown hair with a slight curl, which is also cool but not exactly something I needed to post my spit 3,000 miles to find out about. Other traits are a little more useful – my genotype is AA for the SNP rs601338[3], which means I have the version of the FUT2 gene which protects against norovirus infections.
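If you download your raw data, checking a single SNP such as rs601338 takes only a few lines. The sketch below assumes the export is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype); that layout is an assumption here, so check the header of your own file. The sample rows, including the positions and the second rsid, are made up for illustration.

```python
import io


def load_genotypes(handle):
    """Parse a tab-separated genotype export into a {rsid: genotype} dict.

    Assumed layout (check your own file): '#' comment lines, then rows of
    rsid, chromosome, position, genotype separated by tabs.
    """
    genotypes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip rows that do not match the assumed layout
        rsid, _chromosome, _position, genotype = fields
        genotypes[rsid] = genotype
    return genotypes


# Made-up sample standing in for a real export; positions are placeholders.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs601338\t19\t1000\tAA\n"
    "rs0000001\t1\t2000\tCT\n"
)

genotypes = load_genotypes(io.StringIO(SAMPLE))
print(genotypes.get("rs601338", "not on the chip"))
```

For a real file, replace the `io.StringIO(SAMPLE)` argument with an open file handle, for example `open("my_genome.txt")` (a hypothetical filename).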

Heading up the scale of usefulness, 23andme check if you’re a carrier for a set of common mutations, including ones which cause nasty diseases in homozygous individuals but go unnoticed in heterozygous carriers. I don’t have any of them, which is a nice relief (or rather, in the case of the recessive ones, a nice relief for my hypothetical offspring), but it doesn’t mean that I don’t carry some other mutation that 23andme don’t test for.

The key part, and what I would guess a lot of customers are most interested in, is the section on disease risks. 23andme only report hard numbers for your disease risk where they consider there is established research; everything else is considered preliminary research, and you just get elevated, decreased, or typical risk. I think that’s a pretty sensible move – for instance, I have an established research finding of decreased risk, which is contradicted by some preliminary research claiming elevated risk, but the decreased risk finding is the reliable one at the moment. And they give absolute as well as relative risk, which is essential when telling someone they have an elevated risk of an extremely rare disease. There are a few pages you have to specifically opt in to see, so you are doubly certain you want to see your Alzheimer’s or breast cancer risks before you see them, as the effects for some of the known mutations for those diseases are pretty large.
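To see why absolute risk matters alongside relative risk, here is a small worked sketch. The baseline risks and relative risks are invented for illustration and are not 23andme figures, and multiplying a population baseline by a relative risk is a simplification of how a real report would be calculated.

```python
def absolute_risk(baseline_risk, relative_risk):
    """Scale a baseline lifetime risk by a relative risk.

    A simplification: it treats the population average as the reference
    group, which real risk models do not necessarily do.
    """
    return baseline_risk * relative_risk


# Hypothetical numbers: (label, baseline lifetime risk, relative risk)
examples = [
    ("rare disease", 0.0001, 2.0),   # doubled risk, still tiny in absolute terms
    ("common disease", 0.20, 1.2),   # modest relative increase, large absolute change
]

for name, baseline, rr in examples:
    risk = absolute_risk(baseline, rr)
    print(f"{name}: baseline {baseline:.2%}, relative risk {rr}, absolute {risk:.2%}")
```

A doubled risk sounds alarming, but for the rare disease it moves you from one in ten thousand to two in ten thousand, while the unremarkable-sounding 1.2 for the common disease adds four percentage points.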

Much like my carrier status, my disease risk is (happily) quite boring, and contains more decreased risks than elevated ones (it is even less likely to be lupus than for most people). But now I have 1 million positions from my DNA, safely downloaded on my hard drive, and I can’t predict what might be lurking in there to be found in the future. A lot of the strong associations may have been found, but there are all sorts of complex patterns of genetic inheritance still to be found, and I know I won’t be able to resist finding out what type I have (I clicked on the hidden disease risk pages straight away). On the other hand, what I have found doesn’t worry me that much – the risk is just that, in the end, and even an increased risk doesn’t guarantee that I’ll get a particular disease (and conversely, nor do my protective alleles guarantee I won’t). And I give it five to ten years before we start routinely sequencing cancer patients, and not many more years after that before we start sequencing for all sorts of routine reasons, probably at birth, so whatever is hidden in my genome won’t stay hidden for long.

[1] Technically, I believe it is sometimes a SNP in complete linkage disequilibrium with the SNP that is interesting.
[2] Whole-genome sequencing is still pretty expensive, but 23andme are starting to offer the whole exome (all the bits of the genome which code for proteins) for a mere $999. The catch is that you get your raw exome data – it’s not even clear to me whether it’s variant calls, or just the raw sequences – so good luck doing anything sensible with that unless you really know what you’re doing.
[3] rs numbers are the standardised dbSNP IDs, which you can look up on SNPedia or paste into Google Scholar to find papers which reference them.