Finding the Best DNA Test: Should I Genotype or Sequence?
If you’ve arrived here, you’re probably somewhat aware of the various consumer genotyping kits available for people wanting to check their ancestry and, more recently (at least in an official sense), their health. By the end of this year, estimates predict that 100 million consumers will have taken some form of home DNA test. With the surge in interest, more genetic testing providers have popped up to meet demand. And consumers are the ones who benefit, especially as the FDA has eased some restrictions on health reporting. For example, 23andMe can now tell its customers whether they have a version of the APOE gene that is associated with an increased risk for Alzheimer’s disease.
23andMe is by far the most well known of these companies, as they were the first to identify the niche in the market for consumer-focused genomics, all the way back in November 2007. While the technology wasn’t especially cutting edge, the way the data was packaged and presented was groundbreaking, with Time Magazine accordingly naming their service the “Invention of the Year” in 2008. However, even though over 2,000,000 people have been genotyped by 23andMe (and many more by their competitors), there is still a lot of misunderstanding about what exactly these services offer, and how individuals can use this data.
Before diving in, it’s worth looking at our genetics 101 primer so you’re familiar with some of the terminology that follows. Note: at Gene Food, we use the raw data from genetic testing services to craft custom nutrition plans for our clients.
- Genotyping vs Whole Genome Sequencing
- Which genotyping service is best?
- So what about sequencing?
- My opinion
Genotyping vs Whole Genome Sequencing
The first thing to understand is the difference between genotyping and sequencing. The major players you’ve likely heard of, from Ancestry.com to 23andMe, are all genotyping companies. As we explore below, they report on a tiny fraction of your overall DNA. Below, I list the three basic ways you can have your DNA tested.
Whole genome – all your DNA
Whole Genome Sequencing (“WGS”) is exactly that. Every single base from all of your chromosomes is determined. Interestingly, only about 2% of your genome is “coding DNA”, that is, DNA which codes for the proteins which make up all our cells and allow us to function as the unique humans we are. The rest, known as “non-coding DNA,” was long thought of as junk DNA, but as we understand more about our genetics, we now know these regions play a hugely important role in regulating the coding portions of our DNA. Our understanding of these regions and their interactions is relatively poor compared to our knowledge of the coding regions. To use a highway analogy, WGS represents every inch of the highway, including stretches of scrubland along the side of the road, far away from any exits or towns.
Exome Sequencing – just coding DNA
Whole Exome Sequencing (WES) sequences only the regions of DNA which code for proteins. This accounts for approximately 2% of the whole genome. On the DNA highway, WES represents every known exit and town, even ones that are not thought of as important, or are as yet still unknown.
Genotyping – curated DNA
Then onto genotyping. Here, information is captured for a carefully selected set of bases. These bases are usually selected because they represent locations where particular sequences are associated with characteristics. But there are huge regions of the exome, let alone the genome, which aren’t covered. Genotyping picks up just the important stops on the highway that have activity.
Figure: Approximate regions covered by whole genome sequencing (WGS), whole exome sequencing (WES) and consumer genotyping.
23andMe, Ancestry.com and most other services are all genotyping companies. They look at select points of interest in your DNA and check for single-base variations (polymorphisms), each known as a Single Nucleotide Polymorphism (SNP). But as you can see, the amount of the genome they cover is tiny, approximately 0.03%!
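To put the three approaches side by side, here is a rough back-of-the-envelope sketch of the coverage arithmetic. It assumes a human genome of about 3 billion bases and a genotyping chip that reads roughly 700,000 single-base positions; both numbers, and the helper name `coverage_percent`, are illustrative assumptions rather than exact figures for any particular test.

```python
# Back-of-the-envelope coverage figures; every constant here is an approximation.
GENOME_BASES = 3_000_000_000  # assumption: ~3 billion bases in a human genome
CODING_FRACTION = 0.02        # ~2% of the genome is coding DNA
SNPS_GENOTYPED = 700_000      # assumption: a chip reading ~700,000 single-base positions


def coverage_percent(bases_read: int, genome_bases: int = GENOME_BASES) -> float:
    """Percentage of the genome covered by a test that reads `bases_read` bases."""
    return 100.0 * bases_read / genome_bases


# Size of the exome implied by the 2% figure, rounded to a whole number of bases.
exome_bases = round(GENOME_BASES * CODING_FRACTION)

for name, bases in [
    ("Whole genome (WGS)", GENOME_BASES),
    ("Whole exome (WES)", exome_bases),
    ("Genotyping chip", SNPS_GENOTYPED),
]:
    print(f"{name:<20} {bases:>13,} bases  {coverage_percent(bases):.3f}%")
```

With these inputs the chip works out to a little over 0.02% of the genome, the same ballpark as the figure quoted above; the exact percentage depends on how many SNPs a given kit reads.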
Which genotyping service is best?
How long is a piece of string? A bit of a blasé answer, but there really isn’t much to separate the major players, a fact which is summed up really nicely by this table (which I also summarise below). There will be some differences in how the data is presented to you within their sites, and in their particular focus, i.e. Ancestry.com may be more relevant for those trying to trace relatives and construct a family tree, whereas 23andMe provides more “health” based information.
Ancestry.com – best for family tree searches
As the name suggests, Ancestry.com beats 23andMe for those trying to construct a family tree or find a distant relative. The company boasts a database of over 5 million people, from which users can “connect dots” with those who may be related. To be clear, this is an interface win rather than a data advantage because, as we will see, both services are very comparable.
23andMe – best for health research
23andMe wins out on how it presents health data to customers (although it’s important to point out that the Health plus Ancestry version of 23andMe is more expensive). For example, 23andMe users can access a “Genetic health risk” section which outlines the genetic risk for conditions like late-onset Alzheimer’s and age-related macular degeneration. The company has even included a new report providing risk factors for celiac disease.
True value of genotype data is the raw data file
But the true power of your genotype data (at least for those of us with a health focus) comes with the raw data file, which can be analyzed by third-party providers (Genetic Genie, Promethease, Livewello, and now Gene Food with the launch of our custom nutrition plan) with a much clearer health focus. For example, if you’re curious about your MTHFR status, you will need to use the raw data to find out whether a mutation in that gene is present. As both 23andMe and Ancestry.com provide access to this raw data, there is very little difference between the two. Even the number of SNPs analyzed is remarkably similar, with the most recent 23andMe kit analyzing ~670,000 SNPs and Ancestry.com covering ~700,000 SNPs.
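As a minimal sketch of what "using the raw data" can look like, the snippet below searches a raw data file for one SNP by its rsID. It assumes the tab-separated layout commonly seen in consumer exports (comment lines starting with `#`, then rsid, chromosome, position and genotype columns, with some providers splitting the genotype into two allele columns); your file may differ. The function name `find_snp` and the sample rows are hypothetical, and rs1801133 is used only as an example of a widely discussed MTHFR variant.

```python
import csv
import io
from typing import Iterable, Optional


def find_snp(lines: Iterable[str], rsid: str) -> Optional[dict]:
    """Return the first row matching `rsid`, or None if the SNP is not in the file."""
    # Skip blank lines and '#' comment lines before handing rows to the CSV parser.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4 or row[0] == "rsid":  # malformed row or column-header row
            continue
        if row[0] == rsid:
            return {
                "rsid": row[0],
                "chromosome": row[1],
                "position": int(row[2]),
                # Join remaining columns so a two-column allele layout also works.
                "genotype": "".join(row[3:]),
            }
    return None


# Made-up sample rows for illustration; not real results.
SAMPLE = """# hypothetical raw data export
# rsid\tchromosome\tposition\tgenotype
rs4477212\t1\t82154\tAA
rs1801133\t1\t11856378\tAG
"""

result = find_snp(io.StringIO(SAMPLE), "rs1801133")
print(result)

# With a real export (file name is an assumption):
# with open("genome.txt") as fh:
#     print(find_snp(fh, "rs1801133"))
```

Finding the genotype is only the first step; what a given genotype means still has to come from a report or reference source you trust.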
| | 23andMe | Ancestry.com |
|---|---|---|
| Major Focus | Physical and behavioral traits, and some information about genetic disorders. | Geographic interpretation of genome with advanced ancestry matching. |
| Health Information | Some, limited. | n/a |
| Ancestry Information | Paternal and maternal information. Geographic analysis. Some simple genealogy tools. | Geographic analysis. Advanced genealogy and DNA matching tools for family tree building and ancestry matching. |
| Estimated Turnaround | 3-4 weeks | 4 weeks |
| Access to Raw Data | Yes | Yes |
Great article! Will you be updating this year? Any info regarding cancer testing?
Nice article, thanks. FYI, your graphic comparing the different types of tests has the incorrect number (off by an order of magnitude) for WGS total base pairs. It should be 3 billion, not 300 million.
Genome It All offers genetic testing that provides 700,000 SNPs in their raw data. The biggest benefit is that it comes with a VERY detailed custom report for each person on over 15,000 SNPs. It breaks down the results to make them easy to understand. It even explains how certain findings are defined, so that you don’t have to constantly look up what something means.
Nice article and genetics 101 primer. Love the DNA single diagram and would love to reuse.
Biggest clarification I would make is adding MyHeritage (based on FTDNA/Gene by Gene lab) and LivingDNA (reform of services available previously in the UK). Also ySeq for WGS services (started by CTO/CSO couple for FTDNA who left 4+ yrs ago and returned to Germany). Also key: 23andMe and LivingDNA use the new xxx machine from Illumina, which has all new SNPs and hg reference model. The results are very difficult to compare with others without significant work to impute missing values from other tests / companies / Illumina genotyping machines. LivingDNA just announced they are going to switch to the old generation machines to be more compatible with the market. (They have given up on making a match database.) MyHeritage just made available the tool to impute and compare results from both sources after considerable effort. They are the first and only to allow the apples to oranges comparison. (GEDMatch is still working on this in their Genesis “product”.) Overall, there is actually little overlap in the SNPs tested between many of these companies. ISOGG has just started looking into this and reporting it in their wiki. But tool / software / bioinformatic people have known this for 10+ years. See y-str.org, where an ad-hoc tool researcher put out a Venn diagram of the SNP lack of overlap back in 2012! Something you all must be aware of if looking for specific SNPs for your reporting from the various sources.
One small point (for you, major for others) is that there is variance in the WGS products. In reality, they do not do full sequencing. So the read length of the snippets that are sequenced and the depth of read coverage of the chromosome are important metrics. DanteLabs and FGC basic products have minimal coverage (big gaps still). Y haplogroup projects have lots of info to compare from looking only at the Y chromosome.
Good article. But, I think your comparison of 23andMe, Ancestry, and FamilyTreeDNA (FTDNA) ignores the genealogic perspective. I agree with you that they are good. But, for genetic genealogy, FTDNA is the winner.
FTDNA offers deep Y-DNA and mtDNA testing. FTDNA has the largest Y-DNA haplotree on the planet. And the largest mtDNA haplotree using Full mtDNA Sequencing. FTDNA has thousands of project groups for sharing and comparing with group administrators to help group members.
Thank you for a very informative and well illustrated article. You clarified the field of genetic testing for me.
There seems to be a significant error in your math. You list WGS as 300M bases and WES as 60M bases, identifying WGS as 100% of the genome and WES as 2% of the genome. However, 60M is 20% of 300M. Can you please correct whatever is wrong in these? Thanks,
Hi Dr. Gardner,
I was able to download my raw data for no extra charge actually through my genographic 2.0 account. I found this odd. Helix was the lab. I have all of this information though, and there appears to be nothing I can do with it. Any suggestions?
Sorry for the slow reply; I missed the notification that you’d posted. You should be able to use the raw data in the same way you would use the data from 23andMe or similar. I know that https://promethease.com can deal with the data at a cost. It is also possible for you to manually interpret your data; you just need to find the SNP that you’re interested in amongst all the others.
Do you know what sort of file format you have? I can point you in the direction of useful software then.
Just FYI Helix does allow you to download your raw data for $499.
Thanks Amanda, at the time this didn’t seem to be an option. Have updated the post accordingly.