Are you being ripped off? Correlation vs. Causation in Genetic Tests

Updated August 18 with information about Microbiome testing.

If you are new to the genetic testing field, you should first have a solid understanding of correlation versus causation. The distinction between these scientific terms will help you get the most out of your genetic tests and truly understand how your genes affect your health and lifestyle. While the two terms are related, the difference between correlation and causation is similar to the difference between an educated guess and knowing the answer outright.

Knowing whether a genetic report is based on correlation or causation makes a big difference in how seriously you should take the results (and in how good the test kit really is).

First, let’s get an understanding of what correlation and causation actually are.

Correlation

A correlation is a relationship between two items. These relationships can be simple. For example, if the number of bees increases so will the number of fruits produced. This statement is one of correlation because it relates one number to another number, without saying exactly why the two entities are related.

Typically, correlations are found through statistics alone. Correlations give scientists a chance to understand what factors may be influencing the outcome of a given situation. In the case of genetics, many genes are correlated with specific outcomes. A certain gene variant may be related to an increase in heart disease. This is a correlation, but it tells you nothing about why the gene variant may increase heart disease. Further, many correlations are spurious: the two things move together without one actually causing the other.

Causation

Causation, on the other hand, is much more powerful and insightful. If scientists are able to prove causation between two events, it means that they can directly show how one trait or event led to the other. With the bees and fruit example, causation would be showing that an increase in bees leads to an increase in pollination events, which ultimately increases the amount of fruit produced by a tree.

Finding the causation of an event or outcome is much harder. Not only must scientists prove that two events are related, but they must also show exactly how the first event leads to the second. With genetic testing, causation is fully understood in only a small number of traits, such as the ability to taste bitterness in certain vegetables.

This bitterness is detected by a single receptor protein within the taste buds. If you carry a working variant of the gene for this protein, it will be expressed on your tongue and you will be able to taste when these bitter compounds enter your mouth. Without a working variant, you cannot taste the bitter substances. This shows causation because it not only relates the gene variant with the trait but describes exactly how the presence of the variant results in you being able to taste bitterness in Brussels sprouts.

Correlation Vs. Causation

Causation is a much more powerful tool for scientists, compared to correlation. Correlation only shows that two things are linked. Causation goes a step further and explains why things are linked, and how one thing causes another.  

The problem with using only correlation is that correlations can be misleading. Correlations can be found among most things. For instance, there is a correlation between the divorce rate in Maine and the per capita consumption of margarine in the United States. Are these things really related? Take a look at the graph below:

http://www.tylervigen.com/correlation_project/correlation_images/divorce-rate-in-maine_per-capita-consumption-of-margarine-us.png

 

The graph clearly shows that the two events are correlated. Maybe they are actually related, on a very deep level. Or maybe they are not related at all. It certainly seems like they could be affecting each other. But the problem with correlation is that it doesn’t explain how these things are related. Is the consumption of margarine driving divorce rates in Maine down? Or is the divorce rate somehow influencing margarine consumption? Without understanding the causation, we will never truly know if these things are directly related.
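To see how easily two unrelated measurements can line up, here is a minimal Python sketch. The two series are invented for illustration (they are not the actual Maine divorce or margarine figures from the graph), and `pearson_r` is a helper written for this example. Any two quantities that happen to drift in the same direction over the same years will score close to 1, whether or not one has anything to do with the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two series move together around their means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each series moves on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented yearly values: both simply decline over the same eight years.
series_a = [5.0, 4.8, 4.7, 4.5, 4.3, 4.2, 4.1, 4.0]
series_b = [8.2, 7.6, 7.0, 6.5, 5.8, 5.2, 4.9, 4.6]

r = pearson_r(series_a, series_b)
print(f"correlation: {r:.3f}")  # very close to 1, yet nothing links the two series
```

The number alone cannot tell you whether a mechanism exists. That takes a separate explanation of how one thing leads to the other.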

But, how do correlation and causation relate to genetic testing?

Correlation, Causation, and Genetic Testing

When it comes to genetic testing, you have to be very careful about understanding whether the report you get is detailing correlation or causation. Most genetic tests, unfortunately, only look at correlation. Let’s look at a few different examples of genetic test results.

Single Gene Traits

Single gene traits are almost always understood down to the level of causation. Not only is there a link between variants of these genes and the physical outcome, but scientists have described in detail how those variants lead to the physical outcome.

A good example of causation within single gene traits is your blood type. Your blood type is either A, B, AB, or O, depending on which genetic variants you inherited from your parents. These variants determine which antigens sit on the surface of your red blood cells, and your immune system in turn produces antibodies against the antigens you do not have. Together, these antigens and antibodies let your body tell its own blood cells apart from foreign ones. Check out the chart below:

https://upload.wikimedia.org/wikipedia/commons/3/32/ABO_blood_type.svg

 

If you carry the genetic variant for type A antigens, you have type A blood. Carriers of only the B antigen gene have type B blood. Some people carry variants of both A and B antigens, giving them type AB blood. A fourth blood type, O, is created when a person carries no genetic variants to create either A or B antigens. Thus, blood type is entirely determined by a person’s genetics and we have the molecular understanding to prove exactly how this process happens.
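The rule described above is simple enough to write down as a lookup. The sketch below is a simplified model, not a clinical tool: it assumes each person carries two copies of the gene, each being an `A`, `B`, or `O` variant, and it ignores the Rh factor and rare subtypes. The function name `blood_type` is our own.

```python
def blood_type(variant_1, variant_2):
    """Return the ABO blood type for two inherited variants ('A', 'B' or 'O')."""
    variants = {variant_1, variant_2}
    if not variants <= {"A", "B", "O"}:
        raise ValueError("each variant must be 'A', 'B' or 'O'")
    if variants == {"A", "B"}:
        return "AB"  # both antigens are produced
    if "A" in variants:
        return "A"   # AA or AO: only the A antigen is produced
    if "B" in variants:
        return "B"   # BB or BO: only the B antigen is produced
    return "O"       # OO: neither antigen is produced


for pair in [("A", "A"), ("A", "O"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print(pair, "->", blood_type(*pair))
```

Because the link from variant to antigen is understood, the same pair of variants gives the same answer every time. That is what a causal explanation buys you.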

Single gene traits are the most likely to show causation between the gene and the trait because they are the simplest form of a trait to study. If only one gene is affecting a trait, it is easy to study the genetic variant and find how the gene works on the molecular level to produce the trait in a person.

Polygenic Traits

Unfortunately for genetic testing, most traits are not determined by a single gene or allele. In fact, most traits are polygenic, meaning they are affected by several genes at the same time. With these traits, causation is harder to show because of the complex interactions between the genes and their environment.

While causation can be shown in polygenic traits, it is much harder to do so. In fact, to show causation in a polygenic trait, researchers must first understand the product of every gene involved. Then, they must map and describe how the genes relate to each other to produce a certain trait. This is a very intensive and complex process, which is very expensive to complete.

On the other hand, it is relatively easy and cheap to determine a correlation between any number of genes and a trait. The problem with this method is that certain genes may actually play no role in a trait, but are simply present in people with the trait. If you have one of these genetic variants, a report based on correlations may show you have a certain trait when you actually do not. In fact, many genetic tests are based solely on correlation studies and the companies presenting the reports have no idea how the gene actually functions in your body.

Now, let’s look at several types of genetic test, and see whether they are using correlations or causations.

Correlation or Causation?

Tests for Single Gene Traits

Genetic tests for single-gene traits are often the most reliable. Genetic testing, in general, is extremely accurate when identifying genetic variants and sequencing your genome. Almost any genetic test for traits like blood type will be accurate and describe you exactly. These traits are not only correlated to the gene variants, but direct causation has been established between having a genetic variant and showing a given trait.  

Tests for Carrier Status

Many genetic diseases are also single-gene traits. For instance, cystic fibrosis is a human disease caused by recessive alleles of a single gene. Look at the graphic below:

https://upload.wikimedia.org/wikipedia/commons/3/3e/Autorecessive.svg

Here, you can see that only a person with 2 copies of the recessive mutant will get cystic fibrosis. The people with 1 mutant copy and 1 good copy are known as “carriers” of cystic fibrosis. They do not show symptoms themselves but have a chance of passing on their mutant allele to the next generation.
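The inheritance pattern in the graphic can be worked out by listing every combination of one allele from each parent. This is a minimal sketch under simple assumptions: one gene, two alleles written as `N` (working copy) and `m` (mutant copy), and each parent passing on either of their two alleles with equal chance. The function name is our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_probabilities(parent_1, parent_2):
    """Probability of each child genotype, given two parental genotypes like 'Nm'."""
    # Pair every allele of one parent with every allele of the other.
    combos = Counter("".join(sorted(a + b)) for a, b in product(parent_1, parent_2))
    total = sum(combos.values())
    return {genotype: Fraction(count, total) for genotype, count in combos.items()}


# Two carriers, each with one working copy and one mutant copy.
result = offspring_probabilities("Nm", "Nm")
for genotype, probability in sorted(result.items()):
    print(genotype, probability)
# NN 1/4  (not a carrier)
# Nm 1/2  (carrier, no symptoms)
# mm 1/4  (has the condition)
```

So when both parents are carriers, each child has a 1 in 4 chance of inheriting two mutant copies, which is why carrier testing is useful to couples planning a family.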

Like single gene traits, these conditions are not only correlated to a genetic variant but causation has been shown as well. In the case of cystic fibrosis, the mutant allele causes an ion channel to not function properly. This creates big problems within the mucous membranes of individuals with cystic fibrosis, as their cells are not able to regulate ions like chloride. In turn, these substances build up and cause the symptoms of the disease. In general, most “Carrier Status” tests can accurately find whether you are a carrier of a particular condition. Knowing your partner’s genotype can, therefore, be helpful and predictive when planning a family.

Genetic Ancestry Tests

When it comes to genetic ancestry tests (like Ancestry’s, 23andMe’s, or MyHeritage’s), we aren’t really discussing correlation or causation. In fact, genetic ancestry tests function on a much simpler concept. Genes have to come from somewhere. Further, because mutations happen slowly over time and are extremely rare, the large majority of your DNA comes directly from your ancestors.

Genetic ancestry tests simply analyze your DNA, then compare it to reference populations from around the world. This can give you an idea where your genes originated from, as they are often still present in the population your family originally came from. However, these tests generally ignore the fact that you only receive half of your parents’ combined genetics. Like you, your parents only received half of their parents’ combined genetics.

In practice, this means that only the genes present in your body will be related to reference populations, though your actual family could be much larger. Genetic ancestry tests are a good tool for analyzing parts of your family history, but can only report on the genetic variants actually present within your body.
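The halving described above adds up quickly. As a rough sketch (the function names are ours), the expected share of your DNA that comes from any one ancestor halves with every generation, while the number of ancestors doubles. These are averages: because DNA is shuffled before it is passed on, the actual share from a grandparent or more distant ancestor varies from person to person.

```python
from fractions import Fraction


def expected_share(generations_back):
    """Average fraction of your DNA inherited from one ancestor that many generations back."""
    if generations_back < 1:
        raise ValueError("generations_back must be 1 (parents) or more")
    return Fraction(1, 2 ** generations_back)


def ancestor_slots(generations_back):
    """Number of positions in your family tree at that generation."""
    return 2 ** generations_back


for generation in range(1, 6):
    print(generation, ancestor_slots(generation), expected_share(generation))
```

Five generations back you have 32 positions in your family tree, each contributing about 1/32 of your DNA on average, so a test on your DNA alone can easily miss parts of your family history.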

Genetic Disease Tests

Disease risk traits are typically where genetic testing companies start to take some liberties with their reports. With hundreds, or possibly thousands, of genetic reports published every year, there are endless correlations between genetics and disease.

Unfortunately, few if any of these diseases are caused entirely by genetics. More than that, almost none of the increases in risk have been explored deeply enough to understand how a particular variant may increase your risk of disease.

A good example is that of the BRCA genes, known to be correlated with breast cancer. The FDA has approved tests for the BRCA genes because it has been shown how a mutated BRCA gene can cause cancer. So, some causation has been shown for the BRCA genes and cancer. However, only a small portion of all breast cancer is caused by mutant BRCA genes, so even this test is not very informative. Most genes related to disease risk are related only through correlation, which may be completely random until proven otherwise.

Polygenic Trait Tests

Some companies offer genetic tests for things like “Intelligence” or “Depression”. While there have been correlations between genetics and these complex subjects, no one has proved any sort of causation. In fact, because of the large number of genes involved in these traits and the impact of the environment on forming the traits, it is unlikely that genetic testing will ever contribute significantly to understanding these traits.

Like margarine use and the divorce rate of Maine, some complex traits are certainly correlated with various genotypes. However, this provides no evidence that one is affecting the other. Of all the traits genetic reports analyze, these are the least likely to be helpful. In fact, they greatly oversimplify complex subjects like cognition and learning, which are more related to your environment than your genetics.

Other Types of Genetic Test

Other genetic tests available include those for nutrition, pharmacogenomics, or fitness traits. Like any polygenic trait, these are largely just correlations and no causation has been solidly identified. A whole host of new companies are springing up to tell you about these correlations. But don’t buy in yet: these correlations are likely to be overturned or substantially refined in the coming years. Keep reading to see some of the companies currently focused on correlation, instead of causation.

Microbiome Testing

A new branch of DNA testing is microbiome testing. This test looks at each person’s individual profile of the bacteria, archaea, and viruses living in their body and attempts, through sophisticated analytical tools and AI, to correlate that profile with certain diseases and conditions. It’s still hard to say whether this amounts to simple correlation or whether the field will keep growing (some experts even claim it will surpass DNA tests because the information provided is more comprehensive).

Companies Relying On Correlation 

Which we would not recommend using

While a majority of the large DNA testing companies (think 23andMe or Ancestry) work with traits that are generally explained down to the exact molecular cause, or at least are backed by a large number of correlation studies, there are many smaller companies emerging which do not live up to these standards. Below are several of these companies and the tests they offer, which are based only on correlation, not causation. In many cases these are DNA data testing companies that only process external data provided to them.

Allél

This company focuses on skincare, and the company has recently taken a DNA approach to determine the right medicines and lotions for your skin health. While this may sound intriguing, there are several red flags related to the science behind their “miraculous DNA skin cure”. In fact, if you go to their page on the science behind their products, you will find that they are basing their entire DNA business model on only 2 peer-reviewed articles, and fewer than 1000 DNA tests.

As far as correlation versus causation goes, Allél is relying solely on a minor correlation. While there is other evidence backing some of their products and procedures, there is no way that only 2 papers and 800 people can provide a full understanding of how DNA affects your skin. In fact, most of the skin conditions that they are fixing can be seen directly, and there is no need to test a person’s DNA to fix their condition. The DNA test appears to be just another sales tactic meant to reach the new “DNA test” market. Unfortunately, many people will be roped in by this “science” and will buy their supplements, lotions, and other products. The products may work, but it won’t be because they are specifically matched to your DNA.

Genomelink

For free, Genomelink will provide you with a more detailed analysis of your DNA than any of the major companies will give you. While this is interesting information for a great price, you should know that most of the reports they give are based solely on correlations between a trait and certain genetic variants. Few of the traits they report on are based on more than correlations.

For example, Genomelink has a report on childhood intelligence. Several studies have found correlations between certain genes and general childhood intelligence. Genomelink presents this information as if the genes are certainly responsible for the increase in intelligence. However, we also know that childhood intelligence is malleable, and largely affected by a child’s environment. Without proving causation, it could easily be the case that certain populations have better learning and studying habits, which in turn leads to increased intelligence.

The Japanese, for example, share many genetic variants as well as a culture which supports and improves childhood intelligence. So, it is a bit like the chicken and the egg. Did the genetic variants cause increased intelligence, or were the variants simply present as the culture developed advanced teaching methods? Until causation is shown, we will never know.
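The chicken-and-egg problem can be made concrete with a toy calculation. All of the counts below are invented for illustration; they do not come from Genomelink or from any study. In this made-up example the variant has no effect on the trait inside either population, yet pooling the two populations makes the variant look predictive, simply because it is more common in the population where the trait is also more common.

```python
from fractions import Fraction

# Invented counts: (number of people, number of those people with the trait).
populations = {
    "population_x": {"carrier": (80, 48), "non_carrier": (20, 12)},
    "population_y": {"carrier": (20, 4), "non_carrier": (80, 16)},
}


def trait_rate(people, with_trait):
    """Share of a group that shows the trait."""
    return Fraction(with_trait, people)


def pooled_rate(group):
    """Trait rate for carriers or non-carriers when both populations are lumped together."""
    people = sum(pop[group][0] for pop in populations.values())
    with_trait = sum(pop[group][1] for pop in populations.values())
    return trait_rate(people, with_trait)


for name, pop in populations.items():
    carriers = trait_rate(*pop["carrier"])
    non_carriers = trait_rate(*pop["non_carrier"])
    print(name, float(carriers), float(non_carriers))

print("pooled", float(pooled_rate("carrier")), float(pooled_rate("non_carrier")))
```

Inside each population, carriers and non-carriers show the trait at the same rate (60% in one, 20% in the other). Pooled together, carriers come out at 52% and non-carriers at 28%, a correlation produced entirely by which population a person belongs to.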

Promethease

Like Genomelink, Promethease shows you many studies based on the genetic variants you have. It also allows you to explore your genome and gives you access to a number of reports for only $12. While this is also a decent price, most of the studies it presents are correlations only. But, you can easily save yourself $12 by looking up the studies yourself, on SNPedia. This wiki site is constantly updated with the most recent genetic studies, sorted by gene. However, unlike many of the other companies listed here, Promethease is simply connecting you to the scientific studies, so you can determine correlation and causation yourself.

Lifenome

Lifenome focuses on traits like nutrition, fitness, skincare, allergies, and personality. Unfortunately, none of these traits have been shown to be caused by a single gene. In fact, all of these traits are highly polygenic and are influenced profoundly by your environment. It is a highly complex topic, and many genetic variants have been correlated to various aspects of nutrition and fitness.

To get around this complexity, companies like Lifenome simply report which variants you have and if those variants have been implicated in any correlation studies. The company claims to be using artificial intelligence to determine which correlations are significant, though this technology has never been proven to find anything more than correlations. So, in effect, they are selling you their “best guess” on how your genetics may affect your life. 

Habit

The company Habit focuses solely on nutrition and genetics, claiming it can produce the “ideal” diet for your “genetic type”. Unfortunately, the science behind diet and genetics is just not that advanced yet. Many different genetic variants have been correlated to various aspects of nutrition, such as the ability to digest carbohydrates or your body’s ability to absorb a specific vitamin. However, most of these studies are simple correlations and really tell you nothing about your specific body chemistry.

More than that, the entire idea that diets should be “specialized” to the individual is largely pseudoscience. In fact, epidemiologists have shown that a plant-based, whole-foods diet is likely the most beneficial for all people, regardless of their genetic variants, heritage, or culture. Companies like Habit largely promote a general healthy diet under the guise that they are customizing a diet “specifically for you”. Starting at $200, these reports are largely a waste of your money.

DNAFit

Like Habit, DNAFit focuses on one area of genetics: fitness. Also like Habit, the science behind these reports is largely in the form of correlation studies. The company has taken small aspects of overall fitness, such as muscle composition or aerobic capacity, and tied each aspect to genetics. While these components are determined partially by genetics, there is also ample evidence that these traits are highly variable based on how much a person exercises.

For instance, your genetic fitness report may show that you would be a better endurance runner, based on the type of muscle your genes are most likely to produce. But other studies have shown that the more a person sprints, the more their muscles will adapt to that activity. Likewise, the oxygen capacity of your blood can easily be increased by working out at elevation and forcing your body to produce more blood cells. Thus, the correlations provided by companies such as DNAFit are largely useless. Considering you will pay $150 for these results, you might be better off hiring a personal trainer and simply working on the things you want to be better at.

DNA Romance

The last and most egregious use of genetic correlations is found in companies like DNA Romance. DNA Romance does exactly what its name implies: it matches users based on their “genetic compatibility”. The entire enterprise is based on only 20+ studies which broadly correlate the production of pheromones with dating compatibility. The premise behind the company drastically underestimates and oversimplifies human attraction and love. Both of these subjects are complex and involve your early childhood, family structure, and past experiences. Your genetics likely play a role, but how large a role has yet to be determined.

 

Other Tests to Watch Out For

The above is not an exhaustive list of genetic testing companies. There are companies which will give you wine selections based on your genetics, and other such nonsense. While these tests may be a fun way to try new wines, the science is limited to correlations that only a few labs have found. In the future, with much more research, we may start to find causation behind some of these correlations. But, until then, these companies are simply charging you for generalized advice based on the fact that a certain gene might be related to your tastes and preferences. Like learning, memory, and dating, these traits can easily be changed based simply on what you do every day and the new things you try.
