What is a confounder (confounding variable) and why should I care?
— Tips for making sense of science
Confounders (or confounding variables) are factors that are associated with both the “cause” and “effect” (or exposure and outcome) in a potential cause-and-effect relationship. If ignored, they can cause misleading results and conclusions. Common confounders include age, sex, and socioeconomic status.
Imagine a study that looked at the relationship between diet and various health conditions. This study would likely find that people who eat kale have a lower risk of some health conditions (e.g., heart disease). However, people who eat kale often have many other health-promoting habits (like eating more fruits and veggies and getting regular exercise). Unless we account for these factors (confounders), we may conclude that kale salads are dramatically lowering the risk of heart disease when, in reality, kale mania is a marker of a “health junkie” who is doing many things to promote health. Indeed, people are often all-in on health behaviors (e.g., vaccinated, exercising, not smoking, and eating a healthy diet), or the opposite. This clustering of health behaviors, known as “healthy user bias,” is a huge challenge in health sciences research.
In scientific studies, and in our daily lives, we should always consider how confounders (like age, sex, and lifestyle) may impact the relationships we observe. Otherwise, we can mistakenly conclude that there is a cause-and-effect relationship between two factors when none exists. Confounders, when ignored, can also exaggerate or minimize a true relationship.
Nerd Note: When we study cause-and-effect relationships, the potential cause is called the independent variable (or exposure, in epidemiology) and the potential result is called the dependent variable (or outcome, in epidemiology). For example, if we are looking at the effect of social media use on depression, social media use is the independent variable (exposure) and depression is the dependent variable (outcome). A confounding variable must be related to both the independent variable (the cause/exposure) and the dependent variable (the effect/outcome).
Let’s look at a few other examples of confounders in action, and what we can do about them.
Breastfeeding and test scores
Studies that look at the impacts of breastfeeding on cognitive test scores in children are fraught with confounders because many of the factors that influence breastfeeding (e.g., parental education and socioeconomic status) also influence the outcomes we are studying (e.g., test scores). If we see a trend towards higher test scores (SAT scores or GPAs) among children who were breastfed for longer periods, we need to consider whether the trend appears because the researchers aren’t accounting for differences in factors such as the parents’ educational attainment or financial resources. In other words, because households with higher educational attainment are more likely to breastfeed, the relationship being observed between breastfeeding and test scores is also picking up on the relationship between the parents’ education level and the child’s cognitive ability.
While many studies have reported a connection between breastfeeding and test scores, these studies should (and do!) acknowledge that other factors (confounders) could be driving this connection. Indeed, when we compare breastfeeding and test scores among siblings, we see very few differences in the health and well-being of children who are breastfed compared to those who are not. Siblings grow up in a similar educational and financial environment, so these confounders don’t bias the comparison.
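Nerd Note: For readers who like to tinker, here is a minimal Python sketch of why a sibling comparison helps. Everything in it is invented for illustration: the family-level "parent_edu" score, the breastfeeding probabilities, and the test-score formula are assumptions, not values from any study. In this made-up world breastfeeding has no effect on scores at all, yet a naive comparison across all children still shows a gap.

```python
import random

def simulate_families(n_families=20000, seed=1):
    """Each family has two siblings who share one hypothetical confounder."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        parent_edu = rng.gauss(0, 1)  # shared family-level confounder (made up)
        # More-educated households are assumed more likely to breastfeed.
        p_breastfed = 0.7 if parent_edu > 0 else 0.3
        siblings = []
        for _ in range(2):
            breastfed = rng.random() < p_breastfed
            # The score depends on parent_edu and noise only:
            # breastfeeding has NO effect in this simulation.
            score = 100 + 10 * parent_edu + rng.gauss(0, 5)
            siblings.append((breastfed, score))
        families.append(siblings)
    return families

def between_family_gap(families):
    """Naive comparison: all breastfed children vs. all non-breastfed children."""
    bf = [s for fam in families for b, s in fam if b]
    not_bf = [s for fam in families for b, s in fam if not b]
    return sum(bf) / len(bf) - sum(not_bf) / len(not_bf)

def within_family_gap(families):
    """Sibling comparison: only families where exactly one sibling was breastfed."""
    diffs = []
    for (b1, s1), (b2, s2) in families:
        if b1 != b2:
            diffs.append(s1 - s2 if b1 else s2 - s1)
    return sum(diffs) / len(diffs)

families = simulate_families()
print(f"Gap across all children: {between_family_gap(families):.2f} points")
print(f"Gap between siblings:    {within_family_gap(families):.2f} points")
```

The naive gap comes out at several points, while the sibling gap sits close to zero, because subtracting one sibling's score from the other's cancels out whatever the family has in common. A real sibling study still has to worry about factors that differ between siblings, which this sketch leaves out.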
Paxlovid and rebound
Initial studies of COVID rebound found that it was fairly common among people who took Paxlovid and raised concerns that Paxlovid was causing rebound. However, the most recent CDC study suggests that this is not a cause-and-effect relationship – it was being driven by a confounder. The same risk factors that cause someone to take Paxlovid (high risk of severe COVID infection) also increase the risk for rebound (even if Paxlovid is not taken). These risk factors (age, immunosuppression) are confounders of the association between Paxlovid and COVID rebound. When we take them out of the equation by doing a well-controlled trial (testing people with the same risk factors where some take Paxlovid and some don’t take Paxlovid), we do not see a consistent difference in rebound rate when we compare those who take Paxlovid with those who don’t.
How scientists work around confounders
Scientists deal with confounders in a few different ways. One way is to design the experiment in a way that reduces the impact of confounders. For example, they may restrict the participants in a study to a more unified group (same age, sex, etc.) or make sure that the two comparison groups are matched for potential confounders (like age, sex, education, income, etc.). Another approach is to use statistical analysis methods to control for confounders during the analysis.
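Nerd Note: One of the simplest statistical approaches is stratification: compare the two groups separately within each level of the confounder, then combine the results. The Python sketch below applies it to the kale example from earlier. All of the numbers are assumptions chosen for illustration (a single yes/no "health_conscious" trait, made-up probabilities of eating kale and of heart disease), and kale is deliberately given no effect on heart disease.

```python
import random

def simulate_people(n=50000, seed=0):
    """Return (health_conscious, eats_kale, heart_disease) tuples of 0/1 values."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        health_conscious = int(rng.random() < 0.5)  # the confounder (made up)
        # Health-conscious people are assumed more likely to eat kale...
        eats_kale = int(rng.random() < (0.7 if health_conscious else 0.2))
        # ...and less likely to develop heart disease. Kale plays no role here.
        heart_disease = int(rng.random() < (0.10 if health_conscious else 0.30))
        people.append((health_conscious, eats_kale, heart_disease))
    return people

def risk_difference(people):
    """Heart-disease risk among kale eaters minus risk among non-eaters."""
    kale = [d for _, k, d in people if k]
    no_kale = [d for _, k, d in people if not k]
    return sum(kale) / len(kale) - sum(no_kale) / len(no_kale)

def stratified_risk_difference(people):
    """Compute the risk difference within each confounder level,
    then average the results weighted by the size of each level."""
    total = 0.0
    for level in (0, 1):
        stratum = [p for p in people if p[0] == level]
        total += len(stratum) * risk_difference(stratum)
    return total / len(people)

people = simulate_people()
print(f"Crude risk difference:      {risk_difference(people):+.3f}")
print(f"Stratified risk difference: {stratified_risk_difference(people):+.3f}")
```

With these assumed probabilities, the crude comparison makes kale eaters look roughly 10 percentage points less likely to have heart disease, while the stratified comparison lands near zero, which is the true effect in this simulation. Stratification only works for confounders that were actually measured, and it gets unwieldy with many confounders, which is why researchers often turn to regression models instead.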
The reality is that confounders are often lurking in the background, shaping the relationships we observe. Thoughtful study designs (like the breastfeeding sibling study) and statistical methods can reduce the effect of confounders, but there are often other factors at play that researchers didn’t or can’t measure. These factors can make it seem that there is a cause-and-effect relationship when there is none, or they can exaggerate or minimize a true causal relationship.
Putting this into practice
When you encounter a potential cause-and-effect relationship in a health headline or your daily life, look before you leap to conclusions. Ask yourself what factors (confounders) may be linked to both the (potential) cause and the (potential) effect. Were they accounted for in the study design or analysis?
Can you think of examples where confounders muddy our ability to understand a relationship between two factors?
Resources
Learn more about confounders/confounding variables on Scribbr and Enago.