Primary vs. Secondary Prevention
| Goal | Evidence Needed |
|---|---|
| Primary prevention | Causal association required; confounding must be removed |
| Secondary prevention | Association sufficient (may be confounded); prediction focus |
Confounding and DAG, GDT 508: Public Health Epidemiology
Professor in Epidemiology and Statistics, Universiti Sains Malaysia
Confounding refers to a situation where a noncausal association between an exposure and outcome is observed due to the influence of a third variable (confounder).
Key Point: If our goal is primary prevention, distinguishing causal from noncausal associations is crucial.
Consider studying the effect of chronic kidney disease (CKD) on mortality:
Result: Confounding by age distorts the true causal effect of CKD on mortality
Fundamental question: Would the outcome in exposed individuals differ if they had not been exposed?
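A confounder must satisfy three criteria: (1) it is associated with the exposure; (2) it is a cause, or a marker of a cause, of the outcome; (3) it is not an intermediate step on the causal pathway from exposure to outcome.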
The confounder is associated with both exposure and outcome, creating a backdoor path
Research Question: Does CKD increase mortality risk?
Age qualifies as a confounder because:
| Gender | Cases | Controls | Odds Ratio |
|---|---|---|---|
| Males | 88 | 68 | 1.71 |
| Females | 62 | 82 | 1.00 (reference) |
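The crude odds ratio in the table can be checked with the cross-product of the four cells. The short sketch below assumes females are the reference group; the function name `odds_ratio` is only illustrative.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio from a 2x2 case-control table (cross-product ratio)."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)

# Counts taken from the table above; males are treated as "exposed",
# females as the reference group (an assumption of this sketch).
crude_or = odds_ratio(exposed_cases=88, exposed_controls=68,
                      unexposed_cases=62, unexposed_controls=82)
print(round(crude_or, 2))  # 1.71
```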
But is this the true effect of gender on malaria?
Outdoor workers: males 53 cases, 15 controls (stratum-specific OR, males vs. females = 1.06)
Indoor workers: males 35 cases, 53 controls (stratum-specific OR, males vs. females = 1.00)
Conclusion: Gender effect disappears when we account for outdoor occupation. Gender was confounded by work environment!
Random association: Sometimes sampling variability creates imbalance in case-control studies
Surrogate confounders: Variables may represent complex constructs
DAGs provide a structured visual framework for:
Key features:
- Directed: arrows show causal direction
- Acyclic: no closed loops (future cannot cause past)
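To make the "acyclic" requirement concrete, here is a minimal sketch that stores a causal diagram as a mapping from each node to its parents and tests it for directed cycles. The encoding of the CKD example (age → CKD, age → mortality, CKD → mortality) is an assumption based on the confounding example earlier, and `is_acyclic` is a hypothetical helper. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(parents):
    """Return True if the graph (node -> set of parent nodes) has no directed cycle."""
    try:
        # Iterating static_order() raises CycleError if a directed cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

# Assumed encoding of the CKD example: age -> CKD, age -> mortality, CKD -> mortality.
ckd_dag = {
    "age": set(),
    "CKD": {"age"},
    "mortality": {"age", "CKD"},
}

# Adding mortality -> age closes a loop, so the graph is no longer a DAG.
cyclic_graph = {**ckd_dag, "age": {"mortality"}}

print(is_acyclic(ckd_dag))       # True
print(is_acyclic(cyclic_graph))  # False
```

Dedicated tools such as DAGitty go much further (for example, listing adjustment sets); this sketch only checks the structural constraint.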
Consider aspirin reducing coronary heart disease (CHD) through decreased platelet aggregation:
Mediation model:
- Aspirin → Platelet aggregation
- Platelet aggregation → CHD

With genetic confounder:
- Genetic variant → Platelet aggregation
- Platelet aggregation → CHD
- Platelet aggregation now receives arrows from both aspirin and the genetic variant, which makes it a collider on the path between them!
Confounding:
Mediation:
Research question: Does obesity cause decline in kidney function?
Known facts:
- African Americans have higher obesity rates
- African Americans progress faster to kidney disease
Ethnicity is a confounder:
Action: Adjust for ethnicity to remove confounding
Even with measurement and adjustment, confounding may remain:
Result: Adjustments only partially successful
Question: Does lead poisoning cause polycystic kidney disease (PKD)?
DAG shows:
- Lead poisoning → GFR (affects kidney function)
- PKD → GFR (affects kidney function)
- GFR is a common effect (collider)
Critical insight: GFR is NOT a confounder; it’s a collider!
Collider: A variable caused by two other variables (arrows point TO it)
Conditioning on a collider:
- Opens a path between its causes
- Can introduce bias (collider-stratification bias)
- Should generally NOT be adjusted for
If we restrict study to patients with low GFR:
Lesson: Understanding causal structure prevents analytical errors
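Collider-stratification bias can be illustrated with a small simulation. In the sketch below, lead poisoning and PKD are generated independently, and each raises the probability of low GFR. All prevalences and probabilities are arbitrary values chosen for illustration, not estimates from any study.

```python
import random

def odds_ratio(pairs):
    """Odds ratio between two binary variables given a list of (x, y) pairs."""
    n11 = sum(1 for x, y in pairs if x and y)
    n10 = sum(1 for x, y in pairs if x and not y)
    n01 = sum(1 for x, y in pairs if not x and y)
    n00 = sum(1 for x, y in pairs if not x and not y)
    return (n11 * n00) / (n10 * n01)

rng = random.Random(508)  # fixed seed so the sketch is reproducible
everyone, low_gfr_only = [], []
for _ in range(100_000):
    lead = rng.random() < 0.3  # assumed prevalence, generated independently of PKD
    pkd = rng.random() < 0.3   # assumed prevalence, generated independently of lead
    # Assumed model: each cause independently raises the probability of low GFR.
    p_low_gfr = 0.05 + 0.40 * lead + 0.40 * pkd
    low_gfr = rng.random() < p_low_gfr
    everyone.append((lead, pkd))
    if low_gfr:  # restriction = conditioning on the collider
        low_gfr_only.append((lead, pkd))

or_everyone = odds_ratio(everyone)
or_low_gfr = odds_ratio(low_gfr_only)
print(f"OR (lead vs. PKD), full population: {or_everyone:.2f}")
print(f"OR (lead vs. PKD), low-GFR only:    {or_low_gfr:.2f}")
```

Under these assumptions the odds ratio in the full population is close to 1, while among low-GFR patients it falls well below 1: a spurious inverse association created purely by the restriction.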
Evidence Synthesis for Constructing DAGs involves:
DAGitty is a browser-based tool for:
Also available: R package ‘dagitty’ on CRAN
Three key questions:
The presence of confounding is assessed by comparing:
Confounding is present when the crude estimate (τ) differs from the adjusted estimate (τ’): τ ≠ τ’
Measure: Can use percent excess risk explained
\[\text{\% Excess Risk Explained} = \frac{RR_U - RR_A}{RR_U - 1.0} \times 100\]
Where:
- \(RR_U\) = unadjusted relative risk
- \(RR_A\) = adjusted relative risk
This tells us what proportion of the excess risk (the part of the association above the null value of 1.0) is explained by the variables adjusted for.
Study of long-term oxygen therapy in COPD patients:
| Treatment | Crude HR | Adjusted HR | % Explained |
|---|---|---|---|
| Oxygen therapy | 2.36 | 1.38 | 72% |
Interpretation: Disease severity markers explained 72% of the paradoxical adverse effect
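The 72% figure follows directly from the percent excess risk explained formula. A minimal check, using the hazard ratios from the table in place of relative risks (as the table itself does); the function name is illustrative only.

```python
def percent_excess_risk_explained(rr_unadjusted, rr_adjusted):
    """Percent of the excess risk (above the null value 1.0) removed by adjustment."""
    if rr_unadjusted == 1.0:
        raise ValueError("No excess risk to explain when the unadjusted estimate is 1.0")
    return (rr_unadjusted - rr_adjusted) / (rr_unadjusted - 1.0) * 100

# Crude and adjusted hazard ratios from the oxygen therapy table above.
pct = percent_excess_risk_explained(rr_unadjusted=2.36, rr_adjusted=1.38)
print(round(pct))  # 72
```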
All three involve a third variable affecting X-Y relationship:
| Effect | τ-τ’ | τ’ | Interpretation |
|---|---|---|---|
| Mediation/Confounding | + | + | Consistent direction |
| Suppression | + | - | Opposite signs |
Suppression occurs when including a third variable increases the association magnitude
Example: Intelligence and assembly line errors
- Direct effect: More intelligence → fewer errors (negative)
- Indirect effect: Intelligence → boredom → more errors (positive)
- Effects may cancel out!
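The sign rule in the table can be written as a small helper. This is a sketch only: `classify_third_variable` is a hypothetical name, and the coefficients used for the intelligence example are invented to show opposite signs, not taken from any data.

```python
def classify_third_variable(tau, tau_prime):
    """Classify by the signs of the indirect part (tau - tau') and the adjusted effect tau'."""
    indirect = tau - tau_prime
    if indirect == 0 or tau_prime == 0:
        return "undetermined by the sign rule"
    if indirect * tau_prime > 0:
        return "mediation or confounding"  # same sign: consistent direction
    return "suppression"                   # opposite signs

# Invented numbers: direct effect of intelligence on errors is -0.5,
# indirect effect through boredom is +0.3, so the total effect is -0.2.
tau_prime = -0.5
tau = -0.2
print(classify_third_variable(tau, tau_prime))  # suppression
print(abs(tau_prime) > abs(tau))                # True: adjustment increases the magnitude
print(classify_third_variable(0.8, 0.5))        # mediation or confounding
```

The sign rule alone cannot separate mediation from confounding; that distinction comes from the causal structure, not from the numbers.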
US vs. Venezuela mortality rates:
Direction reversed! The reversal is due to the striking difference in age distribution between the two countries.
Restriction: Limit study to specific values of confounder
Stratification: Analyze exposure-outcome relationship within strata of confounder
When confounding is present:
- Stratified estimates are similar to each other
- Both differ from the crude estimate
Combines stratum-specific estimates into single adjusted measure:
\[RR_{MH} = \frac{\sum_{i} a_i N_{0i}/N_i}{\sum_{i} c_i N_{1i}/N_i}\]
Provides: Single summary estimate adjusted for confounding
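The Mantel-Haenszel formula can be sketched in a few lines. The symbol definitions are assumptions, since the slide does not spell them out: \(a_i\) = exposed cases, \(c_i\) = unexposed cases, \(N_{1i}\) = exposed total, \(N_{0i}\) = unexposed total, and \(N_i = N_{1i} + N_{0i}\) in stratum \(i\). The two strata below are hypothetical data built so that both stratum-specific risk ratios equal 2.0.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio.

    strata: list of (a, n1, c, n0) tuples per stratum, where
    a = exposed cases, n1 = exposed total, c = unexposed cases, n0 = unexposed total.
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator

# Hypothetical cohort: exposure is rare in the low-risk stratum and common in the high-risk one.
strata = [
    (10, 100, 20, 400),   # stratum 1: risks 0.10 vs 0.05, RR = 2.0
    (160, 400, 20, 100),  # stratum 2: risks 0.40 vs 0.20, RR = 2.0
]

# Crude risk ratio ignores the strata and pools all counts.
total_a = sum(s[0] for s in strata)
total_n1 = sum(s[1] for s in strata)
total_c = sum(s[2] for s in strata)
total_n0 = sum(s[3] for s in strata)
crude_rr = (total_a / total_n1) / (total_c / total_n0)

print(round(crude_rr, 2))                    # 4.25
print(round(mantel_haenszel_rr(strata), 2))  # 2.0
```

This matches the pattern described for stratification: the stratum-specific estimates agree with each other (2.0), and both differ from the crude estimate (4.25).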
Multiple regression adjusts for confounders simultaneously:
\[Y = \beta_0 + \beta_1 X + \beta_2 C_1 + \beta_3 C_2 + \dots + \epsilon\]
Advantages:
- Handles continuous confounders
- Adjusts for multiple confounders
- Provides adjusted effect estimates
Instrumental variable (IV): A variable that:
Use: When confounding cannot be measured or controlled
Example: Mendelian randomization using genetic variants
Propensity score: Probability of receiving exposure given covariates
Residual confounding remains when adjustment is incomplete due to:
Overadjustment bias occurs when adjusting for:
Result: Can introduce bias or mask true effects
Important: Do NOT rely solely on p-values to identify confounding
Solution: Transparency and sensitivity analyses
Main readings:
- Szklo & Nieto: Epidemiology: Beyond the Basics (Chapter 5)
- Ferguson et al. (2020): ESC-DAGs method

Online tools:
- DAGitty: dagitty.net
- R packages: dagitty, ggdag

Additional:
- Hernán & Robins: Causal Inference book (free online)
Confounding is a fundamental challenge in observational epidemiology that requires:
Remember: The goal is valid causal inference for effective prevention!