Stratification and Adjustment in Epidemiology

Controlling for Confounding in Epidemiological Studies

Kamarul Imran Musa, Universiti Sains Malaysia

Introduction

Learning Objectives

  1. To understand stratification and adjustment in epidemiological studies
  2. To be able to identify confounding and effect modification using stratification
  3. To be able to control confounding using stratification-based and regression-based adjustment

What are Stratification and Adjustment?

Analytical tools used to:

  • Control for confounding effects
  • Assess effect modification
  • Summarize associations of several predictor variables with disease risk

Key Point

Stratification and multivariate analysis are essential for disentangling confounding from true exposure-outcome relationships

Why Stratification?

Stratification is informative because:

  1. Allows straightforward and simultaneous examination of:
    • Confounding
    • Effect modification
  2. Helps choose the appropriate statistical technique for adjustment

Stratification to Disentangle Confounding

Example: Male Gender and Malaria Risk

Case-control study examining:

  • Exposure: Male gender
  • Outcome: Malaria infection
  • Potential confounder: Occupation (outdoor/indoor)

Crude analysis: Odds Ratio = 1.71 (males at higher risk)

Stratified Analysis by Occupation

Results after stratification:

Stratum            OR
Outdoor workers    1.06
Indoor workers     1.00

Interpretation

Stratum-specific ORs are similar to each other (1.06, 1.00) but different from crude OR (1.71). Occupation is a confounder!
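
The comparison above can be reproduced with a short sketch. The cell counts are read off the Mantel-Haenszel calculation later in these slides; the a/b/c/d labelling and the helper name `odds_ratio` are assumptions made for illustration.

```python
# 2x2 cell counts per occupation stratum, taken from the Mantel-Haenszel
# calculation later in these slides. The a/b/c/d labelling is an assumption.
strata = {
    "Outdoor workers": {"a": 53, "b": 10, "c": 15, "d": 3},
    "Indoor workers": {"a": 35, "b": 52, "c": 53, "d": 79},
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for one 2x2 table."""
    return (a * d) / (b * c)


# Stratum-specific odds ratios
stratum_or = {name: odds_ratio(**cells) for name, cells in strata.items()}

# Collapse the strata cell by cell to obtain the crude (unstratified) table
crude_cells = {k: sum(cells[k] for cells in strata.values()) for k in "abcd"}
crude_or = odds_ratio(**crude_cells)

for name, value in stratum_or.items():
    print(f"{name}: OR = {value:.2f}")
print(f"Crude: OR = {crude_or:.2f}")
```

Collapsing the two tables gives the crude estimate, while each stratum on its own shows essentially no association.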

Assessing Interaction

Homogeneous stratum-specific ORs indicate:

  • No multiplicative interaction (effect modification) present
  • A single overall adjusted OR can be calculated

Example of interaction:

  • ORs of 1.4 and 20.0 → Likely true interaction
  • ORs of 1.4 and 2.0 → Less likely interaction

Tip

Consider biological plausibility and pre-established hypotheses

Oral Contraceptives and MI Example

Crude OR = 1.7 (70% higher odds with OC use)

Age-stratified ORs:

Age Group      OR
25-29 years    7.2
30-34 years    8.9
35-39 years    1.5
40-44 years    3.7
45-49 years    3.9

Adjusted OR = 3.97 (more than double the crude estimate)

Negative Confounding

What is Negative Confounding?

When the confounder biases the crude estimate toward the null (1.0), so that adjustment moves the estimate away from the null

In the OC-MI example:

  • Age was driving the crude association toward the null
  • After adjustment, a stronger association was revealed (OR = 3.97)
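
A minimal sketch of this comparison, assuming the direction of confounding is judged by comparing crude and adjusted ORs on the log scale. The function name `confounding_direction` and its labels are hypothetical, not a standard API.

```python
import math


def confounding_direction(crude_or, adjusted_or):
    """Label the direction of confounding from crude and adjusted ORs.

    Hypothetical helper: compares distances from the null on the log scale.
    """
    crude, adjusted = math.log(crude_or), math.log(adjusted_or)
    if crude * adjusted < 0:
        # Crude and adjusted estimates lie on opposite sides of the null
        return "qualitative"
    if abs(adjusted) > abs(crude):
        # Crude estimate was attenuated (pulled toward the null)
        return "negative"
    if abs(adjusted) < abs(crude):
        # Crude estimate was exaggerated (pushed away from the null)
        return "positive"
    return "none"


oc_mi = confounding_direction(1.7, 3.97)     # OC-MI example
malaria = confounding_direction(1.71, 1.01)  # malaria example
print(oc_mi, malaria)
```

With the figures from these slides, the OC-MI example is labelled negative confounding and the malaria example positive confounding.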

Adjustment Methods Based on Stratification

Direct Adjustment

Procedure:

  1. Calculate stratum-specific incidence rates in both study groups
  2. Choose a standard population with a known number of people in each stratum
  3. Calculate expected cases by applying study group rates to standard population
  4. Compute adjusted rates by dividing total expected cases by total standard population

Tip

Answers: “What would the rate be if both groups had the same age distribution?”
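
The four steps can be sketched as follows. All counts are hypothetical and invented for illustration, and using the two groups combined as the standard population is only one possible choice.

```python
# Hypothetical data: every count below is invented for illustration.
age_groups = ["young", "middle", "old"]
group_a = {"cases": [8, 5, 20], "population": [8000, 1000, 1000]}
group_b = {"cases": [2, 15, 100], "population": [2000, 3000, 5000]}

# Step 2: standard population (assumption: the two study groups combined)
standard = [pa + pb for pa, pb in zip(group_a["population"], group_b["population"])]


def crude_rate(group):
    """Total cases divided by total population, ignoring age."""
    return sum(group["cases"]) / sum(group["population"])


def directly_adjusted_rate(group, standard_population):
    # Step 1: stratum-specific rates in the study group
    rates = [c / p for c, p in zip(group["cases"], group["population"])]
    # Step 3: expected cases if the standard population had these rates
    expected = [r * s for r, s in zip(rates, standard_population)]
    # Step 4: total expected cases over total standard population
    return sum(expected) / sum(standard_population)


crude_a, crude_b = crude_rate(group_a), crude_rate(group_b)
adjusted_a = directly_adjusted_rate(group_a, standard)
adjusted_b = directly_adjusted_rate(group_b, standard)
print(f"Crude rates:    A = {crude_a:.4f}, B = {crude_b:.4f}")
print(f"Adjusted rates: A = {adjusted_a:.4f}, B = {adjusted_b:.4f}")
```

In this invented data set both groups have identical age-specific rates, so the crude rates differ only because group B is older, and the adjusted rates coincide.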

Indirect Adjustment

Used for age adjustment of mortality/morbidity data

Procedure:

  • Apply reference rates to study group strata
  • Calculate ratio of observed to expected events

Results in:

  • SMR = Standardized Mortality Ratio
  • SIR = Standardized Incidence Ratio
  • SPR = Standardized Prevalence Ratio
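
A minimal sketch of the SMR calculation, assuming hypothetical reference rates, study population sizes and observed deaths (none of these numbers come from a real study).

```python
# Hypothetical inputs, invented for illustration.
reference_rates = [0.001, 0.005, 0.02]   # deaths per person, by age stratum
study_population = [2000, 3000, 5000]    # study group size, by age stratum
observed_deaths = 150                    # total deaths observed in study group

# Apply the reference rates to the study group strata
expected_by_stratum = [r * n for r, n in zip(reference_rates, study_population)]
expected_deaths = sum(expected_by_stratum)

# Ratio of observed to expected events
smr = observed_deaths / expected_deaths
print(f"Expected deaths = {expected_deaths:.1f}, SMR = {smr:.2f}")
```

An SMR above 1 means the study group experienced more deaths than expected from the reference rates. The same arithmetic gives an SIR or SPR when incidence or prevalence is used instead.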

Mantel-Haenszel Method

Most common method for adjusted OR/RR

Formula for adjusted OR:

\[OR_{MH} = \frac{\sum_i (a_i d_i / N_i)}{\sum_i (b_i c_i / N_i)}\]

Where:

  • i = stratum
  • a, b, c, d = cells in 2×2 table
  • N = total in stratum

MH Example: Malaria Study

Calculation:

\[OR_{MH} = \frac{(53 \times 3)/81 + (35 \times 79)/219}{(10 \times 15)/81 + (52 \times 53)/219} = 1.01\]

Interpretation:

  • \(OR_{MH}\) = 1.01 (no association after adjustment)
  • Lies between stratum-specific estimates (1.06, 1.00)
  • Confirms occupation as confounder
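
The calculation above can be checked with a short sketch. The function name `mantel_haenszel_or` is hypothetical; each tuple holds the cells (a, b, c, d) of one stratum as they appear in the formula.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio for a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Cell counts from the malaria calculation: outdoor (N = 81), indoor (N = 219)
malaria_tables = [(53, 10, 15, 3), (35, 52, 53, 79)]
or_mh = mantel_haenszel_or(malaria_tables)
print(f"OR_MH = {or_mh:.2f}")
```

Because each stratum contributes in proportion to its size, the larger indoor stratum dominates the summary estimate.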

Assumption: Homogeneity of Effects

Key Assumption

Mantel-Haenszel assumes no multiplicative interaction between exposure and stratifying variable

When stratum-specific ORs are similar:

  • Assumption met
  • MH provides valid summary estimate

When ORs differ substantially:

  • Report stratum-specific estimates separately
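
One way to make "similar" less subjective is a formal test of homogeneity. The sketch below uses Woolf's chi-square test, which is a choice made here for illustration and is not named in these slides; the function name is hypothetical, and the p-value line is valid only for two strata (1 degree of freedom).

```python
import math


def woolf_homogeneity(tables):
    """Woolf's test statistic for homogeneity of odds ratios across strata.

    Returns (chi-square statistic, degrees of freedom).
    """
    log_ors, weights = [], []
    for a, b, c, d in tables:
        log_ors.append(math.log((a * d) / (b * c)))
        # Inverse-variance weight of the stratum log odds ratio
        weights.append(1.0 / (1 / a + 1 / b + 1 / c + 1 / d))
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    statistic = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    return statistic, len(tables) - 1


# Malaria example: outdoor and indoor strata, cells (a, b, c, d)
tables = [(53, 10, 15, 3), (35, 52, 53, 79)]
q, df = woolf_homogeneity(tables)

# Chi-square upper-tail probability; this closed form holds for df == 1 only
p_value = math.erfc(math.sqrt(q / 2))
print(f"Woolf chi-square = {q:.4f} on {df} df, p = {p_value:.2f}")
```

A large p-value gives no evidence against homogeneity, which supports pooling. Such tests have low power in small strata, so the stratum-specific estimates should still be inspected.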

Limitations and Alternatives

Limitations of Stratification

Three main limitations:

  1. One exposure at a time: Can only assess one exposure-outcome association per analysis

  2. Categorical only: Continuous variables must be categorized (may cause residual confounding)

  3. Sparse data: Too many strata → small numbers → unstable estimates

Multiple Regression for Adjustment

Advantages over stratification:

  • Handles multiple exposures simultaneously
  • Accommodates continuous variables
  • More efficient use of data

Common models:

Model       Outcome Type     Interpretation of coefficient
Linear      Continuous       Change in mean per unit
Logistic    Binary           Change in log odds per unit
Cox         Time-to-event    Change in log hazard per unit
Poisson     Count/Rate       Change in log rate per unit
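
As a worked illustration of the logistic row, consider a model with one binary exposure and one confounder. The symbols \(p\) (probability of the outcome), \(E\) (exposure, coded 0/1) and \(C\) (confounder) are assumptions introduced here:

\[\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 E + \beta_2 C \quad\Rightarrow\quad OR_{adj} = e^{\beta_1}\]

Under this model, \(\beta_1\) is the change in log odds for exposed versus unexposed at a fixed value of \(C\), so exponentiating it gives the confounder-adjusted odds ratio.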

Alternative: Instrumental Variables

An instrumental variable must meet three conditions:

  1. Causally associated with exposure
  2. Affects outcome only through exposure
  3. Not associated with any confounders

Two-step analysis:

  1. Regress exposure on instrument
  2. Regress outcome on predicted exposure
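
The two steps can be sketched on simulated data, assuming linear relationships and a single instrument. Every number, including the true effect of 2.0 and the strength of the unmeasured confounder, is invented for illustration, and the helper functions are hypothetical.

```python
import random

random.seed(42)
TRUE_EFFECT = 2.0  # assumed causal effect of exposure on outcome
n = 20000

z, x, y = [], [], []
for _ in range(n):
    zi = random.gauss(0, 1)  # instrument
    ui = random.gauss(0, 1)  # unmeasured confounder
    xi = 1.0 * zi + ui + random.gauss(0, 1)                # exposure
    yi = TRUE_EFFECT * xi + 3.0 * ui + random.gauss(0, 1)  # outcome
    z.append(zi)
    x.append(xi)
    y.append(yi)


def mean(values):
    return sum(values) / len(values)


def covariance(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def ols_slope(predictor, response):
    """Slope of a simple least-squares regression."""
    return covariance(predictor, response) / covariance(predictor, predictor)


# Step 1: regress exposure on instrument and keep the predicted exposure
slope_1 = ols_slope(z, x)
intercept_1 = mean(x) - slope_1 * mean(z)
x_hat = [intercept_1 + slope_1 * zi for zi in z]

# Step 2: regress outcome on predicted exposure
iv_estimate = ols_slope(x_hat, y)

# For comparison: ordinary regression of outcome on observed exposure
naive_estimate = ols_slope(x, y)
print(f"Naive = {naive_estimate:.2f}, IV = {iv_estimate:.2f}")
```

In this simulation the naive estimate is biased upward by the unmeasured confounder, while the two-step estimate lands close to the assumed true effect. Standard errors from a manual second step are not valid as they stand, so dedicated IV routines are preferable in practice.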

Tip

Can control for unmeasured confounding, provided all three conditions hold

Alternative: Propensity Scores

Definition: Predicted probability of exposure based on relevant characteristics

Purpose: Mimic randomization by making exposed/unexposed groups comparable on measured characteristics

Steps:

  1. Model exposure as outcome with potential confounders as predictors
  2. Calculate propensity score for each individual
  3. Match or stratify on propensity scores
  4. Analyze matched/stratified data
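
Steps 1 and 2 are often carried out with a logistic model. Assuming that choice, and with \(E\) (exposure, coded 0/1), \(x_1, \dots, x_k\) (potential confounders) and \(\alpha_0, \dots, \alpha_k\) (fitted coefficients) as symbols introduced here, the propensity score of an individual is:

\[e(\mathbf{x}) = \Pr(E = 1 \mid \mathbf{X} = \mathbf{x}) = \frac{1}{1 + \exp\left[-(\alpha_0 + \alpha_1 x_1 + \dots + \alpha_k x_k)\right]}\]

Individuals with similar values of \(e(\mathbf{x})\) are then matched or grouped into strata (for example, quintiles of the score) before the exposure-outcome association is estimated.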

Residual Confounding and Overadjustment

Residual Confounding

Occurs when:

  1. Categories too broad for continuous variables
  2. Imperfect surrogate used (e.g., education for social class)
  3. Important confounders omitted from model
  4. Confounders misclassified

Result

Adjusted estimates still confounded despite adjustment attempt

Overadjustment

Definition: Inappropriate adjustment that distorts true relationship

Occurs when adjusting for:

  1. Intermediate variables (in causal pathway)
    • Example: Adjusting for hypertension in obesity-stroke relationship
  2. Variables too closely related to exposure
    • Example: Adjusting for residence in air pollution-respiratory disease
  3. Overlapping constructs simultaneously
    • Example: BMI, weight, and waist-hip ratio in same model

Summary and Key Messages

Key Takeaways

  1. Stratification is essential for identifying confounding and effect modification
  2. Adjustment methods control for confounding to reveal true associations
  3. Direct and indirect adjustment use stratification principles
  4. Mantel-Haenszel provides weighted average across strata
  5. Multiple regression overcomes limitations of stratification
  6. Beware of residual confounding and overadjustment

Choosing the Right Method

Use stratification when:

  • Exploring data initially
  • Assessing effect modification
  • Simple confounding structure

Use regression when:

  • Multiple confounders
  • Continuous variables
  • Multiple exposures of interest
  • Need statistical efficiency

Practical Considerations

Remember

  • Always examine crude associations first
  • Consider biological plausibility
  • Check for effect modification before pooling
  • Ensure adequate sample size in strata
  • Validate assumptions of chosen method

Thank You!

Questions?

Further Reading:

  • Chapter 7, “Epidemiology: Beyond the Basics” (4th edition) by Szklo & Nieto

Contact: Prof Kamarul Imran