Diagnostic Test Validity and Accuracy
Understanding Screening and Diagnostic Tests
1 Introduction
1.1 Learning Objectives
By the end of this presentation, you will be able to:
- Define key concepts in diagnostic test accuracy
- Calculate sensitivity, specificity, and predictive values
- Understand the impact of disease prevalence on test performance
- Interpret ROC curves and likelihood ratios
- Apply these concepts using R programming with tidyverse principles
2 Part 1: Fundamentals of Diagnostic Tests
2.1 Gold Standard Tests
Definition: A test that provides the definitive diagnosis against which other tests are compared.
Characteristics:
- High accuracy (ideally 100% sensitivity and specificity)
- Often invasive, expensive, or time-consuming
- May not be practical for routine screening
Examples:
- Biopsy for cancer diagnosis
- Angiography for coronary artery disease
- Culture for infectious diseases
2.2 The 2×2 Table: Foundation of Test Evaluation

| 2×2 Contingency Table for Diagnostic Tests | |||
|---|---|---|---|
| Test Result | Disease Present | Disease Absent | Total |
| Test Positive | True Positive (TP) | False Positive (FP) | TP + FP |
| Test Negative | False Negative (FN) | True Negative (TN) | FN + TN |
| Total | TP + FN | FP + TN | N |

Columns give the true disease status (from the gold standard); rows give the test result.
2.3 Key Performance Measures
Sensitivity (True Positive Rate)
\[Sensitivity = \frac{TP}{TP + FN}\]
Proportion of diseased individuals correctly identified.
Specificity (True Negative Rate)
\[Specificity = \frac{TN}{TN + FP}\]
Proportion of healthy individuals correctly identified.
2.4 Predictive Values
Positive Predictive Value (PPV)
\[PPV = \frac{TP}{TP + FP}\]
Probability that a positive test indicates disease.
Negative Predictive Value (NPV)
\[NPV = \frac{TN}{TN + FN}\]
Probability that a negative test indicates no disease.
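A quick numeric sketch ties the four formulas together (hypothetical counts, chosen only for illustration):

```r
# Hypothetical 2x2 counts (illustration only, not from the worked example)
TP <- 90; FN <- 10   # 100 truly diseased
FP <- 45; TN <- 855  # 900 truly healthy

TP / (TP + FN)  # sensitivity: 0.90
TN / (TN + FP)  # specificity: 0.95
TP / (TP + FP)  # PPV: ~0.67
TN / (TN + FN)  # NPV: ~0.99
```

Note that even with good sensitivity and specificity, the PPV is pulled down by the large healthy group: most positives come from the many healthy individuals who falsely test positive.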
2.5 Types of Errors in Testing
False Positive (Type I Error)
- Test positive, but no disease
- Consequences:
- Unnecessary anxiety
- Additional testing
- Inappropriate treatment
False Negative (Type II Error)
- Test negative, but disease present
- Consequences:
- Delayed diagnosis
- Disease progression
- Missed treatment opportunity
2.6 Common Biases in Screening Tests
Verification Bias - Non-random selection for confirmatory testing
Spectrum Bias - Study population not representative of target population
Lead Time Bias - Earlier detection appears to improve survival
Length Bias - Slowly progressing diseases overrepresented
Overdiagnosis Bias - Detection of disease that wouldn’t cause clinical problems
2.7 The Critical Role of Prevalence
Key Concept: Predictive values depend heavily on disease prevalence in the population
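The dependence can be made concrete with Bayes' theorem. A short sketch, using the same 85% sensitivity and 90% specificity as the simulation in Part 2:

```r
library(tibble)
library(dplyr)

sens <- 0.85
spec <- 0.90

# PPV and NPV at a range of prevalences, holding test properties fixed
tibble(prevalence = c(0.001, 0.01, 0.05, 0.20, 0.50)) %>%
  mutate(
    ppv = sens * prevalence /
      (sens * prevalence + (1 - spec) * (1 - prevalence)),
    npv = spec * (1 - prevalence) /
      (spec * (1 - prevalence) + (1 - sens) * prevalence)
  )
# PPV climbs from under 1% at 0.1% prevalence to ~89% at 50% prevalence,
# even though sensitivity and specificity never change.
```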
2.8 Positive Likelihood Ratio
Definition: Ratio of the probability of a positive test in diseased vs. non-diseased individuals
Formula: \[LR+ = \frac{Sensitivity}{1 - Specificity} = \frac{True\ Positive\ Rate}{False\ Positive\ Rate}\]
Interpretation:
- LR+ > 10: Strong evidence for disease
- LR+ 5-10: Moderate evidence for disease
- LR+ 2-5: Weak evidence for disease
- LR+ = 1: No diagnostic value
- LR+ < 1: Evidence against disease
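Likelihood ratios act on pre-test odds, not probabilities. A minimal sketch of the conversion, with illustrative numbers:

```r
# Post-test probability from pre-test probability and LR+
pre_test_prob <- 0.05  # e.g. 5% prevalence
lr_positive <- 8       # a moderately strong positive result

pre_test_odds  <- pre_test_prob / (1 - pre_test_prob)    # ~0.053
post_test_odds <- pre_test_odds * lr_positive            # ~0.42
post_test_prob <- post_test_odds / (1 + post_test_odds)  # ~0.30
```

Even a fairly strong positive test (LR+ = 8) moves a 5% pre-test probability only to about 30%, which foreshadows the prevalence effects above.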
3 Part 2: Worked Example with Simulated Data
3.1 Setting Up Our Example
Let’s create a hypothetical screening scenario using tidyverse:
Code
# Load required packages
library(tidyverse)
library(gt)        # formatted tables
library(gtsummary) # regression summary tables (used later)
library(broom)     # tidy model outputs (used later)
library(pROC)      # ROC curves (used later)

# Set seed for reproducibility
set.seed(123)

# Create simulated data for breast cancer screening using tidyverse
n <- 1000
prevalence <- 0.05  # 5% prevalence
sensitivity <- 0.85
specificity <- 0.90

# Generate dataset using tidyverse approach
screening_data <- tibble(
  id = 1:n,
  true_disease = sample(c("Diseased", "Healthy"),
                        size = n,
                        prob = c(prevalence, 1 - prevalence),
                        replace = TRUE)
) %>%
  mutate(
    test_result = case_when(
      true_disease == "Diseased" ~
        sample(c("Positive", "Negative"),
               size = n(),
               prob = c(sensitivity, 1 - sensitivity),
               replace = TRUE),
      true_disease == "Healthy" ~
        sample(c("Positive", "Negative"),
               size = n(),
               prob = c(1 - specificity, specificity),
               replace = TRUE)
    )
  ) %>%
  mutate(
    true_disease = factor(true_disease, levels = c("Healthy", "Diseased")),
    test_result = factor(test_result, levels = c("Negative", "Positive"))
  )
Code
# Create contingency table using tidyverse
contingency_summary <- screening_data %>%
  count(test_result, true_disease, .drop = FALSE) %>%
  pivot_wider(names_from = true_disease, values_from = n) %>%
  mutate(Total = Healthy + Diseased)

contingency_summary %>%
  gt() %>%
  tab_header(
    title = "Contingency Table: Test Results vs True Disease Status"
  ) %>%
  tab_spanner(
    label = "True Disease Status",
    columns = c("Healthy", "Diseased")
  )

| Contingency Table: Test Results vs True Disease Status | |||
|---|---|---|---|
| test_result | Healthy | Diseased | Total |
| Negative | 847 | 8 | 855 |
| Positive | 100 | 45 | 145 |
3.2 Calculating Performance Measures Using Tidyverse
Code
# Calculate performance measures using tidyverse approach
performance_metrics <- screening_data %>%
  summarise(
    TP = sum(test_result == "Positive" & true_disease == "Diseased"),
    FN = sum(test_result == "Negative" & true_disease == "Diseased"),
    FP = sum(test_result == "Positive" & true_disease == "Healthy"),
    TN = sum(test_result == "Negative" & true_disease == "Healthy")
  ) %>%
  mutate(
    sensitivity = TP / (TP + FN),
    specificity = TN / (TN + FP),
    ppv = TP / (TP + FP),
    npv = TN / (TN + FN),
    accuracy = (TP + TN) / (TP + TN + FP + FN),
    youden_index = sensitivity + specificity - 1,
    lr_positive = sensitivity / (1 - specificity),
    lr_negative = (1 - sensitivity) / specificity
  )
Code
# Display results in a formatted table
performance_summary <- performance_metrics %>%
  pivot_longer(
    cols = everything(),
    names_to = "Measure",
    values_to = "Value"
  ) %>%
  mutate(
    Measure = case_when(
      Measure == "TP" ~ "True Positives",
      Measure == "FN" ~ "False Negatives",
      Measure == "FP" ~ "False Positives",
      Measure == "TN" ~ "True Negatives",
      Measure == "sensitivity" ~ "Sensitivity",
      Measure == "specificity" ~ "Specificity",
      Measure == "ppv" ~ "Positive Predictive Value",
      Measure == "npv" ~ "Negative Predictive Value",
      Measure == "accuracy" ~ "Accuracy",
      Measure == "youden_index" ~ "Youden Index",
      Measure == "lr_positive" ~ "Likelihood Ratio (+)",
      Measure == "lr_negative" ~ "Likelihood Ratio (-)"
    ),
    Formula = case_when(
      str_detect(Measure, "True|False") ~ "",
      Measure == "Sensitivity" ~ "TP/(TP+FN)",
      Measure == "Specificity" ~ "TN/(TN+FP)",
      Measure == "Positive Predictive Value" ~ "TP/(TP+FP)",
      Measure == "Negative Predictive Value" ~ "TN/(TN+FN)",
      Measure == "Accuracy" ~ "(TP+TN)/Total",
      Measure == "Youden Index" ~ "Sensitivity + Specificity - 1",
      Measure == "Likelihood Ratio (+)" ~ "Sensitivity/(1-Specificity)",
      Measure == "Likelihood Ratio (-)" ~ "(1-Sensitivity)/Specificity"
    ),
    Value = case_when(
      str_detect(Measure, "True|False") ~ as.character(Value),
      TRUE ~ sprintf("%.3f", Value)
    )
  )
Code
performance_summary %>%
  gt() %>%
  tab_header(
    title = "Diagnostic Test Performance Metrics"
  ) %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_labels()
  )

| Diagnostic Test Performance Metrics | ||
|---|---|---|
| Measure | Value | Formula |
| True Positives | 45 | |
| False Negatives | 8 | |
| False Positives | 100 | |
| True Negatives | 847 | |
| Sensitivity | 0.849 | TP/(TP+FN) |
| Specificity | 0.894 | TN/(TN+FP) |
| Positive Predictive Value | 0.310 | TP/(TP+FP) |
| Negative Predictive Value | 0.991 | TN/(TN+FN) |
| Accuracy | 0.892 | (TP+TN)/Total |
| Youden Index | 0.743 | Sensitivity + Specificity - 1 |
| Likelihood Ratio (+) | 8.041 | Sensitivity/(1-Specificity) |
| Likelihood Ratio (-) | 0.169 | (1-Sensitivity)/Specificity |
3.3 Enhanced 2×2 Table with Calculations
Code
# Create enhanced table with actual values
enhanced_summary <- performance_metrics %>%
  select(TP, FN, FP, TN) %>%
  mutate(
    diseased_total = TP + FN,
    healthy_total = FP + TN,
    positive_total = TP + FP,
    negative_total = FN + TN,
    total = TP + FN + FP + TN
  )

enhanced_table <- tibble(
  ` ` = c("Test Positive", "Test Negative", "Total"),
  `Disease Present` = c(
    paste0("TP = ", enhanced_summary$TP),
    paste0("FN = ", enhanced_summary$FN),
    paste0("Total = ", enhanced_summary$diseased_total)
  ),
  `Disease Absent` = c(
    paste0("FP = ", enhanced_summary$FP),
    paste0("TN = ", enhanced_summary$TN),
    paste0("Total = ", enhanced_summary$healthy_total)
  ),
  `Total` = c(
    paste0("Total = ", enhanced_summary$positive_total),
    paste0("Total = ", enhanced_summary$negative_total),
    paste0("N = ", enhanced_summary$total)
  )
)

enhanced_table %>%
  gt() %>%
  tab_header(
    title = "Enhanced 2×2 Table with Calculated Values"
  ) %>%
  tab_spanner(
    label = "True Disease Status",
    columns = c("Disease Present", "Disease Absent")
  )

| Enhanced 2×2 Table with Calculated Values | |||
|---|---|---|---|
| | Disease Present | Disease Absent | Total |
| Test Positive | TP = 45 | FP = 100 | Total = 145 |
| Test Negative | FN = 8 | TN = 847 | Total = 855 |
| Total | Total = 53 | Total = 947 | N = 1000 |
Key Findings:
- Sensitivity: 84.9% (good at detecting disease)
- Specificity: 89.4% (good at ruling out disease)
- PPV: 31% (moderate confidence in positive results)
- NPV: 99.1% (high confidence in negative results)
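As a sanity check, the observed PPV agrees with the value Bayes' theorem predicts for the simulation's design parameters:

```r
# Theoretical PPV for sensitivity 0.85, specificity 0.90, prevalence 0.05
theoretical_ppv <- (0.85 * 0.05) / (0.85 * 0.05 + (1 - 0.90) * (1 - 0.05))
theoretical_ppv  # ~0.309, close to the simulated 0.310
```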
4 Part 3: Advanced Analysis with Logistic Regression
4.1 Creating Complex Simulation Dataset Using Tidyverse
Code
# Set seed for reproducibility
set.seed(456)
n <- 1000

# Generate complex dataset using tidyverse
cancer_data <- tibble(
  id = 1:n,
  age = pmax(pmin(rnorm(n, mean = 55, sd = 15), 90), 20)  # Age between 20-90
) %>%
  mutate(
    # Generate true cancer status (influenced by age)
    cancer_prob = plogis(-3 + 0.05 * age),
    cancer = rbinom(n, 1, cancer_prob),
    # Generate ToolA results (mammography-like)
    toolA_sensitivity = pmin(0.75 + 0.01 * pmax(age - 40, 0), 0.95),
    toolA_specificity = 0.88,
    toolA_prob = case_when(
      cancer == 1 ~ toolA_sensitivity,
      cancer == 0 ~ 1 - toolA_specificity
    ),
    toolA = rbinom(n, 1, toolA_prob),
    # Generate ToolB results (genetic test-like)
    toolB_sensitivity = 0.92,
    toolB_specificity = 0.82,
    toolB_prob = case_when(
      cancer == 1 ~ toolB_sensitivity,
      cancer == 0 ~ 1 - toolB_specificity
    ),
    toolB = rbinom(n, 1, toolB_prob)
  ) %>%
  mutate(
    toolA = factor(toolA, levels = c(0, 1), labels = c("Negative", "Positive")),
    toolB = factor(toolB, levels = c(0, 1), labels = c("Negative", "Positive")),
    cancer = factor(cancer, levels = c(0, 1), labels = c("No Cancer", "Cancer"))
  ) %>%
  select(id, age, toolA, toolB, cancer)
Code
# Display first few rows
cancer_data %>%
  slice_head(n = 10) %>%
  gt() %>%
  tab_header(
    title = "Sample of Cancer Screening Dataset"
  ) %>%
  fmt_number(columns = age, decimals = 1)

| Sample of Cancer Screening Dataset | ||||
|---|---|---|---|---|
| id | age | toolA | toolB | cancer |
| 1 | 34.8 | Negative | Negative | No Cancer |
| 2 | 64.3 | Positive | Positive | Cancer |
| 3 | 67.0 | Positive | Positive | Cancer |
| 4 | 34.2 | Negative | Negative | No Cancer |
| 5 | 44.3 | Positive | Positive | Cancer |
| 6 | 50.1 | Positive | Positive | Cancer |
| 7 | 65.4 | Positive | Positive | Cancer |
| 8 | 58.8 | Negative | Negative | No Cancer |
| 9 | 70.1 | Negative | Negative | No Cancer |
| 10 | 63.6 | Positive | Positive | Cancer |
4.2 Logistic Regression Analysis Using Broom and gtsummary
Code
# Fit logistic regression model
model <- glm(cancer ~ toolA + toolB,
             data = cancer_data,
             family = binomial())

# Display model summary using gtsummary
model %>%
  tbl_regression(exponentiate = TRUE) %>%
  add_glance_table() %>%
  modify_header(label = "**Variable**") %>%
  modify_footnote(ci = "CI = Confidence Interval") %>%
  bold_labels()

| Variable | OR | 95% CI | p-value |
|---|---|---|---|
| toolA | |||
| Negative | — | — | |
| Positive | 50.8 | 29.7, 91.6 | <0.001 |
| toolB | |||
| Negative | — | — | |
| Positive | 50.0 | 28.8, 91.8 | <0.001 |
| Null deviance | 1,377 | ||
| Null df | 999 | ||
| Log-likelihood | -209 | ||
| AIC | 425 | ||
| BIC | 439 | ||
| Deviance | 419 | ||
| Residual df | 997 | ||
| No. Obs. | 1,000 | ||
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
4.3 Model Coefficients Using Broom
Code
# Create detailed model summary using broom
model_tidy <- model %>%
  tidy(conf.int = TRUE, exponentiate = TRUE) %>%
  mutate(
    term = case_when(
      term == "(Intercept)" ~ "Intercept",
      term == "toolAPositive" ~ "Tool A Positive",
      term == "toolBPositive" ~ "Tool B Positive"
    )
  )
Code
model_tidy %>%
  gt() %>%
  tab_header(
    title = "Logistic Regression Model Coefficients"
  ) %>%
  fmt_number(columns = c(estimate, conf.low, conf.high), decimals = 3) %>%
  fmt_scientific(columns = c(std.error, statistic, p.value), decimals = 3) %>%
  cols_label(
    term = "Term",
    estimate = "Odds Ratio",
    std.error = "Std Error",
    statistic = "Z Statistic",
    p.value = "P-value",
    conf.low = "Lower CI",
    conf.high = "Upper CI"
  )

| Logistic Regression Model Coefficients | ||||||
|---|---|---|---|---|---|---|
| Term | Odds Ratio | Std Error | Z Statistic | P-value | Lower CI | Upper CI |
| Intercept | 0.011 | 3.101 × 10^−1 | −1.453 × 10^1 | 7.345 × 10^−48 | 0.006 | 0.020 |
| Tool A Positive | 50.842 | 2.860 × 10^−1 | 1.374 × 10^1 | 5.938 × 10^−43 | 29.723 | 91.639 |
| Tool B Positive | 50.012 | 2.947 × 10^−1 | 1.328 × 10^1 | 3.226 × 10^−40 | 28.766 | 91.760 |
Interpretation:
- Tool A Positive: 50.8 times higher odds of cancer (adjusted for Tool B)
- Tool B Positive: 50.0 times higher odds of cancer (adjusted for Tool A)
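Because odds ratios multiply on the odds scale, the fitted coefficients can be turned into a predicted probability for a patient positive on both tools; a sketch using the rounded estimates from the coefficient table above:

```r
# Odds when both tools are positive = intercept odds * OR_A * OR_B
odds_both <- 0.011 * 50.8 * 50.0  # ~27.9
prob_both <- odds_both / (1 + odds_both)
prob_both  # ~0.97
```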
4.4 Model Performance Evaluation Using Broom
Code
# Generate predictions using augment
model_predictions <- model %>%
  augment(data = cancer_data, type.predict = "response") %>%
  mutate(
    predicted_class = factor(
      ifelse(.fitted > 0.5, "Cancer", "No Cancer"),
      levels = c("No Cancer", "Cancer")
    )
  )
Code
# Calculate performance metrics using tidyverse
performance_summary <- model_predictions %>%
  summarise(
    TP = sum(predicted_class == "Cancer" & cancer == "Cancer"),
    FN = sum(predicted_class == "No Cancer" & cancer == "Cancer"),
    FP = sum(predicted_class == "Cancer" & cancer == "No Cancer"),
    TN = sum(predicted_class == "No Cancer" & cancer == "No Cancer")
  ) %>%
  mutate(
    accuracy = (TP + TN) / (TP + TN + FP + FN),
    sensitivity = TP / (TP + FN),
    specificity = TN / (TN + FP),
    ppv = TP / (TP + FP),
    npv = TN / (TN + FN)
  )
Code
# Display confusion matrix
confusion_matrix <- model_predictions %>%
  count(predicted_class, cancer, .drop = FALSE) %>%
  pivot_wider(names_from = cancer, values_from = n)

confusion_matrix %>%
  gt() %>%
  tab_header(
    title = "Confusion Matrix - Logistic Regression Model"
  ) %>%
  tab_spanner(
    label = "Actual",
    columns = c("No Cancer", "Cancer")
  ) %>%
  tab_stubhead(label = "Predicted")

| Confusion Matrix - Logistic Regression Model | ||
|---|---|---|
| Predicted | No Cancer | Cancer |
| No Cancer | 532 | 77 |
| Cancer | 17 | 374 |

Rows are predicted classes; columns are the actual cancer status.
Code
# Display performance metrics (drop the raw counts before pivoting so
# they are not repeated on every row)
performance_summary %>%
  select(accuracy:npv) %>%
  pivot_longer(
    cols = everything(),
    names_to = "Metric",
    values_to = "Value"
  ) %>%
  mutate(
    Metric = case_when(
      Metric == "ppv" ~ "PPV",
      Metric == "npv" ~ "NPV",
      TRUE ~ str_to_title(Metric)
    )
  ) %>%
  gt() %>%
  tab_header(
    title = "Model Performance Metrics"
  ) %>%
  fmt_number(columns = Value, decimals = 3)

| Model Performance Metrics | |
|---|---|
| Metric | Value |
| Accuracy | 0.906 |
| Sensitivity | 0.829 |
| Specificity | 0.969 |
| PPV | 0.957 |
| NPV | 0.874 |
4.5 ROC Curve Analysis Using Tidyverse and pROC
Code
# Create ROC curves for individual tools and combined model
roc_data <- cancer_data %>%
  mutate(
    toolA_numeric = as.numeric(toolA == "Positive"),
    toolB_numeric = as.numeric(toolB == "Positive"),
    cancer_numeric = as.numeric(cancer == "Cancer")
  )

# Calculate ROC curves
roc_toolA <- roc(roc_data$cancer_numeric, roc_data$toolA_numeric)
roc_toolB <- roc(roc_data$cancer_numeric, roc_data$toolB_numeric)
roc_model <- roc(roc_data$cancer_numeric, model_predictions$.fitted)

# Create ROC curve data for ggplot
create_roc_data <- function(roc_obj, name) {
  tibble(
    sensitivity = roc_obj$sensitivities,
    specificity = roc_obj$specificities,
    test = name,
    auc = as.numeric(auc(roc_obj))
  ) %>%
    mutate(
      fpr = 1 - specificity,
      test_auc = paste0(test, " (AUC = ", round(auc, 3), ")")
    )
}
Code
roc_plot_data <- bind_rows(
  create_roc_data(roc_toolA, "Tool A"),
  create_roc_data(roc_toolB, "Tool B"),
  create_roc_data(roc_model, "Combined Model")
)

# Create ROC curve plot
roc_plot_data %>%
  ggplot(aes(x = fpr, y = sensitivity, color = test_auc)) +
  geom_line(linewidth = 1.2) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "gray") +
  labs(
    title = "ROC Curves Comparison",
    subtitle = "Individual Tools vs Combined Logistic Regression Model",
    x = "False Positive Rate (1 - Specificity)",
    y = "True Positive Rate (Sensitivity)",
    color = "Test"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  scale_x_continuous(limits = c(0, 1)) +
  scale_y_continuous(limits = c(0, 1))
4.6 Comprehensive Test Comparison
Code
# Calculate individual test performance using tidyverse
calculate_test_performance <- function(test_result, true_outcome) {
  pos_test <- test_result == "Positive"
  cancer_present <- true_outcome == "Cancer"
  tibble(
    sensitivity = sum(pos_test & cancer_present) / sum(cancer_present),
    specificity = sum(!pos_test & !cancer_present) / sum(!cancer_present),
    auc = as.numeric(auc(roc(as.numeric(cancer_present), as.numeric(pos_test))))
  )
}

# Calculate performance for individual tools
toolA_performance <- calculate_test_performance(cancer_data$toolA, cancer_data$cancer)
toolB_performance <- calculate_test_performance(cancer_data$toolB, cancer_data$cancer)
Code
# Create comparison table
comparison_table <- tibble(
  Test = c("Tool A", "Tool B", "Combined Model"),
  AUC = c(
    toolA_performance$auc,
    toolB_performance$auc,
    as.numeric(auc(roc_model))
  ),
  Sensitivity = c(
    toolA_performance$sensitivity,
    toolB_performance$sensitivity,
    performance_summary$sensitivity
  ),
  Specificity = c(
    toolA_performance$specificity,
    toolB_performance$specificity,
    performance_summary$specificity
  )
)
Code
comparison_table %>%
  gt() %>%
  tab_header(
    title = "Test Performance Comparison"
  ) %>%
  fmt_number(columns = c(AUC, Sensitivity, Specificity), decimals = 3) %>%
  tab_style(
    style = cell_fill(color = "lightblue"),
    locations = cells_body(rows = AUC == max(AUC))
  )

| Test Performance Comparison | |||
|---|---|---|---|
| Test | AUC | Sensitivity | Specificity |
| Tool A | 0.891 | 0.902 | 0.880 |
| Tool B | 0.885 | 0.925 | 0.845 |
| Combined Model | 0.963 | 0.829 | 0.969 |
4.7 Summary
Individual Tests:
- Each tool has different strengths
- Tool A: Better specificity
- Tool B: Better sensitivity
Combined Approach:
- Logistic regression improves overall performance
- Higher AUC indicates better discrimination
- Balanced sensitivity and specificity
Clinical Implications:
- Combining tests can improve diagnostic accuracy
- Consider prevalence when interpreting results
- Balance between false positives and false negatives
5 Summary and Clinical Applications
5.1 Key Concepts Reviewed
- Test Performance Measures
- Sensitivity and specificity are intrinsic test properties
- Predictive values depend on disease prevalence
- Likelihood ratios provide clinical utility information
- Prevalence Matters
- Low prevalence reduces positive predictive value
- High prevalence reduces negative predictive value
- Target high-risk populations for screening
- Statistical Modeling
- Logistic regression can combine multiple tests
- ROC curves help compare test performance
- AUC provides overall discrimination measure
5.2 Clinical Decision Making
When choosing diagnostic tests, consider:
- Population characteristics and disease prevalence
- Consequences of false positives vs. false negatives
- Cost and availability of tests
- Patient preferences and values
Best practices:
- Use high-sensitivity tests for screening
- Use high-specificity tests for confirmation
- Consider combining tests when appropriate
- Always interpret results in clinical context
5.3 R Packages Used in This Analysis
Code
# Build the package summary table: the original chunk assumed a
# package_summary object, so it is constructed here from the packages
# actually used in this analysis
package_summary <- tribble(
  ~Package, ~Purpose, ~`Key Functions Used`,
  "tidyverse", "Data manipulation and visualization", "mutate(), summarise(), pivot_longer(), ggplot()",
  "broom", "Tidy model outputs", "tidy(), augment(), glance()",
  "broom.helpers", "Enhanced broom functionality", "tidy_plus_plus()",
  "gtsummary", "Summary tables for statistical models", "tbl_regression(), add_glance_table()",
  "pROC", "ROC curve analysis", "roc(), auc()",
  "gt", "Beautiful table formatting", "gt(), fmt_number()",
  "kableExtra", "Enhanced kable tables", "kable_styling()",
  "caret", "Machine learning tools", "confusionMatrix()"
)

package_summary %>%
  gt() %>%
  tab_header(
    title = "R Packages and Functions Used"
  ) %>%
  tab_style(
    style = cell_text(weight = "bold"),
    locations = cells_column_labels()
  ) %>%
  cols_width(
    Package ~ px(120),
    Purpose ~ px(200),
    `Key Functions Used` ~ px(300)
  )

| R Packages and Functions Used | ||
|---|---|---|
| Package | Purpose | Key Functions Used |
| tidyverse | Data manipulation and visualization | mutate(), summarise(), pivot_longer(), ggplot() |
| broom | Tidy model outputs | tidy(), augment(), glance() |
| broom.helpers | Enhanced broom functionality | tidy_plus_plus() |
| gtsummary | Summary tables for statistical models | tbl_regression(), add_glance_table() |
| pROC | ROC curve analysis | roc(), auc() |
| gt | Beautiful table formatting | gt(), fmt_number() |
| kableExtra | Enhanced kable tables | kable_styling() |
| caret | Machine learning tools | confusionMatrix() |
This enhanced analysis demonstrates the power of combining tidyverse principles with specialized statistical packages for comprehensive diagnostic test evaluation.