---
title: "Doubly Robust MAIC for HTA: A Complete Worked Example in Advanced NSCLC"
author: "drMAIC Package Authors"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Doubly Robust MAIC for HTA: A Complete Worked Example}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
#bibliography: references.bib
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  warning = FALSE,
  message = FALSE
)
```

# Introduction

## Background

In health technology appraisal (HTA), regulators and payers frequently require indirect treatment comparisons (ITC) when no head-to-head randomised controlled trial (RCT) exists between the treatments of interest. A common scenario, particularly in oncology, is the **unanchored indirect comparison**: two single-arm trials, each testing a different treatment, with no common comparator arm.

Matching-Adjusted Indirect Comparison (MAIC) [@signorovitch2010] addresses this by reweighting the individual patient data (IPD) from one trial to match the aggregate baseline characteristics of the comparator. However, standard MAIC depends entirely on the weighting model being correctly specified.

The **Doubly Robust MAIC (DR-MAIC)** implemented in this package combines:

1. **Inverse probability weighting** (standard MAIC)
2. **Outcome regression** (Simulated Treatment Comparison, STC / g-computation)

...into a single estimator that is consistent if **either** component is correctly specified [@remiroazocar2022; @lunceford2004; @tan2010].

## Package scope

The `drMAIC` package is aligned with:

- **NICE DSU Technical Support Document 18** [@phillippo2016]
- **Cochrane Handbook Chapter 23** (Dias et al.)
- **ISPOR Task Force** guidance on indirect comparisons
- **Remiro-Azócar et al.
(2022)** DR estimation framework

---

# Statistical Background

## Standard MAIC

Given IPD from study A with covariates $X_i$ and outcomes $Y_i$, and target aggregate statistics $\bar{X}_B$ from study B, MAIC weights are:

$$w_i = \exp(X_i^\top \hat\lambda)$$

where $\hat\lambda$ solves:

$$\sum_{i=1}^n w_i \left(X_i - \bar{X}_B\right) = 0 \quad (\text{moment-matching conditions})$$

so that the weighted covariate means in study A equal the study B means, $\sum_i w_i X_i / \sum_i w_i = \bar{X}_B$.

The **Effective Sample Size (ESS)** quantifies information loss from reweighting:

$$ESS = \frac{(\sum_i w_i)^2}{\sum_i w_i^2}$$

Low ESS (< 30% of $n$) indicates limited population overlap and is a key validity concern per NICE TSD 18.

## Doubly Robust Estimator

The DR-MAIC estimator:

$$\hat\theta_{DR} = \underbrace{\sum_i \omega_i \hat m(X_i)}_{\text{STC (g-computation)}} + \underbrace{\sum_i \omega_i \left(Y_i - \hat m(X_i)\right)}_{\text{IPW bias correction}}$$

where $\omega_i = w_i / \sum_i w_i$ are normalised weights and $\hat m(X_i)$ is the predicted outcome from an outcome regression model.

**Double robustness:** The estimator is consistent if either:

- The weights correctly balance $X$ between populations (even if $\hat m$ is wrong), **or**
- The outcome model $\hat m$ is correctly specified (even if the weights are imperfect)

---

# Worked Example: Advanced NSCLC

## Data

```{r load-data}
library(drMAIC)
data(nsclc_ipd)
data(nsclc_agd)

# IPD from Study A (index trial: immunotherapy)
cat("=== Study A: IPD Summary ===\n")
cat(sprintf("n = %d patients\n", nrow(nsclc_ipd)))
cat(sprintf("Response rate: %.1f%%\n", 100 * mean(nsclc_ipd$response)))
cat(sprintf("Mean age: %.1f years\n", mean(nsclc_ipd$age)))
cat(sprintf("%% ECOG 1/2: %.1f%%\n", 100 * mean(nsclc_ipd$ecog)))
cat(sprintf("%% Ever-smoker: %.1f%%\n", 100 * mean(nsclc_ipd$smoker)))

# AgD from Study B (comparator trial)
cat("\n=== Study B: AgD ===\n")
cat(sprintf("n = %d patients\n", nsclc_agd$n_agd))
cat(sprintf("Response rate: %.1f%%\n", 100 * nsclc_agd$response_rate))
cat(sprintf("Mean age: %.1f years\n", nsclc_agd$mean_age))
cat(sprintf("%% ECOG 1/2: %.1f%%\n", 100 * nsclc_agd$prop_ecog1))
cat(sprintf("%% Ever-smoker: %.1f%%\n", 100 * nsclc_agd$prop_smoker))
```

Notice that Study B has an older, sicker population. This **population imbalance** is exactly what MAIC corrects for.

## Step 1: Compute MAIC Weights

```{r compute-weights}
# Define target moments from Study B
target_moments <- c(
  age    = nsclc_agd$mean_age,
  ecog   = nsclc_agd$prop_ecog1,
  smoker = nsclc_agd$prop_smoker
)

# Compute entropy-balancing weights
w <- compute_weights(
  ipd            = nsclc_ipd,
  target_moments = target_moments,
  match_vars     = c("age", "ecog", "smoker"),
  verbose        = TRUE
)
```

## Step 2: Covariate Balance Diagnostics

```{r diagnostics, fig.cap="Love plot: covariate balance before and after MAIC weighting"}
diag <- maic_diagnostics(w, plot_type = "all")
```

```{r love-plot, fig.cap="Love plot: covariate balance"}
diag$love_plot
```

```{r weight-plot, fig.cap="Weight distribution"}
diag$weight_plot
```

The Love plot shows that all covariates achieve |SMD| < 0.10 after weighting (the NICE TSD 18 recommended threshold), confirming successful covariate balance.

## Step 3: Check Assumptions

```{r check-assumptions}
check_assumptions(w, ess_threshold = 30, smd_threshold = 0.10)
```

## Step 4: DR-MAIC Estimation

```{r dr-maic}
result <- dr_maic(
  maic_weights        = w,
  outcome_var         = "response",
  outcome_type        = "binary",
  comparator_estimate = nsclc_agd$response_rate,
  comparator_se       = nsclc_agd$response_se,
  effect_measure      = "OR"
)
print(result)
```

### Interpreting the three estimators

| Estimator | Description | Robust to |
|-----------|-------------|-----------|
| **MAIC (IPW)** | Re-weighted outcome mean | Outcome model misspecification |
| **STC (g-comp)** | Outcome model prediction | Weight misspecification |
| **DR-MAIC** | Augmented combination | Misspecification of **either** component |

The DR augmentation term (the difference between DR and STC) quantifies the residual imbalance not captured by the outcome model; ideally it is close to zero.
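
To make the weighting in Step 1 and the re-weighted outcome mean in the MAIC (IPW) row of the table more concrete, the following base-R sketch solves the moment-matching conditions directly. It is only an illustration under stated assumptions: the data are simulated, the target values are hypothetical, and the weights are found by minimising the Signorovitch et al. objective $\sum_i \exp\{(X_i - \bar{X}_B)^\top \lambda\}$ with `optim()`. It does not reproduce the internals of `compute_weights()` or `dr_maic()`, which this vignette does not show.

```r
set.seed(2024)
n <- 300

# Simulated IPD (hypothetical; NOT the package's nsclc_ipd data)
ipd <- data.frame(
  age    = rnorm(n, mean = 62, sd = 8),
  ecog   = rbinom(n, 1, 0.40),
  smoker = rbinom(n, 1, 0.70)
)
ipd$response <- rbinom(n, 1, plogis(0.5 - 0.03 * (ipd$age - 62) - 0.6 * ipd$ecog))

# Hypothetical aggregate targets for the comparator population
target <- c(age = 65, ecog = 0.55, smoker = 0.75)

X <- as.matrix(ipd[, names(target)])

# Centre covariates on the targets; dividing by the IPD SDs only improves
# numerical conditioning and does not change the resulting weights
Xc <- sweep(sweep(X, 2, target, "-"), 2, apply(X, 2, sd), "/")

# Convex objective whose gradient is the moment-matching condition
objective <- function(lambda) sum(exp(Xc %*% lambda))
gradient  <- function(lambda) as.vector(crossprod(Xc, exp(Xc %*% lambda)))

fit <- optim(
  par = rep(0, ncol(Xc)), fn = objective, gr = gradient,
  method = "BFGS", control = list(reltol = 1e-12, maxit = 500)
)

w     <- as.vector(exp(Xc %*% fit$par))  # unnormalised MAIC weights
omega <- w / sum(w)                      # normalised weights

weighted_means <- colSums(omega * X)     # should reproduce the targets
ess            <- sum(w)^2 / sum(w^2)    # effective sample size
ipw_response   <- sum(omega * ipd$response)

print(round(rbind(target = target, weighted = weighted_means), 3))
cat(sprintf("ESS = %.1f (%.0f%% of n)\n", ess, 100 * ess / n))
cat(sprintf("Unweighted response: %.3f; re-weighted response: %.3f\n",
            mean(ipd$response), ipw_response))
```

Because the objective is convex, any stationary point satisfies the balance conditions, so the weighted covariate means match the targets up to optimiser tolerance. The ESS printed here is the same quantity that Step 3 compares against its threshold.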

## Step 5: Bootstrap Confidence Intervals

```{r bootstrap, eval=FALSE}
# Run 1000 bootstrap replicates (BCa method recommended by NICE TSD 18)
boot_res <- bootstrap_ci(
  dr_maic_result = result,
  R              = 1000,
  ci_type        = "bca",
  seed           = 2024
)
print(boot_res)
boot_res$boot_plot
```

```{r bootstrap-demo, echo=FALSE}
# Demonstration with fewer replicates for vignette build speed
boot_res <- bootstrap_ci(
  dr_maic_result = result,
  R              = 200,
  ci_type        = "perc",
  seed           = 2024,
  verbose        = FALSE
)
print(boot_res)
```

## Step 6: Sensitivity Analysis

```{r sensitivity}
sa <- sensitivity_analysis(
  dr_maic_result   = result,
  trim_percentiles = c(0.90, 0.95, 0.99),
  lovo             = TRUE
)
```

```{r trim-plot, fig.cap="Weight trimming sensitivity"}
if (!is.null(sa$trim_plot)) sa$trim_plot
```

```{r lovo-plot, fig.cap="Leave-one-variable-out sensitivity"}
if (!is.null(sa$lovo_plot)) sa$lovo_plot
```

**E-value interpretation:** An unmeasured confounder would need at least a `r round(sa$evalue, 2)`-fold association (on the risk-ratio scale) with both treatment and outcome to fully explain away the observed treatment effect. Values > 2 are generally taken to indicate a robust finding, although this is a rule of thumb rather than a formal threshold.
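
For orientation, the E-value of a point estimate can be computed by hand from the VanderWeele and Ding formula, $E = RR + \sqrt{RR \times (RR - 1)}$, applied to the risk ratio (inverted first if it is below 1). The sketch below is an assumption-laden illustration: the helper names `evalue_rr()` and `evalue_or_common()` are hypothetical, and the vignette does not state whether `sensitivity_analysis()` uses this exact formula or the square-root approximation for odds ratios with common outcomes.

```r
# E-value for a point estimate on the risk-ratio scale
# (hypothetical helper; not part of drMAIC)
evalue_rr <- function(rr) {
  stopifnot(is.numeric(rr), all(rr > 0))
  rr <- ifelse(rr < 1, 1 / rr, rr)  # protective effects: use the inverse
  rr + sqrt(rr * (rr - 1))
}

# For an odds ratio with a common outcome, a frequently used approximation
# replaces the risk ratio with sqrt(OR) before applying the formula
evalue_or_common <- function(or) evalue_rr(sqrt(or))

evalue_rr(2)           # 2 + sqrt(2), about 3.41
evalue_rr(0.5)         # same value, by symmetry
evalue_or_common(2.5)  # E-value for a hypothetical OR of 2.5
```

A null estimate (risk ratio of 1) gives an E-value of 1, meaning no unmeasured confounding is needed to explain it.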

## Step 7: NICE Report

```{r nice-report}
nice_report(
  dr_maic_result     = result,
  bootstrap_result   = boot_res,
  sensitivity_result = sa,
  study_a_name       = "KEYNOTE-024 (simulated)",
  study_b_name       = "IMpower150 (simulated)",
  indication         = "Advanced / Metastatic NSCLC",
  treatment_a        = "Pembrolizumab (simulated)",
  treatment_b        = "Atezo + Bev + Chemo (simulated)"
)
```

---

# Advanced Usage

## Adding second-moment matching (mean + SD)

For continuous variables, you can match on both mean and standard deviation:

```{r second-moment, eval=FALSE}
w2 <- compute_weights(
  ipd            = nsclc_ipd,
  target_moments = c(
    age    = nsclc_agd$mean_age,
    age_sd = nsclc_agd$sd_age,
    ecog   = nsclc_agd$prop_ecog1,
    smoker = nsclc_agd$prop_smoker
  ),
  match_vars      = c("age", "ecog", "smoker"),
  match_var_types = c(age = "mean_sd", ecog = "proportion", smoker = "proportion")
)
```

## Additional prognostic covariates in outcome model

Including additional prognostic variables in the outcome model can improve the efficiency of the DR estimator, even when they are not included in the matching:

```{r additional-covariates, eval=FALSE}
result2 <- dr_maic(
  maic_weights          = w,
  outcome_var           = "response",
  outcome_type          = "binary",
  comparator_estimate   = nsclc_agd$response_rate,
  comparator_se         = nsclc_agd$response_se,
  additional_covariates = c("pdl1_high", "prior_lines"),  # efficiency gain
  effect_measure        = "OR"
)
```

## Time-to-event outcomes

```{r tte, eval=FALSE}
result_os <- dr_maic(
  maic_weights        = w,
  outcome_var         = "os_event",
  outcome_type        = "tte",
  time_var            = "os_time",
  comparator_estimate = log(0.78),  # log-HR from comparator
  comparator_se       = 0.12,
  effect_measure      = "HR"
)
```

---

# Reporting Checklist

Per NICE DSU TSD 18 and ISPOR guidance, a complete DR-MAIC submission should include:

- [ ] Justification for choice of matching variables (clinical rationale)
- [ ] ESS and % of original n
- [ ] Love plot of SMDs before and after weighting
- [ ] Primary DR-MAIC estimate with bootstrap 95% CI (BCa)
- [ ] Comparison of MAIC, STC, and DR-MAIC estimates
- [ ] DR augmentation term (evidence of model concordance)
- [ ] E-value for unmeasured confounding
- [ ] Weight trimming sensitivity analysis
- [ ] Leave-one-variable-out sensitivity analysis
- [ ] Clear statement of assumptions and limitations

---

# References