---
title: "mmbcv: Bias-corrected sandwich variance for clustered multistate Cox models"
output:
  rmarkdown::html_vignette:
    css: vignette.css
bibliography: references.bib
link-citations: true
vignette: >
  %\VignetteIndexEntry{mmbcv: Bias-corrected sandwich variance for clustered multistate Cox models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE
)
has_survival <- requireNamespace("survival", quietly = TRUE)
```

## Overview

The **mmbcv** package computes cluster-robust and bias-corrected sandwich variance estimators for multi-state Cox models fit in counting-process form. Wang, Turner, and Li [@wang2023] developed a class of bias corrections under the clustered marginal Cox model; this package extends their approach to multi-state settings.

The main user-facing function is **`MMBCV()`**, which takes

- a fitted multistate Cox model object (`fit`),
- the original counting-process dataset (`data`),
- the start and stop time variables,
- the cluster identifier,
- the subject identifier,
- the event/state identifier,

and returns a collection of robust and bias-corrected variance estimators.

For background on fitting multistate Cox models using `survival::coxph()` with list-formulas, see the survival vignette *Multi-state models and competing risks*:

- `vignette("compete", package = "survival")`
- https://CRAN.R-project.org/package=survival

## Statistical background

Suppose the data arise from a clustered multi-state survival process, and let \(\beta\) denote the full regression coefficient vector. Let \(U_i(\hat\beta)\) denote the cluster-level score contribution for cluster \(i\), and let \(V_m\) denote the model-based variance estimator. Following Wang et al.
[@wang2023], the baseline robust sandwich variance estimator is

\[
\widehat{\mathrm{Var}}_{\mathrm{robust}}(\hat\beta)
= V_m \left( \sum_{i=1}^n U_i(\hat\beta) U_i(\hat\beta)^\top \right) V_m.
\]

In **mmbcv**, this estimator is returned as `robust`.

When the number of clusters is small, the usual robust sandwich variance may underestimate the true variance, which motivates finite-sample bias corrections. Following Wang et al. [@wang2023], a convenient way to write many of these estimators is

\[
\widehat{\mathrm{Var}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C_i \, U_i^\star(\hat\beta)\, U_i^\star(\hat\beta)^\top C_i^\top \right) V_m,
\]

where \(U_i^\star(\hat\beta)\) is either the original cluster score \(U_i(\hat\beta)\) or a corrected score based on martingale residuals (MR), and \(C_i\) is a cluster-specific correction matrix.

### MR correction

The martingale-residual correction replaces the original cluster score \(U_i(\hat\beta)\) with a bias-corrected version, denoted here by \(U_i^{\mathrm{MR}}(\hat\beta)\). The resulting variance estimator is

\[
\widehat{\mathrm{Var}}_{\mathrm{MR}}(\hat\beta)
= V_m \left( \sum_{i=1}^n U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top \right) V_m.
\]

This is returned by **mmbcv** as `varMR`.

### KC, FG, and MD corrections

Wang et al. [@wang2023] summarize three multiplicative small-sample corrections using the same general sandwich form with different choices of \(C_i\).

For the **Kauermann-Carroll (KC)** correction [@kauermann2001; @wang2023],

\[
C^{KC}_i = \left(I_p - \Omega_i V_m \right)^{-1/2},
\]

where \(I_p\) is the \(p \times p\) identity matrix and \(\Omega_i\) is the contribution of cluster \(i\) to the information matrix, and the corresponding estimator is

\[
\widehat{\mathrm{Var}}_{\mathrm{KC}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{KC}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{KC \top}_i \right) V_m.
\]

This is returned as `varKC`.
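To make the sandwich form concrete, here is a small base-R sketch that assembles \(\widehat{\mathrm{Var}}_{\mathrm{robust}}\) from a matrix of cluster scores and a model-based variance. The inputs `U` and `Vm` below are made-up stand-ins for illustration, not objects produced by **mmbcv**:

```{r}
## Made-up inputs: 10 clusters, p = 2 coefficients (illustrative only)
set.seed(1)
U  <- matrix(rnorm(20), nrow = 10, ncol = 2)   # row i = cluster score U_i
Vm <- matrix(c(0.04, 0.01, 0.01, 0.05), 2, 2)  # model-based variance V_m

meat     <- crossprod(U)        # sum over clusters of U_i U_i^T
V_robust <- Vm %*% meat %*% Vm  # bread-meat-bread sandwich
sqrt(diag(V_robust))            # robust standard errors
```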
For the **Fay-Graubard (FG)** correction [@fay2001; @wang2023],

\[
C^{FG}_i = \mathrm{diag} \left\{ \left(1 - \min\{\xi, [\Omega_i V_m]_{jj}\}\right)^{-1/2} \right\},
\]

where \(\xi = 0.75\) is a truncation constant. The resulting estimator is

\[
\widehat{\mathrm{Var}}_{\mathrm{FG}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{FG}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{FG \top}_i \right) V_m.
\]

This is returned as `varFG`.

For the **Mancl-DeRouen (MD)** correction [@mancl2001; @wang2023],

\[
C^{MD}_i = \left(I_p - \Omega_i V_m \right)^{-1},
\]

and

\[
\widehat{\mathrm{Var}}_{\mathrm{MD}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{MD}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{MD \top}_i \right) V_m.
\]

This is returned as `varMD`.

### MBN correction

The **Morel-Bokossa-Neerchal (MBN)** correction [@morel2003; @wang2023] is a linear combination of the robust sandwich variance and the model-based variance:

\[
\widehat{\mathrm{Var}}_{\mathrm{MBN}}(\hat\beta)
= c_1 \widehat{\mathrm{Var}}_{\mathrm{robust}}(\hat\beta) + c_2 \hat{\phi} V_m,
\]

where \(c_1\), \(c_2\), and \(\hat{\phi}\) are defined as in Wang et al. [@wang2023]. This is returned as `varMBN`.

### Hybrid MR estimators

Wang et al. [@wang2023] further propose hybrid estimators that combine the MR correction with KC, FG, MD, and MBN. These are obtained by replacing \(U_i(\hat\beta)\) with \(U_i^{\mathrm{MR}}(\hat\beta)\) in the KC, FG, and MD formulas, and by replacing the robust variance with the MR variance in the MBN formula.
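As an illustration of how a multiplicative correction reshapes the sandwich, the MD correction can be sketched in base R. Here the per-cluster information contributions \(\Omega_i\) are made up (equal shares of the total information \(V_m^{-1}\)); a real analysis uses the cluster-specific quantities computed internally by `MMBCV()`:

```{r}
## Made-up cluster scores and model-based variance (illustrative only)
set.seed(2)
n <- 10; p <- 2
U  <- matrix(rnorm(n * p), n, p)
Vm <- matrix(c(0.04, 0.01, 0.01, 0.05), p, p)
Omega <- solve(Vm) / n                   # hypothetical Omega_i, equal across clusters

Ci <- solve(diag(p) - Omega %*% Vm)      # MD: C_i = (I_p - Omega_i V_m)^{-1}
meat_MD <- matrix(0, p, p)
for (i in seq_len(n)) {
  meat_MD <- meat_MD + Ci %*% tcrossprod(U[i, ]) %*% t(Ci)
}
V_MD     <- Vm %*% meat_MD %*% Vm
V_robust <- Vm %*% crossprod(U) %*% Vm

## With equal Omega_i, MD inflates every entry by the factor (n/(n-1))^2
diag(V_MD) / diag(V_robust)
```

The KC correction follows the same pattern, with `Ci` replaced by a symmetric inverse square root of \(I_p - \Omega_i V_m\).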
The resulting estimators are

\[
\widehat{\mathrm{Var}}_{\mathrm{KCMR}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{KC}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{KC \top} \right) V_m,
\]

\[
\widehat{\mathrm{Var}}_{\mathrm{FGMR}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{FG}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{FG \top} \right) V_m,
\]

\[
\widehat{\mathrm{Var}}_{\mathrm{MDMR}}(\hat\beta)
= V_m \left( \sum_{i=1}^n C^{MD}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{MD \top} \right) V_m,
\]

and

\[
\widehat{\mathrm{Var}}_{\mathrm{MBNMR}}(\hat\beta)
= c_1 \widehat{\mathrm{Var}}_{\mathrm{MR}}(\hat\beta) + c_2 \hat{\phi}_{\mathrm{MR}} V_m.
\]

These are returned as `varKCMR`, `varFGMR`, `varMDMR`, and `varMBNMR`, respectively.

All outputs returned by `MMBCV()` are variance-covariance matrices for \(\hat\beta\). Standard errors from any estimator \(V\) are obtained via

\[
\mathrm{SE}(\hat\beta_j) = \sqrt{V_{jj}}.
\]

## Example data

This package ships with a small simulated clustered multistate dataset, `msdat3`.
```{r}
library(mmbcv)
data("msdat3")

dim(msdat3)
length(unique(msdat3$id))
length(unique(msdat3$clus_id))
table(msdat3$event)
```

## Variables

`msdat3` is in long (counting-process) format with one row per subject-interval:

- **`id`**: subject id
- **`clus_id`**: cluster id
- **`Z`**: cluster-level treatment assignment (0/1)
- **`X`**: individual-level baseline covariate
- **`from`, `to`**: start/end states for the interval
- **`Tstart`, `Tstop`**: interval start/stop times
- **`event`**: end state at `Tstop` (e.g., `censor`, `S1`, `S2`, `S3`, `D`)

The state labels are:

- **`(s0)`**: initial state
- **`S1`**: first intermediate state
- **`S2`**: second intermediate state
- **`S3`**: third intermediate state
- **`D`**: absorbing state
- **`censor`**: right-censoring at the end of the interval

A quick look at the observed transitions:

```{r}
with(msdat3, table(from, to))
```

## Fit a multistate Cox model (survival)

**`MMBCV()`** takes a fitted model object as input. In many workflows, this model is fit with `survival::coxph()` using list-formulas.

```{r, eval = has_survival}
library(survival)

fit <- coxph(
  list(
    Surv(Tstart, Tstop, event) ~ 1,
    state("(s0)"):state("S1") + state("S1"):state("S2") +
      state("S2"):state("S3") ~ Z + X,
    state("(s0)"):state("D") + state("S1"):state("D") +
      state("S2"):state("D") + state("S3"):state("D") ~ (Z + X)/common
  ),
  data = msdat3,
  id = id,
  ties = "breslow",
  timefix = FALSE
)
fit
```

If the **`survival`** package is not installed, install it in an interactive R session with `install.packages("survival")`.
## Compute variance estimators with **`MMBCV`**

Once the multi-state Cox model has been fit to the original dataset, compute the variance estimators with **`MMBCV()`**:

```{r, eval = has_survival}
out <- MMBCV(
  fit, msdat3,
  StartTime = Tstart,
  StopTime  = Tstop,
  ClusterID = clus_id,
  SubjectID = id,
  Event     = event,
  tie       = "breslow",
  details   = FALSE
)
names(out)
```

## Interpreting the output

`MMBCV()` returns a list of variance-covariance matrices for the fitted regression coefficients. Each matrix can be used to compute standard errors, Wald tests, and confidence intervals. The output includes:

- `robust`: standard cluster-robust sandwich variance estimator
- `varMR`: martingale-residual (MR) bias-corrected variance estimator
- `varMD`: Mancl-DeRouen (MD) bias-corrected variance estimator
- `varMDMR`: hybrid MD + MR bias-corrected variance estimator
- `varKC`: Kauermann-Carroll (KC) bias-corrected variance estimator
- `varKCMR`: hybrid KC + MR bias-corrected variance estimator
- `varFG`: Fay-Graubard (FG) bias-corrected variance estimator
- `varFGMR`: hybrid FG + MR bias-corrected variance estimator
- `varMBN`: Morel-Bokossa-Neerchal (MBN) bias-corrected variance estimator
- `varMBNMR`: hybrid MBN + MR bias-corrected variance estimator

Standard errors can be computed from any returned variance-covariance matrix with `sqrt(diag(...))`.
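To show how any returned matrix feeds into Wald inference, the sketch below uses made-up coefficients and a made-up variance matrix as stand-ins for `coef(fit)` and one of the matrices returned by `MMBCV()` (e.g., `out$varKCMR`):

```{r}
## Made-up stand-ins, illustrative only
beta <- c(Z = 0.42, X = -0.15)
V    <- matrix(c(0.040, 0.005, 0.005, 0.030), 2, 2,
               dimnames = list(names(beta), names(beta)))

se <- sqrt(diag(V))                       # standard errors
z  <- beta / se                           # Wald z statistics
ci <- cbind(lower = beta - qnorm(0.975) * se,
            upper = beta + qnorm(0.975) * se)
cbind(estimate = beta, se = se, ci, p = 2 * pnorm(-abs(z)))
```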
```{r, eval = has_survival}
Vlist <- out[c("robust", "varMR", "varMD", "varMDMR", "varFG",
               "varFGMR", "varKC", "varKCMR", "varMBN", "varMBNMR")]
SE <- sapply(Vlist, function(V) sqrt(diag(V)))
SE
```

## Efron ties

If the multi-state model was fit in `coxph()` with Efron tie handling, use Efron ties in the variance calculation as well:

```{r, eval = has_survival}
out_efron <- MMBCV(
  fit, msdat3,
  StartTime = Tstart,
  StopTime  = Tstop,
  ClusterID = clus_id,
  SubjectID = id,
  Event     = event,
  tie       = "efron",
  details   = FALSE
)
sqrt(diag(out_efron$robust))
```

## Notes

- **`MMBCV()`** assumes the fit object contains at least `cmap`, `rmap`, `states`, and `coefficients` (as in multistate fits produced by survival workflows).
- The data passed to **`MMBCV()`** must match the data used to fit the model in `coxph()` (the same counting-process variables, ids, and state labels).

## Practical remarks

The robust estimator is the natural baseline; the bias-corrected estimators are most useful when the number of clusters is not large. Wang et al. [@wang2023] report that, in their simulations for clustered time-to-event outcomes, the MD estimator performed well when cluster-size variation was modest, while KCMR performed best under larger cluster-size variation. They also note that MR alone could be unstable in some scenarios. It is therefore useful to compare `robust`, `varMD`, and `varKCMR` in practice.

## References