mmbcv: Bias-corrected sandwich variance for clustered multistate Cox models

Overview

The mmbcv package computes cluster-robust and bias-corrected sandwich variance estimators for multi-state Cox models fit in counting-process form. Wang, Turner, and Li (Wang, Turner, and Li 2023) developed a class of bias corrections under clustered marginal Cox model. This package extend their approaches to multi-state settings.

The main user-facing function is MMBCV(), which takes

a fitted multistate Cox model object (fit),
the original counting-process dataset (data),
the start and stop time variables,
the cluster identifier,
the subject identifier,
the event/state identifier,

and returns a collection of robust and bias-corrected variance estimators.

For background on fitting multistate Cox models using survival::coxph() with list-formulas, see the survival vignette Multi-state models and competing risks:

vignette("compete", package = "survival")
https://CRAN.R-project.org/package=survival

Statistical background

Suppose the data arise from a clustered multi-state survival process, and let \(\beta\) denote the full regression coefficient vector. Let \(U_i(\hat\beta)\) denote the cluster-level score contribution for cluster \(i\), and let \(V_m\) denote the model-based variance estimator. Following Wang et. al. (Wang, Turner, and Li 2023), the baseline robust sandwich variance estimator is

\[ \widehat{\mathrm{Var}}_{\mathrm{robust}}(\hat\beta) = V_m \left( \sum_{i=1}^n U_i(\hat\beta) U_i(\hat\beta)^\top \right) V_m. \]

In mmbcv, this estimator is returned as robust. When the number of clusters is small, the usual robust sandwich variance may underestimate the true variance, which motivates finite-sample bias corrections.

Following Wang et. al. (Wang, Turner, and Li 2023), a convenient way to write many of these estimators is

\[ \widehat{\mathrm{Var}}(\hat\beta) = V_m \left( \sum_{i=1}^n C_i \, U_i^\star(\hat\beta)\, U_i^\star(\hat\beta)^\top C_i^\top \right) V_m, \]

where \(U_i^\star(\hat\beta)\) is either the original cluster score \(U_i(\hat\beta)\) or a corrected score based on martingale residual (MR), and \(C_i\) is a cluster-specific correction matrix.

MR correction

The martingale-residual correction replaces the original cluster score \(U_i(\hat\beta)\) with a bias-corrected version, denoted here by \(U_i^{\mathrm{MR}}(\hat\beta)\). The resulting variance estimator is

\[ \widehat{\mathrm{Var}}_{\mathrm{MR}}(\hat\beta) = V_m \left( \sum_{i=1}^n U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top \right) V_m. \]

This is returned by mmbcv as varMR.

KC, FG, and MD corrections

Wang et al. (Wang, Turner, and Li 2023) summarize three multiplicative small-sample corrections using the same general sandwich form with different choices of \(C_i\).

For the Kauermann-Carroll (KC) correction (Kauermann and Carroll 2001; Wang, Turner, and Li 2023),

\[ C^{KC}_i = \left(I_p - \Omega_i V_m \right)^{-1/2}, \]

and the corresponding estimator is

\[ \widehat{\mathrm{Var}}_{\mathrm{KC}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{KC}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{KC \top}_i \right) V_m. \]

This is returned as varKC.

For the Fay-Graubard (FG) correction (Fay and Graubard 2001; Wang, Turner, and Li 2023),

\[ C^{FG}_i = \mathrm{diag} \left\{ \left(1 - \min\{\xi, [\Omega_i V_m]_{jj}\}\right)^{-1/2} \right\}, \]

where \(\xi=0.75\) is a truncation constant. The resulting estimator is

\[ \widehat{\mathrm{Var}}_{\mathrm{FG}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{FG}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{FG \top}_i \right) V_m. \]

This is returned as varFG.

For the Mancl-DeRouen (MD) correction (Mancl and DeRouen 2001; Wang, Turner, and Li 2023),

\[ C^{MD}_i = \left(I_p - \Omega_i V_m \right)^{-1}, \]

and

\[ \widehat{\mathrm{Var}}_{\mathrm{MD}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{MD}_i U_i(\hat\beta) U_i(\hat\beta)^\top C^{MD \top}_i \right) V_m. \]

This is returned as varMD.

MBN correction

The Morel-Bokossa-Neerchal (MBN) correction (Morel, Bokossa, and Neerchal 2003; Wang, Turner, and Li 2023) is written as a linear combination of the robust sandwich variance and the model-based variance:

\[ \widehat{\mathrm{Var}}_{\mathrm{MBN}}(\hat\beta) = c_1 \widehat{\mathrm{Var}}_{\mathrm{robust}}(\hat\beta) + c_2 \hat{\phi} V_m, \]

where \(c_1\), \(c_2\), and \(\hat{\phi}\) are defined as in Wang et al. (Wang, Turner, and Li 2023). This is returned as varMBN.

Hybrid MR estimators

Wang et al. (Wang, Turner, and Li 2023) further propose hybrid estimators that combine the MR correction with KC, FG, MD, and MBN. These are obtained by replacing \(U_i(\hat\beta)\) with \(U_i^{\mathrm{MR}}(\hat\beta)\) in the KC, FG, and MD formulas, and by replacing the robust variance with the MR variance in the MBN formula.

The resulting estimators are:

\[ \widehat{\mathrm{Var}}_{\mathrm{KCMR}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{KC}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{KC \top} \right) V_m, \]

\[ \widehat{\mathrm{Var}}_{\mathrm{FGMR}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{FG}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{FG \top} \right) V_m, \]

\[ \widehat{\mathrm{Var}}_{\mathrm{MDMR}}(\hat\beta) = V_m \left( \sum_{i=1}^n C^{MD}_i U_i^{\mathrm{MR}}(\hat\beta) U_i^{\mathrm{MR}}(\hat\beta)^\top C_i^{MD \top} \right) V_m, \]

and

\[ \widehat{\mathrm{Var}}_{\mathrm{MBNMR}}(\hat\beta) = c_1 \widehat{\mathrm{Var}}_{\mathrm{MR}}(\hat\beta) + c_2 \hat{\phi}_{\mathrm{MR}} V_m. \]

These are returned as varKCMR, varFGMR, varMDMR, and varMBNMR, respectively.

In practice, all outputs returned by MMBCV() are variance-covariance matrices for \(\hat\beta\). Standard errors from any estimator \(V\) can be obtained via

\[ \mathrm{SE}(\hat\beta_j) = \sqrt{V_{jj}}. \]

Example data

This package ships with a small simulated clustered multistate dataset msdat3.

library(mmbcv)

data("msdat3")
dim(msdat3)
#> [1] 3004    9
length(unique(msdat3$id))
#> [1] 1500
length(unique(msdat3$clus_id))
#> [1] 30
table(msdat3$event)
#> 
#> censor     S1     S2     S3      D 
#>   1050    837    450    217    450

Variables

msdat3 is in long (counting-process) format with one row per subject-interval:

id: subject id
clus_id: cluster id
Z: cluster-level treatment assignment (0/1)
X: individual-level baseline covariate
from, to: start/end states for the interval
Tstart, Tstop: interval start/stop times
event: end-state at Tstop (e.g., censor, S1, S2, S3, D)

The state labels are:

(s0): initial state
S1: first intermediate state
S2: second intermediate state
S3: third intermediate state
D: absorbing state
censor: right-censoring at the end of the interval

A quick look at the observed transitions:

with(msdat3, table(from, to))
#>       to
#> from     D  S1  S2  S3 censor
#>   (s0) 234 837   0   0    429
#>   S1   109   0 450   0    278
#>   S2    55   0   0 217    178
#>   S3    52   0   0   0    165

Fit a multistate Cox model (survival)

The MMBCV() function uses a fitted model object as input. In many workflows, this model is fit with survival::coxph() using list-formulas.

library(survival)

fit <- coxph(
  list(
    Surv(Tstart, Tstop, event) ~ 1,
    state("(s0)"):state("S1") + state("S1"):state("S2") + state("S2"):state("S3") ~
      Z + X,
    state("(s0)"):state("D") + state("S1"):state("D") + state("S2"):state("D") +
      state("S3"):state("D") ~ (Z + X)/common
  ),
  data = msdat3,
  id = id,
  ties = "breslow",
  timefix = FALSE
)

fit
#> Call:
#> coxph(formula = list(Surv(Tstart, Tstop, event) ~ 1, state("(s0)"):state("S1") + 
#>     state("S1"):state("S2") + state("S2"):state("S3") ~ Z + X, 
#>     state("(s0)"):state("D") + state("S1"):state("D") + state("S2"):state("D") + 
#>         state("S3"):state("D") ~ (Z + X)/common), data = msdat3, 
#>     ties = "breslow", id = id, timefix = FALSE)
#> 
#>    
#> 1:2     coef exp(coef) se(coef) robust se      z     p
#>   Z -0.03640   0.96425  0.06935   0.06919 -0.526 0.599
#>   X  0.02849   1.02890  0.03384   0.03371  0.845 0.398
#> 
#>    
#> 2:3     coef exp(coef) se(coef) robust se      z      p
#>   Z -0.14916   0.86144  0.09516   0.09486 -1.572 0.1159
#>   X  0.09521   1.09990  0.04666   0.04616  2.063 0.0391
#> 
#>    
#> 3:4     coef exp(coef) se(coef) robust se     z     p
#>   Z 0.138588  1.148651 0.136808  0.137060 1.011 0.312
#>   X 0.002256  1.002258 0.068286  0.068826 0.033 0.974
#> 
#>                   
#> 1:5, 2:5, 3:5, 4:5      coef exp(coef)  se(coef) robust se      z     p
#>                  Z -0.009474  0.990571  0.094743  0.094405 -0.100 0.920
#>                  X -0.030723  0.969744  0.046400  0.045745 -0.672 0.502
#> 
#>  States:  1= (s0), 2= S1, 3= S2, 4= S3, 5= D 
#> 
#> Likelihood ratio test=9.24  on 8 df, p=0.3226
#> n= 3004, unique id= 1500, number of events= 1954

If the survival package is not installed, install it separately in an interactive R session using install.packages("survival").

Compute variance estimators with `MMBCV`

Once you have fit the multi-state Cox model with the original dataset, compute the variance estimators with MMBCV():

out <- MMBCV(
  fit, msdat3,
  StartTime = Tstart,
  StopTime  = Tstop,
  ClusterID = clus_id,
  SubjectID = id,
  Event     = event,
  tie       = "breslow",
  details   = FALSE
)

names(out)
#>  [1] "robust"   "varMR"    "varMD"    "varMDMR"  "varFG"    "varFGMR" 
#>  [7] "varKC"    "varKCMR"  "varMBN"   "varMBNMR"

Interpreting the output

MMBCV() returns a list of variance–covariance matrices for the fitted regression coefficients. Each matrix can be used to compute standard errors, Wald tests, and confidence intervals via sqrt(diag(V)). The output includes:

robust: standard cluster-robust sandwich variance estimator
varMR: martingale-residual (MR) bias-corrected variance estimator
varMD: Mancl-DeRouen (MD) bias-corrected variance estimator
varMDMR: hybrid MD + MR bias-corrected variance estimator
varKC: Kauermann-Carroll (KC) bias-corrected variance estimator
varKCMR: hybrid KC + MR bias-corrected variance estimator
varFG: Fay-Graubard (FG) bias-corrected variance estimator
varFGMR: hybrid FG + MR bias-corrected variance estimator
varMBN: Morel-Bokossa-Neerchal (MBN) bias-corrected variance estimator
varMBNMR: hybrid MBN + MR bias-corrected variance estimator

Standard errors can be computed from any returned variance-covariance matrix with sqrt(diag(...)).

Vlist <- out[c("robust","varMR","varMD","varMDMR","varFG","varFGMR","varKC","varKCMR","varMBN","varMBNMR")]
SE <- sapply(Vlist, function(V) sqrt(diag(V)))
SE
#>          robust      varMR      varMD    varMDMR      varFG    varFGMR
#> [1,] 0.12681339 0.13612657 0.13173546 0.14142991 0.12917113 0.13865946
#> [2,] 0.03800617 0.03971887 0.03975596 0.04156009 0.03881715 0.04056647
#> [3,] 0.11249698 0.12105297 0.11763603 0.12667276 0.11483803 0.12360734
#> [4,] 0.04200442 0.04400631 0.04386978 0.04599197 0.04280556 0.04485222
#> [5,] 0.19063933 0.20739029 0.20607051 0.22431679 0.19772144 0.21516664
#> [6,] 0.07310988 0.07704443 0.07684647 0.08099901 0.07482618 0.07885202
#> [7,] 0.08010269 0.08547342 0.08446929 0.09016392 0.08219768 0.08772159
#> [8,] 0.04188072 0.04363451 0.04350871 0.04534072 0.04266635 0.04445542
#>           varKC    varKCMR     varMBN   varMBNMR
#> [1,] 0.12919935 0.13869646 0.13896496 0.14904754
#> [2,] 0.03886875 0.04062644 0.04603928 0.04840860
#> [3,] 0.11497344 0.12376072 0.13432627 0.14420807
#> [4,] 0.04292311 0.04498382 0.05485895 0.05788685
#> [5,] 0.19806014 0.21553152 0.21881488 0.23707776
#> [6,] 0.07493991 0.07898086 0.08985204 0.09505413
#> [7,] 0.08223319 0.08776186 0.10731307 0.11449882
#> [8,] 0.04268456 0.04447690 0.05464086 0.05746402

Efron ties

If you fit multi-state model in coxph() using Efron ties, you need to use Efron ties in the variance calculation as well:

out_efron <- MMBCV(
  fit, msdat3,
  StartTime = Tstart,
  StopTime  = Tstop,
  ClusterID = clus_id,
  SubjectID = id,
  Event     = event,
  tie       = "efron",
  details   = FALSE
)

sqrt(diag(out_efron$robust))
#> [1] 0.12681339 0.03800617 0.11249698 0.04200442 0.19063933 0.07310988 0.08010269
#> [8] 0.04188072

Notes

MMBCV() assumes the fit object contains at least cmap, rmap, states, and coefficients (as in multistate fits produced by survival workflows).
The data passed to MMBCV() must match the data used to fit the model in coxph() (the same counting-process variables, ids, and state labels).

Practical remarks

The robust estimator is the natural baseline. The bias-corrected estimators are most useful when the number of clusters is not large.

Wang et al. (2023) report that, in their simulations for clustered time-to-event outcomes, the MD estimator performed well when cluster-size variation was modest, while KCMR performed best under larger cluster-size variation. They also note that MR alone could be unstable in some scenarios. For that reason, it is useful to compare robust, varMD, and varKCMR in practice.

References

Fay, M. P., and B. I. Graubard. 2001. “Small-Sample Adjustments for Wald-Type Tests Using Sandwich Estimators.” Biometrics 57 (4): 1198–1206.

Kauermann, G., and R. J. Carroll. 2001. “A Note on the Efficiency of Sandwich Covariance Matrix Estimation.” Journal of the American Statistical Association 96 (456): 1387–96.

Mancl, L. A., and T. A. DeRouen. 2001. “A Covariance Estimator for GEE with Improved Small-Sample Properties.” Biometrics 57 (1): 126–34.

Morel, J. G., M. Bokossa, and N. K. Neerchal. 2003. “Small Sample Correction for the Variance of GEE Estimators.” Biometrical Journal 45 (4): 395–409.

Wang, Xueqi, Elizabeth L. Turner, and Fan Li. 2023. “Improving Sandwich Variance Estimation for Marginal Cox Analysis of Cluster Randomized Trials.” Biometrical Journal 65: 2200113. https://doi.org/10.1002/bimj.202200113.