SAVVY: estimation of AE probabilities

Author

Regina Stegherr and Kaspar Rufibach

Background

The assessment of safety is an important aspect of the evaluation of new therapies in clinical trials, with analyses of adverse events being an essential part of this. Standard methods for the analysis of adverse events such as the incidence proportion, i.e. the number of patients with a specific adverse event out of all patients in the treatment groups, do not account for both varying follow-up times and competing risks. Alternative approaches such as the Aalen-Johansen estimator of the cumulative incidence function have been suggested. Theoretical arguments and numerical evaluations support the application of these more advanced methodology, but as yet there is to our knowledge only insufficient empirical evidence whether these methods would lead to different conclusions in safety evaluations. The Survival analysis for AdVerse events with VarYing follow-up times (SAVVY) project strives to close this gap in evidence by conducting a meta-analytical study to assess the impact of the methodology on the conclusion of the safety assessment empirically. Three papers are currently under review summarizing the rationale and results of the project:

  • Regina Stegherr et al. (2021): Statistical analysis plan, presenting the rationale and statistical concept of the empirical study conducted as part of the SAVVY project. The statistical methods are presented in unified notation and examples of their implementation in R and SAS are provided. arxiv | publication
  • R. Stegherr, Schmoor, Beyersmann, et al. (2021): 1-sample case, compares estimators of AE risk in one treatment arm. arxiv | publication
  • Rufibach et al. (2022): 2-sample case, compares relative risk estimators comparing two treatment arms based on (1) AE probabilities and (2) hazards. arxiv | publication

Purpose of this document

This R markdown file provides easy accessible code to compute all the estimators for AE probabilities that are compared to each other in these papers. The github repository where this document is hosted is available here.

SAS code

The original SAS macros that were used by each sponsor to generate the summary data for the meta-analysis is available as supplement to Regina Stegherr et al. (2021), see here, or direct download link for zip-file.

Setup

Packages

# --------------------------------------------------------------
# packages
# --------------------------------------------------------------
packs <- c("data.table", "etm", "survival", "mvna", "knitr")    
for (i in 1:length(packs)){library(packs[i], character.only = TRUE)}

Functions

Below all functions are defined.

  • data_generation_constant_cens: This function generates the dataset denoted by \(S1\) in Table~4 of R. Stegherr, Schmoor, Lübbert, et al. (2021), i.e. we assume constant hazards for the AE hazard, the hazard for the competing event of death, and the hazard for the “soft” competing events. Censoring is uniform.
  • incidenceProportion: Compute incidence proportion.
  • probTransIncidenceDensity: Compute probability transform incidence density ignoring competing events.
  • probTransIncidenceDensity: Compute probability transform incidence density ignoring competing events.
  • oneMinusKaplanMeier: Compute 1 - Kaplan-Meier estimator.
  • AJE: Compute Aalen-Johansen estimator.

The argument CE codifies the type of competing event: CE = 2 refers to a competing event of death only whereas CE = 3 refers to all competing events.

# --------------------------------------------------------------
# functions
# --------------------------------------------------------------

# function to generate dataset with constant hazards for AE, death, and soft competing events
data_generation_constant_cens <- function(N, min.cens, max.cens, haz.AE, haz.death,
                                          haz.soft, seed = 57 * i + 5){
  
  # status, 1 for AE, 2 for death 3 for soft competing event
  set.seed(seed)
  haz.all <- haz.AE + haz.death + haz.soft

  my.data <- data.table(time_to_event = rep(0, N), type_of_event = rep(0, N))
  my.data$time_to_event<- rexp(n = N, rate = haz.all) # event time
  my.data$type_of_event <- rbinom(n = N, size = 2, 
                                  prob = c(haz.AE / haz.all, haz.death / haz.all, haz.soft / haz.all)) + 1
                                  # status, 1 for AE, 2 for death 3 for soft competing event
  my.data$cens <- runif(n = N, min = min.cens, max = max.cens)
  my.data$type_of_event <- as.numeric(my.data$time_to_event <= my.data$cens) * my.data$type_of_event
  my.data$time_to_event <- pmin(my.data$time_to_event, my.data$cens)
  my.data$id <- 1:N

  # reorder columns
  my.data <- my.data[, c("id", "time_to_event", "type_of_event", "cens")]
  return(my.data)
}

# compute incidence proportion
incidenceProportion <- function(data, tau){

  ae <- nrow(data[type_of_event == 1 & time_to_event <= tau]) / nrow(data)
  ae_prob_var <- ae * (1 - ae) / nrow(data)

  res <- c("ae_prob" = ae, "ae_prob_var" = ae_prob_var)
  return(res)
}

# compute probability transform incidence density
probTransIncidenceDensity <- function(data, tau){

  time <- data$time_to_event
  incidence.dens <- nrow(data[type_of_event == 1 & time_to_event <= tau]) / 
    sum(ifelse(time <= tau, time, tau))
  ae <- 1 - exp(-incidence.dens * tau)

  var_A_var <- nrow(data[type_of_event == 1 & time_to_event <= tau]) / 
    sum(ifelse(time <= tau, time, tau)) ^ 2
  ae_var <- exp(-incidence.dens * tau) ^ 2 * var_A_var * tau ^ 2

  res <- c("ae_prob" = ae, "ae_prob_var" = ae_var)
  return(res)
}

# compute probability transform incidence density accounting for CE
probTransIncidprobTransIncidenceDensityCompEvents <- function(data, CE, tau){
  
  data2 <- data[, type_of_event2 := ifelse(CE == 2 & data$type_of_event == 3, 0, 
                                           ifelse(CE == 3 & data$type_of_event == 3, 2, type_of_event))]
  time2 <- ifelse(data2$time_to_event <= tau, data2$time_to_event, tau)
  s2 <- sum(time2)
  
  id <- sum(dim(data2[type_of_event2 == 1 & time_to_event <= tau])[1]) / s2
  id_ce <- sum(dim(data2[type_of_event2 == 2 & time_to_event <= tau])[1]) / s2
  tmp <- id + id_ce
  ett <- exp(-tau * tmp)

  ae_prob <- id / tmp * (1 - exp(- tau*tmp))
  ae_prob_var <- (((ett *      (id_ce * (1 / ett - 1) + tau * id * tmp)) / tmp ^ 2) ^ 2 * id / s2 + 
                  ((ett * id * (tau * tmp - 1 / ett + 1)) / tmp ^ 2) ^ 2 * id_ce / s2)

  res <- c("ae_prob" = ae_prob, "ae_prob_var" = ae_prob_var)
  return(res)
}

# compute 1 - Kaplan-Meier
oneMinusKaplanMeier <- function(data, tau){

  if(nrow(data[type_of_event == 1]) == 0){
    ae_prob <- 0
    ae_prob_var <- 0
  }

  if(nrow(data[type_of_event == 1]) > 0){
    help <- data.frame(id = data$id)
    help$from <- 0
    help$to <- ifelse(data$type_of_event != 1, "cens", data$type_of_event)
    help$time <-ifelse(data$time_to_event == 0, 0.001, data$time_to_event)

    tra <- matrix(FALSE, 2, 2)
    tra[1, 2] <- TRUE
    state.names <-as.character(0:1)
    etmmm <-etm(help, state.names, tra, "cens", s = 0)

    ae_prob <- summary(etmmm)[[2]][sum(summary(etmmm)[[2]]$time <= tau),]$P
    ae_prob_var <- summary(etmmm)[[2]][sum(summary(etmmm)[[2]]$time <= tau),]$var
  }

  res <- c("ae_prob" = ae_prob, "ae_prob_var" = ae_prob_var)
  return(res)
}

# compute Aalen-Johansen estimator
AJE <- function(data, CE, tau){

  data[, type_of_event2 := ifelse(CE == 2 & data$type_of_event == 3, 0, 
                                  ifelse(CE == 3 & data$type_of_event == 3, 2, type_of_event))]
  time <- data$time_to_event
  type2 <- data$type_of_event2
  
  # conditions
  c1 <- nrow(data[type_of_event2 == 1])
  c2 <- nrow(data[type_of_event2 == 2])
  
  if(c1 == 0){
   ae_prob <- 0
   ae_prob_var <- 0
  }

  if(c2 == 0){
    ce_prob <- 0
    ce_prob_var <- 0
  }

  # define auxiliary objects
  help <- data.frame(id = data$id)
  help$from <- 0
  help$time <-ifelse(time == 0, 0.001, time)
  tra <- matrix(FALSE, 2, 2)
  tra[1, 2] <- TRUE
  state.names <- as.character(0:1)
  
  if(c1 == 0 & c2 != 0){
    help$to <- ifelse(type2 != 2, "cens", type2 - 1)
    etmmm <- etm(help, state.names, tra, "cens", s = 0)
    setmm <- summary(etmmm)[[2]]
    ce_prob <- setmm[sum(setmm$time <= tau),]$P
    ce_prob_var <- setmm[sum(setmm$time <= tau),]$var
  }

  if(c1 != 0 & c2 == 0){
    help$to <- ifelse(type2 != 1, "cens", type2)
    etmmm <- etm(help, state.names, tra, "cens", s = 0)
    setmm <- summary(etmmm)[[2]]
   
    ae_prob <- setmm[sum(setmm$time <= tau),]$P
    ae_prob_var <- setmm[sum(setmm$time <= tau),]$var
  }

  if(c1 != 0 & c2 != 0){
    help$to <- ifelse(!(type2 %in% c(1, 2)),"cens", type2)

    tra <- matrix(FALSE, 3, 3)
    tra[1, 2:3] <- TRUE
    state.names <- as.character(0:2)
    etmmm <- etm(help, state.names, tra, "cens", s = 0)
    setmm <- summary(etmmm)
   
    ae_prob <- setmm[[2]][sum(setmm[[2]]$time <= tau),]$P
    ae_prob_var <- setmm[[2]][sum(setmm[[2]]$time <= tau),]$var

    ce_prob <- setmm[[3]][sum(setmm[[3]]$time <= tau),]$P
    ce_prob_var <- setmm[[3]][sum(setmm[[3]]$time <= tau),]$var
  }

  res_ae <- c("ae_prob" = ae_prob, "ae_prob_var" = ae_prob_var)
  res_ce <- c("ce_prob" = ce_prob, "ce_prob_var" = ce_prob_var)
  
  res <- rbind(res_ae, res_ce)
  return(res)
}

Estimation of AE probabilities

We generate the dataset \(S1\) in Regina Stegherr et al. (2021) using the parameter values for Arm A:

# sample size
N <- 200

# support of uniform censoring distribution
min.cens <- 0
max.cens <- 1000

# hazards for the three event types
haz.AE <- 0.00265
haz.death <- 0.00151
haz.soft <- 0.00227

# generate dataset
dat1 <- data_generation_constant_cens(N, min.cens, max.cens, haz.AE, haz.death, haz.soft, seed = 2020)

# compute tau
tau <- max(dat1[, "time_to_event"])

The structure of the dataset looks as follows:

kable(head(dat1, 10), align = c("crcr"))
id time_to_event type_of_event cens
1 45.692951 1 424.518663
2 197.519473 1 266.013197
3 36.859040 2 650.432466
4 115.062905 2 164.620580
5 220.764432 1 994.178471
6 8.499869 0 8.499869
7 833.042670 0 833.042670
8 37.389099 1 987.244922
9 82.243862 2 556.460553
10 75.843159 1 813.761866

For this dataset we then compute all the estimators that enter the comparison in our papers:

# compute each estimator
IP <- incidenceProportion(dat1, tau)
ID <- probTransIncidenceDensity(dat1, tau)
KM <- oneMinusKaplanMeier(dat1, tau)

# competing event "death only"
IDCE2 <- probTransIncidprobTransIncidenceDensityCompEvents(dat1, CE = 2, tau)
AJ2 <- AJE(dat1, CE = 2, tau)

# account for all competing events
IDCE3 <- probTransIncidprobTransIncidenceDensityCompEvents(dat1, CE = 3, tau)
AJ3 <- AJE(dat1, CE = 3, tau)

# display
tab <- rbind(IP, ID, KM, IDCE2, AJ2[1, ], IDCE3, AJ3[1, ])
colnames(tab) <- c("estimated AE probability", "variance of estimation")
rownames(tab) <- c("incidence proportion", "probability transform incidence density ignoring competing event",
                   "1 - Kaplan-Meier", "probability transform incidence density (death only)", 
                   "Aalen-Johansen (death only), AE risk", "probability transform incidence density (all CEs)",
                   "Aalen-Johansen (all CEs), AE risk")

# probability of AE
kable(tab, digits = c(3, 5))
estimated AE probability variance of estimation
incidence proportion 0.400 0.00120
probability transform incidence density ignoring competing event 0.891 0.00073
1 - Kaplan-Meier 0.832 0.00303
probability transform incidence density (death only) 0.509 0.00157
Aalen-Johansen (death only), AE risk 0.509 0.00164
probability transform incidence density (all CEs) 0.472 0.00146
Aalen-Johansen (all CEs), AE risk 0.471 0.00149

Finally, the estimated probabilities of competing events based on the Aalen-Johansen estimators:

# display
tab <- rbind(AJ2[2, ], AJ3[2, ])
colnames(tab) <- c("estimated probability", "variance of estimation")
rownames(tab) <- c("Aalen-Johansen (death only), CE risk",
                   "Aalen-Johansen (all CEs), CE risk")

# probability of AE
kable(tab, digits = c(3, 5))
estimated probability variance of estimation
Aalen-Johansen (death only), CE risk 0.466 0.00165
Aalen-Johansen (all CEs), CE risk 0.507 0.00151

References

Rufibach, Kaspar, Regina Stegherr, Claudia Schmoor, Valentine Jehl, Arthur Allignol, Annette Boeckenhoff, Cornelia Dunger-Baldauf, et al. 2022. Survival analysis for AdVerse events with VarYing follow-up times (SAVVY) – comparison of adverse event risks in randomized controlled trials.” Statistics in Biopharmaceutical Research, Accepted.
Stegherr, Regina, Jan Beyersmann, Valentine Jehl, Kaspar Rufibach, Friedhelm Leverkus, Claudia Schmoor, and Tim Friede. 2021. “Survival Analysis for AdVerse Events with VarYing Follow-up Times (SAVVY): Rationale and Statistical Concept of a Meta-Analytic Study.” Biometrical Journal 63 (March): 650–70.
Stegherr, R., C. Schmoor, J. Beyersmann, K. Rufibach, V. Jehl, A. Brückner, L. Eisele, et al. 2021. Survival analysis for AdVerse events with VarYing follow-up times (SAVVY)-estimation of adverse event risks.” Trials 22 (1): 420.
Stegherr, R., C. Schmoor, M. Lübbert, T. Friede, and J. Beyersmann. 2021. Estimating and comparing adverse event probabilities in the presence of varying follow-up times and competing events.” Pharmaceutical Statistics 20 (6): 1125–46.