The “Cancer Relative Survival Model” represents transitions between three states: no evidence of disease (NED), cancer metastasis (Mets), and cancer death.



[Figure: CRS state diagram (Alarid-Escudero et al. 2018), showing the NED, Mets, and Death states with the transitions p.Mets and p.DieMets]



We need to calibrate the two transition probabilities in the figure above, p.Mets and p.DieMets, because they are not directly observable (at least in this example).

To calibrate p.Mets and p.DieMets, we will fit the model to observed survival data. To do this, we’ll sample the two parameters randomly, and then measure how well the model-predicted survival matches the observed survival.

We’ll use the following packages:

# Latin hypercube sampling (used for random search calibration)
library(lhs)

# visualization
library(plotrix)
library(psych)

1 Targets

The hypothetical calibration target is the proportion of people that survive over time.

load("CRSTargets.RData")

Let’s see what the survival looks like.

plotrix::plotCI(x = CRS.targets$Surv$Time, y = CRS.targets$Surv$value, 
                ui = CRS.targets$Surv$ub,
                li = CRS.targets$Surv$lb,
                ylim = c(0, 1), 
                xlab = "Time", ylab = "Pr Survive")

Note that the estimate of survival has some uncertainty around it, indicated by the error bounds.

If you had another target (stored in CRS.targets$Target2), you could plot it like this:

plotrix::plotCI(x = CRS.targets$Target2$Time, y = CRS.targets$Target2$value,
                ui = CRS.targets$Target2$ub,
                li = CRS.targets$Target2$lb,
                ylim = c(0, 1),
                xlab = "Time", ylab = "Target 2")

2 Model

The inputs to the function are parameters to be estimated through calibration. The model outputs correspond to the target data.

Sourcing the file below creates the function markov_crs(). The function has one argument, v.params, a vector of parameters; in this case, v.params is a named vector with elements p.Mets and p.DieMets.

source("markov_crs.R")

Let’s check that the model runs. We arbitrarily set some parameter values in v.params.test, then run the model.

v.params.test <- c(p.Mets = 0.1, p.DieMets = 0.2)
test_results <- markov_crs(v.params.test)

Within the returned model object (a list), the predicted survival is in a vector called Surv:

str(test_results)
## List of 1
##  $ Surv: Named num [1:59] 0.98 0.946 0.903 0.853 0.801 ...
##   ..- attr(*, "names")= chr [1:59] "2" "3" "4" "5" ...
head(test_results$Surv)
##         2         3         4         5         6         7 
## 0.9800000 0.9460000 0.9026000 0.8533000 0.8007380 0.7468786
plot(test_results$Surv)

It works!

But let’s compare the outputs to the targets.

plotrix::plotCI(x = CRS.targets$Surv$Time, y = CRS.targets$Surv$value, 
                ui = CRS.targets$Surv$ub,
                li = CRS.targets$Surv$lb,
                ylim = c(0, 1), 
                xlab = "Time", ylab = "Pr Survive")
points(test_results$Surv,
       col = "green", pch = 20)
legend("topright", legend = c("Targets", "Outputs"),
       col = c("black", "green"),
       pch = c(1, 20))

That doesn’t look right! We need to calibrate the unobservable parameters so that the results match the observed data.

3 Random Search Calibration

For random search with Latin hypercube sampling, we need to specify:

  1. The number of samples
  2. The range of each parameter

First, when sampling random numbers in R, setting a seed allows you to obtain the same sample each time.

set.seed(1010)

Let’s start with 1000 samples. We’re calibrating 2 parameters, p.Mets and p.DieMets.

param.names <- c("p.Mets","p.DieMets")
n.param <- length(param.names)
rs.n.samp <- 1000

The boundaries on the search space are as follows:

# lower bound
lb <- c(p.Mets = 0.04, p.DieMets = 0.04) 

# upper bound
ub <- c(p.Mets = 0.16, p.DieMets = 0.16)

Now, we sample. Right away, we name the columns of m.lhs.unit so we can keep track of which column is which.

# Sample unit Latin Hypercube
m.lhs.unit <- lhs::randomLHS(rs.n.samp, n.param)
colnames(m.lhs.unit) <- param.names
head(m.lhs.unit)
##          p.Mets  p.DieMets
## [1,] 0.92829349 0.24442996
## [2,] 0.64837718 0.43424406
## [3,] 0.50388911 0.98957474
## [4,] 0.43746806 0.41460896
## [5,] 0.02729788 0.05204551
## [6,] 0.17821315 0.30577191

Note that this does not yet incorporate the ranges we specified.

plot(x = m.lhs.unit[, param.names[1]],
     y = m.lhs.unit[, param.names[2]],
     xlab = param.names[1],
     ylab = param.names[2])

We have to rescale to the ranges we specified. (Note: we can start with a unit LHS sample and then transform it to any distribution using that distribution’s quantile function; in this case, we rescale it to a uniform distribution between the bounds above.)

# Rescale to min/max of each parameter
rs.param.samp <- matrix(nrow=rs.n.samp,ncol=n.param)
colnames(rs.param.samp) <- param.names
for (i in 1:n.param){
  rs.param.samp[,i] <- qunif(m.lhs.unit[,i],
                           min = lb[i],
                           max = ub[i])
}

Now, the parameters are within the range we specified.

# view resulting parameter set samples
psych::pairs.panels(rs.param.samp)
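
As noted above, the same quantile-function trick works for distributions other than the uniform. Here is a purely illustrative sketch (the Beta parameters are arbitrary and not part of this example) of mapping a unit LHS column to a Beta distribution instead:

# illustrative only: transform the first unit-LHS column to a Beta(2, 8) distribution
beta.samp <- qbeta(m.lhs.unit[, "p.Mets"], shape1 = 2, shape2 = 8)
hist(beta.samp, main = "LHS sample transformed to Beta(2, 8)", xlab = "value")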

3.1 Run the Calibration

We have 1000 parameter sets, so we’ll have 1000 entries in the goodness-of-fit vector. We’re currently using a single target, so we’ll have one column (i.e., a vector).

# initialize goodness-of-fit vector
rs.GOF <- rep(0, rs.n.samp)

Now, we run the model for each set of input values, and determine how well the model results fit the target data.

In this case, the goodness-of-fit function is the likelihood of observing the target data, given the model results. We assume a normal likelihood for this example. It is usually more numerically stable to work on the log scale (log = TRUE in dnorm()) and sum the individual log-likelihoods, which is what we do here.

gof_norm_loglike <- function(target_mean, target_sd, model_output){
  sum(dnorm(x = target_mean,
            mean = model_output,
            sd = target_sd,
            log = TRUE))
}
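
As a quick sanity check, we can evaluate the goodness of fit for the arbitrary test parameters we ran earlier (this assumes test_results is still in the workspace):

gof_norm_loglike(target_mean = CRS.targets$Surv$value,
                 target_sd = CRS.targets$Surv$se,
                 model_output = test_results$Surv)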

Now we loop through the sampled sets of input values:

for (j in 1:rs.n.samp){
  
  ###  Run model for a given parameter set  ###
  model.res = markov_crs(v.params = rs.param.samp[j, ])
  
  ###  Calculate goodness-of-fit of model outputs to targets  ###
  # log likelihood of the model output given the targets
  rs.GOF[j] = gof_norm_loglike(model_output = model.res$Surv,
                               target_mean = CRS.targets$Surv$value,
                               target_sd = CRS.targets$Surv$se)
}

3.2 Best-fitting parameters

# Arrange parameter sets in order of fit
rs.calib.res <- cbind(rs.param.samp, rs.GOF)
rs.calib.res <- rs.calib.res[order(-rs.calib.res[,"rs.GOF"]),]

# Examine the top 10 best-fitting sets
rs.calib.res[1:10,]
##           p.Mets  p.DieMets   rs.GOF
##  [1,] 0.04782207 0.12645087 156.0153
##  [2,] 0.12570106 0.04809467 155.9828
##  [3,] 0.13269799 0.04690869 155.7880
##  [4,] 0.12781268 0.04708311 155.7230
##  [5,] 0.04891700 0.12277125 155.7190
##  [6,] 0.04760128 0.13057134 155.6673
##  [7,] 0.04657485 0.13111585 155.5720
##  [8,] 0.05052389 0.11034383 155.1886
##  [9,] 0.12734270 0.04851216 155.1570
## [10,] 0.05011317 0.11813266 155.0989
# Plot the top 100 (top 10%)
plot(x = rs.calib.res[1:100,1], y = rs.calib.res[1:100,2],
     xlim=c(lb[1],ub[1]),ylim=c(lb[2],ub[2]),
     xlab = colnames(rs.calib.res)[1],ylab = colnames(rs.calib.res)[2])

How does the model look with the best-fitting set?

rs_best_fit_params <- c(rs.calib.res[1, c("p.Mets",  "p.DieMets")])
rs_best_fit_model <- markov_crs(rs_best_fit_params)

plotrix::plotCI(x = CRS.targets$Surv$Time, y = CRS.targets$Surv$value, 
                ui = CRS.targets$Surv$ub,
                li = CRS.targets$Surv$lb,
                ylim = c(0, 1), 
                xlab = "Time", ylab = "Pr Survive")
points(x = names(rs_best_fit_model$Surv), y = rs_best_fit_model$Surv,
       col = "green",
       pch = 20)
legend("topright", legend = c("Targets", "Random Search"),
       col = c("black", "green"),
       pch = c(1, 20))

3.3 Other likelihoods

We used a normal likelihood as a goodness-of-fit measure above, but there are many other options. For example, we could use a weighted sum of squared errors.

gof_wsse <- function(model_output, target_mean, target_se){
    w = 1/(target_se^2)
    gof <-  (-1) * sum(w*(model_output - target_mean)^2) 
    return(gof)
}
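
To use this in place of the normal log-likelihood, you would swap the call inside the calibration loop, for example:

rs.GOF[j] <- gof_wsse(model_output = model.res$Surv,
                      target_mean = CRS.targets$Surv$value,
                      target_se = CRS.targets$Surv$se)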

3.4 Multiple Targets

There is nothing restricting us to a single target, either. If we do use multiple targets, we can combine the multiple goodness-of-fit measures into a single GOF, possibly using weights to indicate the importance of each target.

If we had 2 targets, we would define a GOF matrix with two columns:

# initialize goodness-of-fit matrix
rs.GOF <- matrix(0, nrow = rs.n.samp, ncol = 2)

for (j in 1:rs.n.samp){
  
  ###  Run model for a given parameter set  ###
  model.res = markov_crs(v.params = rs.param.samp[j, ])
  
  ###  Target 1  ###
  rs.GOF[j, 1] = gof_norm_loglike(model_output = model.res$Output1,
                              target_mean = CRS.targets$Target1$value,
                              target_sd = CRS.targets$Target1$se)
  
  ### Target 2 ###
  rs.GOF[j, 2] = gof_norm_loglike(model_output = model.res$Output2,
                              target_mean = CRS.targets$Target2$value,
                              target_sd = CRS.targets$Target2$se)
}

Then, we would combine the GOFs into a single overall measure by multiplying by v.weights, a vector of weights (here, equal weights).

# can give different targets different weights
n.target <- 2
v.weights <- matrix(1, nrow = n.target, ncol = 1)

# matrix multiplication to calculate the weighted sum of each GOF matrix row
rs.GOF.overall <- c(rs.GOF %*% v.weights)

# Store in GOF matrix with column name "Overall_fit"
rs.GOF <- cbind(rs.GOF, Overall_fit=rs.GOF.overall)
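
From there, the workflow is the same as in the single-target case: bind the overall fit to the parameter samples and sort by it (a sketch, continuing the hypothetical two-target example):

rs.calib.res <- cbind(rs.param.samp, rs.GOF)
rs.calib.res <- rs.calib.res[order(-rs.calib.res[, "Overall_fit"]), ]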

4 Nelder-Mead Calibration

Main differences between Random Search and Nelder-Mead:

  1. Nelder-Mead is directed, while random search is undirected.
  2. In Nelder-Mead, we start from several different initial points to improve the chance of finding the global optimum, or at least of mapping out the local optima.

Let’s do a Nelder-Mead calibration!

4.1 Sample starting values

n.init <- 100
nm.params.init <- matrix(nrow=n.init,ncol=n.param)
set.seed(101)
for (i in 1:n.param){
  nm.params.init[,i] <- runif(n.init,min=lb[i],max=ub[i])
}
colnames(nm.params.init) <- param.names

head(nm.params.init)
##          p.Mets  p.DieMets
## [1,] 0.08466381 0.05500232
## [2,] 0.04525898 0.04279920
## [3,] 0.12516208 0.08702335
## [4,] 0.11892285 0.14315183
## [5,] 0.06998269 0.12620014
## [6,] 0.07600658 0.08072740

4.2 Objective Function

This is the function that optim() optimizes. As such, it must return a single summary measure of goodness of fit - that is, one number. If we were to use multiple targets, their combination would have to be calculated inside this function (see the sketch after the code below).

nm_objective = function(v.params){
  ###   Run model for parameter set "v.params" ###
  model.res = markov_crs(v.params)

  # log likelihood  
  v.GOF = gof_norm_loglike(target_mean = CRS.targets$Surv$value,
                              model_output = model.res$Surv,
                              target_sd = CRS.targets$Surv$se)
  return(v.GOF)
}
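
As noted above, with multiple targets the combination into a single number happens inside the objective function. Here is a sketch using the hypothetical Output1/Output2 and Target1/Target2 from Section 3.4, with equal weights:

nm_objective_multi <- function(v.params){
  ###   Run model for parameter set "v.params" ###
  model.res <- markov_crs(v.params)

  # log likelihood for each (hypothetical) target
  gof1 <- gof_norm_loglike(target_mean = CRS.targets$Target1$value,
                           target_sd = CRS.targets$Target1$se,
                           model_output = model.res$Output1)
  gof2 <- gof_norm_loglike(target_mean = CRS.targets$Target2$value,
                           target_sd = CRS.targets$Target2$se,
                           model_output = model.res$Output2)

  # weighted sum; weights of 1 give each target equal importance
  return(1 * gof1 + 1 * gof2)
}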

4.3 Run the Calibration!

For each starting point, we maximize the goodness of fit, then save the resulting parameter values:

nm.calib.res <- matrix(nrow = n.init, ncol = n.param+1)
colnames(nm.calib.res) <- c(param.names, "Overall_fit")
for (j in 1:n.init){
  
  fit.nm <- optim(nm.params.init[j,], nm_objective, 
                 control = list(fnscale = -1, # fnscale = -1 switches from minimization to maximization
                                maxit = 1000), 
                 hessian = T)
  
  nm.calib.res[j,] <- c(fit.nm$par,fit.nm$value)
  
}
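
It is also worth checking that each run converged. optim() returns a convergence code in fit.nm$convergence (0 indicates successful convergence); a minimal sketch, assuming you add a vector for it before the loop, is:

nm.convergence <- rep(NA, n.init)           # create before the loop
# inside the loop, after calling optim():
#   nm.convergence[j] <- fit.nm$convergence  # 0 = converged
# after the loop:
#   table(nm.convergence)                    # all entries should be 0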

4.4 Results

# Arrange parameter sets in order of fit
nm.calib.res <- nm.calib.res[order(-nm.calib.res[,"Overall_fit"]),]

# Examine the top 10 best-fitting sets
nm.calib.res[1:10,]
##           p.Mets  p.DieMets Overall_fit
##  [1,] 0.12439610 0.04810758    156.0328
##  [2,] 0.04810757 0.12439632    156.0328
##  [3,] 0.12439800 0.04810736    156.0328
##  [4,] 0.12439921 0.04810714    156.0328
##  [5,] 0.12439998 0.04810701    156.0328
##  [6,] 0.12439728 0.04810703    156.0328
##  [7,] 0.04810716 0.12439631    156.0328
##  [8,] 0.04810842 0.12439197    156.0328
##  [9,] 0.12439044 0.04810829    156.0328
## [10,] 0.12439814 0.04810673    156.0328
# Plot the top 10 (top 10%)
plot(nm.calib.res[1:10,1],nm.calib.res[1:10,2],
     xlim=c(lb[1],ub[1]),ylim=c(lb[2],ub[2]),
     xlab = colnames(nm.calib.res)[1],ylab = colnames(nm.calib.res)[2])

What do you notice about these results? (Hint: the best-fitting sets form two mirror-image solutions, p.Mets ≈ 0.124 with p.DieMets ≈ 0.048 and vice versa, with essentially identical fit - the nonidentifiability discussed by Alarid-Escudero et al. (2018).)

Let’s see how the results look compared to the targets:

nm_best_fit_params <- c(nm.calib.res[1, c("p.Mets",  "p.DieMets")])
nm_best_fit_model <- markov_crs(nm_best_fit_params)

plotrix::plotCI(x = CRS.targets$Surv$Time, y = CRS.targets$Surv$value, 
                ui = CRS.targets$Surv$ub,
                li = CRS.targets$Surv$lb,
                ylim = c(0, 1), 
                xlab = "Time", ylab = "Pr Survive")
points(x = names(nm_best_fit_model$Surv), y = nm_best_fit_model$Surv,
       col = "red",
       pch = 4,
       cex = 1.2)
legend("topright", legend = c("Targets", "Nelder-Mead"),
       col = c("black", "red"),
       pch = c(1, 2, 4))

5 Compare Calibration Results

We can plot the calibration results on the same axes, to see if the results are similar or different.

plotrix::plotCI(x = CRS.targets$Surv$Time, y = CRS.targets$Surv$value, 
                ui = CRS.targets$Surv$ub,
                li = CRS.targets$Surv$lb,
                ylim = c(0, 1), 
                xlab = "Time", ylab = "Pr Survive")
points(x = names(rs_best_fit_model$Surv), y = rs_best_fit_model$Surv,
       col = "green",
       pch = 2,
       cex = 1.2)
points(x = names(nm_best_fit_model$Surv), y = nm_best_fit_model$Surv,
       col = "red",
       pch = 4,
       cex = 1.2)

legend("topright", legend = c("Targets", "Random Search", "Nelder-Mead"),
       col = c("black", "green", "red"),
       pch = c(1, 2, 4))

6 Appendix - Model Function

#### CRS Markov model in a function ####
markov_crs <- function(v.params) {
  with(as.list(v.params), {
    ## Markov model parameters
    n.t  <- 60                        # time horizon, number of cycles
    v.n  <- c("NED", "Mets", "Death") # the 3 states of the model
    n.s <- length(v.n)                # number of health states 
    
    # Transition probabilities 
    # p.Mets    = 0.10          # probability of developing metastasis (NED -> Mets)
    # p.DieMets = 0.05          # probability of dying from metastatic cancer (Mets -> Death)
    
    ####### INITIALIZATION ##########################################
    # create the cohort trace
    m.M <- matrix(NA, nrow = n.t + 1 , 
                  ncol = n.s,
                  dimnames = list(0:n.t, v.n)) # create the Markov trace (n.t + 1 rows because row 1 holds cycle 0)
    
    m.M[1, ] <- c(1, 0, 0)                     # initialize Markov trace
    
    # create the transition probability matrix
    m.P <- matrix(0,
                  nrow = n.s, 
                  ncol = n.s,
                  dimnames = list(v.n, v.n))
    # fill in the transition probability array
    ### From NED
    m.P["NED", "NED"]   <- 1 - (p.Mets)
    m.P["NED", "Mets"]  <- p.Mets
    m.P["NED", "Death"] <- 0            # Not allowed to die from cancer in NED state
    ### From Mets
    m.P["Mets", "NED"]   <- 0
    m.P["Mets", "Mets"]  <- 1 - (p.DieMets)
    m.P["Mets", "Death"] <- p.DieMets
    ### From Death
    m.P["Death", "Death"] <- 1
    
    # check rows add up to 1
    if (!isTRUE(all.equal(as.numeric(rowSums(m.P)), as.numeric(rep(1, n.s))))) {
      stop("This is not a valid transition Matrix")
    }
    
    ############# PROCESS ###########################################
    
    for (t in 1:n.t){                   # loop through the cycles
      m.M[t + 1, ] <- m.M[t, ] %*% m.P  # estimate the Markov trace for the next cycle (t + 1)
    }
    
    ####### EPIDEMIOLOGICAL OUTPUT  ###########################################
    #### Overall Survival (OS) ####
    v.os <- 1 - m.M[, "Death"]  # calculate the overall survival (OS) probability
    
    ####### RETURN OUTPUT  ###########################################
    out <- list(Surv = v.os[-c(1:2)])  # drop cycles 0 and 1 so that Surv starts at cycle 2
    
    return(out)
  }
  )
}

References

Alarid-Escudero, Fernando, Richard F. MacLehose, Yadira Peralta, Karen M. Kuntz, and Eva A. Enns. 2018. “Nonidentifiability in Model Calibration and Implications for Medical Decision Making.” Medical Decision Making 38 (7): 810–21. doi:10.1177/0272989X18792283.