Time To Event

Author

Jose Storopoli

Caution

This lesson is really comprehensive and has a wide audience in mind. We recommend the reader to briefly navigate the Table of Contents and choose which topics to dive in. If you are in doubt feel free to read it from top to bottom.

using Pumas

In this lecture we’ll explore time to event (TTE) models.

First, let’s define what are TTE models.

Tip

There are several resources available for a deep dive into time to event and survival models. Specifically, we recommend:

  • Collett, D. (2015). Modelling survival data in medical research. Chapman and Hall/CRC.
  • Smith, P. J. (2002). Analysis of failure and survival data. Chapman and Hall/CRC.
  • Ibrahim, J. G., Chen, M. H., Sinha, D., Ibrahim, J. G., & Chen, M. H. (2001). Bayesian survival analysis. New York: Springer.

1 What are Time to Event Models?

Time to event models, also called survival analysis, involves lifetime distributions. Despite having applications in engineering, social sciences, computer sciences and so on, we will focus on TTE models applied to health, i.e. lifetime distributions of biological systems.

In TTE models, we are generally interested in lifetime distributions. This can be analyzed as different populations, such as those under different treatment arms of a clinical trial. Or as covariates effects which are common in observational or epidemiological studies.

Most important in a TTE model is the survival function.

2 Survival Function

To understand survival function, first we need to learn what are survival random variables. Survival random variables is simply a random variable whose outcomes are constrained to the positive domain:

\[T \in [0, \infty)\]

Then, the survival function is one minus the cumulative distribution function of the survival random variable \(T\):

\[\begin{aligned} F(t) &= P(T \leq t) = \int_0^t f(u)du \\ S(t) &= P(T > t) = 1 - F(t) = \int_t^\infty f(u)du \end{aligned}\]

where:

  • \(F(t)\) is the cumulative distribution function of the survival random variable \(T\) at value \(T=t\).
  • \(S(t)\) is the survival function of the survival random variable \(T\) at value \(T=t\).

Furthermore, \(S(t)\) is a monotonic decreasing function which tends to 0 as \(t \to \infty\). This is the general shape of a survival function:

using AlgebraOfGraphics
using CairoMakie
x = 0:0.1:5.0
survival(t) = exp(-t)
plt =
    data((; x, t = survival.(x))) *
    mapping(:x => "time", :t => "Survival Function") *
    visual(Lines; linewidth = 3)
draw(plt; axis = (; xticks = 0:5))

Tip

If you are curious why the exp inside the survival function in the plot code, it is because \(S(y)\) is a cumulative function, hence by the fundamental theorem of calculus:

\[s(t) = \frac{d}{dt}S(t) = \frac{d}{dt}\Big( 1-F(t) \Big) = -f(t)\]

where \(s(t)\) is the derivative of the survival function \(S(t)\) and \(f(t)\) is the derivative of the CDF \(F(t)\).

If we exponentiate both left and right ends, we get:

\[S(t) = e^{-f(t)}\]

As you can see, as times goes by, survival function declines until it reaches almost zero at high time values.

2.1 Probability Distributions

Like all other phenomena we saw in these tutorials, we can use a distribution to model survival functions. There is a wide list of candidate distributions. However, most of the time these distributions used are:

  • Exponential: parametrized by \(\lambda\), which is called the rate.
  • Weibull: parametrized by \(\alpha\) and \(\theta\), which are called shape and scale, respectively.

Let us show first the exponential distribution for different values of \(\alpha\):

using Distributions
x = 0:0.1:10.0
λs = [2, 5, 10]
dists = [Exponential(λ) for λ in λs]
pdfs = mapreduce(d -> pdf.(d, x), vcat, dists)
plt =
    data((; λs = repeat(λs; inner = length(x)), x = repeat(x; outer = length(λs)), pdfs)) *
    mapping(
        :x => "outcome",
        :pdfs => "PDF";
        color = :λs => nonnumeric => L"\mathrm{parameter}~\lambda",
        col = :λs => nonnumeric,
    ) *
    visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))

As you can see the exponential distribution has the characteristic of having an exponential decay, in which the parameter \(\lambda\) dictates the rate of the decay. Higher values of \(\lambda\) indicate slower decay rates and moves the density to higher outcome values. Whereas lower values of \(\lambda\) have higher decay rates, moving the density of the distribution towards lower outcome values.

Note that we cannot change the general form of the exponential distribution: it will always be a monotonic exponential decay curve. This is why the Weibull distribution is also used to model survival functions. Take a look at some different values of \(\theta\) (scale parameter) while fixing the value of the shape parameter \(\alpha\) to 3 for Weibull distribution:

x = 0:0.1:10.0
θs = [2, 3, 5]
α = 3
dists = [Weibull(α, θ) for θ in θs]
pdfs = mapreduce(d -> pdf.(d, x), vcat, dists)
plt =
    data((; θs = repeat(θs; inner = length(x)), x = repeat(x; outer = length(θs)), pdfs)) *
    mapping(
        :x => "outcome",
        :pdfs => "PDF";
        color = :θs => nonnumeric => L"\mathrm{parameter}~\theta",
        col = :θs => nonnumeric,
    ) *
    visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))

3 Censoring

We need to cover one last concept before we dive into how to model TTEs. Censoring occurs when we are unable to observe the dependent variable in a survival analysis.

For example, in a cancer survival clinical trial that lasts for 10 years after initial detection, we only have survival functions defined up to the total time of the trial. After the trial is finished, patients may have other outcomes, but we cannot observe them: they are censored.

Note

We won’t be covering censoring in these tutorials. But it is important for you to know the concept and if you have censored observations in your analysis, you definitely should take them into account.

We will describe three possible ways to analyze survival data:

  1. Kaplan-Meyer Estimator: a non-parametric procedure.
  2. Cox Proportional Hazards Model: a semi-parametric procedure.
  3. Accelerated Failure Time Model: a parametric procedure.

Below you’ll find a table that summarizes the main characteristics of each TTE analyzes approaches:

Model Characteristics Example
Non-Parametric Distribution of survival times is unknown; Survival Functions aren’t smooth; Difficult to incorporate complex covariates Kaplan-Meier estimates
Semi-Parametric Decomposes the hazard into a non-parametric baseline and a parametric covariate vector; Shape of baseline hazard is arbitrary; Allows for time-varying covariates Cox-Proportial Hazards
Parametric Distribution is assumed to be known and continuous; Hazard function completely specified; High degree of model flexibility Exponential, Weibull, Log-Normal models

4 Kaplan-Meyer (K-M) Estimator

The Kaplan-Meyer Estimator is a non-parametric statistic used to estimate the survival function \(S(t)\).

Note

Non-parametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions.

Non-parametric statistics is based on either being distribution-free or having a specified distribution but with the distribution’s parameters unspecified.

The basic idea is to estimate the survival function using the only assumption that the observations are independent. In other words, this means that what happens to observation \(a\) does not impact what happens to observation \(b\). They are only governed by the survival function.

We begin by constructing ordered time intervals for the survival function. We give these time intervals the index \(k\) and they have the following property:

\[t_1 > \ldots > t_k > t_{k+1}\]

and since the survival function is a monotonic decreasing function, this property also holds for the survival function:

\[S(t_1) > \ldots > S(t_k) > S(t_{k+1})\]

Then, the estimated survivor function at any time, \(t\), in the \(k\)th constructed time interval, will be the estimated probability of surviving beyond \(t_k\). To be more precise, this is the probability of surviving through the interval from t(k) to t(k+1), and all preceding intervals.

This leads us to the K-M Estimate of the survival function:

\[\hat{S}(t) = \prod^k_{j=1} \left( \frac{n_j - d_j}{n_j} \right)\]

where:

  • \(n_j\) is the number of subjects who are alive just before time \(t_j\), including those who are about to die at this time.
  • \(d_j\) is the number of subjects who die at time \(t)j\).

Since we discretized continuous time into discrete intervals, the K-M estimate of the survival function becomes a monotonic decreasing step-function. Here is a plot for an estimated K-M survival function using Survival.jl available in Pumas’ sysimage and in any Pumas App:

using Survival
t = [2, 2, 3, 5, 5, 7, 9, 16, 16, 18]
s = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0]
survival_fit = fit(KaplanMeier, t, s)
plt =
    data((; t = survival_fit.events.time, s_t = survival_fit.survival)) *
    mapping(:t => L"t", :s_t => L"S(t)") *
    visual(Stairs; step = :post, linewidth = 3)
draw(
    plt;
    axis = (; xticks = 1:18, yticks = 0.0:0.1:1.0, limits = ((0, nothing), (0, nothing))),
)

The main drawback from K-M estimates is the difficulty to incorporate covariates. This is addressed by the Cox’s Proportional Hazards model, which is a semi-parametric approach.

5 Cox’s Proportional Hazards (PH) Model

Often when we are analyzing survival data, we have individual data that is can shed light into the relationship between individual characteristics and survival rates. These are known as covariates or explanatory variables. Common examples are: physiological variables such as hemoglobin and heart rate; and lifestyle factors such as smoking, diet and exercise.

It’s difficult to incorporate covariate effects into the survival analysis when using non-parametric approaches such as the K-M Estimator. Hence, we need different approaches for that.

Semi-parametric approaches are statistical models that have parametric and nonparametric components. The most used semi-parametric model is Cox’s Proportional Hazards (PH) model.

PH has the following definition:

\[h(t) = h_0(t) \cdot e^{\left( \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \right)}\]

where:

  • \(h(t)\) is the hazard function at time \(t\)
  • \(h_0(t)\) is the baseline hazard at time \(t\)
  • \(x_i\) is the \(i\)th covariate
  • \(\beta_i\) the coefficient for the covariate \(x_i\)
Note

We are now using hazards instead of survival function. The hazard function is widely used to express the risk or hazard of an event such as death occurring at some time \(t\). Also some authors and literature might use \(\lambda(t)\) instead of \(h(t)\) to denote the hazard function.

Notice that PH is called proportional hazards because we can rewrite the above equation as:

\[\ln \frac{h(t)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]

where \(\ln\) is the natural logarithm function. So the linear combination of coefficients \(\beta_i\) and covariates \(x_i\) is equal to the logarithm of the hazard function divided by the baseline hazard. Hence, the linear combination is the logarithm of a proportional hazard with respect to a baseline hazard.

We say PH is semi-parametric because there is a non-parametric component since no particular form of probability distribution is assumed for the survival times. And also has a parametric component because we can estimate the coefficients \(\beta_i\).

5.1 How to interpret coefficients in a PH model

For the interpretation of the coefficients in a PH model note that:

  1. \(\beta\)s are in log-scale.
  2. \(\beta\)s represents the proportional hazard.

So, if we exponentiate \(\beta\)s we get back something similar to an odds ratio that we saw already in the Logistic Regression tutorial. But now representing the proportional hazard compared to the baseline hazard.

6 Accelerated Failure Time (AFT) Model

When you combine the covariates effects (\(\beta_i\)) while making assumptions about the underlying distribution of the survival times we get a full parametric approach. One of the most used parametric approach is the Accelerated Failure Time (AFT) model.

In an AFT model, the covariates are assumed to act directly on lifetime, just the same as in the PH model. Also in an AFT model we can model the underlying survival distribution as a known distribution but with unknown parameters to be estimated from data.

The most used distributions are: Exponential, Weibull, and Gompertz.

7 Pumas modeling

Let’s create some Pumas models for Time to Event models. First, let’s revisit the Pumas’ workflow.

7.1 Pumas’ Workflow

Pumas’ workflow

  1. Define a model. It can be a PumasModel or a PumasEMModel.
  2. Define a Subject and Population.
  3. Fit your model.
  4. Perform inference on the model’s population-level parameters (also known as fixed effects).
  5. Predict from a fitted model using the empirical Bayes estimate or Simulate random observations from a fitted model using the sampling distribution (also known as likelihood).
  6. Model diagnostics and visual predictive checks.
Note

Pumas modeling is a highly iterative process. Eventually you’ll probably go back and forth in the stages of workflow. This is natural and expected in Pumas’ modeling. Just remember to have a good code organization and version control for all of the workflow iterations.

7.1.1 Defining a model in Pumas

To define a model, you start with the @model macro and a begin .. end block of code:

my_model = @model begin end
PumasModel
  Parameters: 
  Random effects: 
  Covariates: 
  Dynamical system variables: 
  Dynamical system type: No dynamical model
  Derived: 
  Observed: 

This creates an empty model. Now we need to populate it with model components. These are additional macros that we can include inside the @model definition. We won’t be covering all the possible macros that we can use in this lesson, but here is the full list:

  • @param, fixed effects specifications.
  • @random, random effects specifications.
  • @covariates, covariate names.
  • @pre, pre-processing variables for the dynamic system and statistical specification.
  • @dosecontrol, specification of any dose control parameters present in the model.
  • @vars, shorthand notation.
  • @init, initial conditions for the dynamic system.
  • @dynamics, dynamics of the model.
  • @derived, statistical modeling of dependent variables.
  • @observed, model information to be stored in the model solution.
7.1.1.1 Model parameter with @param

The model parameters go inside the @param block. The syntax is the following:

@param begin
    ParameterName  Domain(...)
    ...
end
Tip

To use the “in” () operator in Julia, you can either replace for in or type \in <TAB> for the Unicode character.

By using the begin ... end block we can specify one parameter per line. We will just name the parameters the same name as it’s respective covariate, while also adding the β prefix to make clear that this is a coefficient.

We will just name the parameters the same name as it’s respective covariate, while also adding the β prefix to make clear that this is a coefficient.

  • RealDomain for scalar parameters
  • VectorDomain for vectors
  • PDiagDomain for positive definite matrices with diagonal structure
  • PSDDomain for general positive semi-definite matrices
Tip

If you don’t specify any arguments inside the domain constructor it will either error (for some domains that have required arguments) or will use the defaults. In the case of the RealDomain() without arguments it just uses the following arguments:

RealDomain(; lower = -∞, upper = ∞, init = 0)
7.1.1.2 Model covariates with @covariates

We also have to specify our model covariates using the macro @covariates. This can be done using a one-liner syntax, without begin ... end block:

@covariates Group var1 var2 var3 # and so on

Or using the begin ... end block and specifying one covariate per line. This is what we will do, since our model has several covariates, which makes easy to visualize them and also to comment out a covariate to perform iterative model exploration:

@covariates begin
    var1
    var2
    var3
    ...
end
7.1.1.3 Pre-processing variables with @pre

We can specify all the necessary variable and statistical pre-processing with the @pre macro.

Note

The @pre block is traditionally used to specify the inputs of the Ordinary Differential Equations (ODE) system used for non-linear mixed-effects PK/PD models (NLME).

But we will use @pre to specify our variable transformations instead.:::

Here, we are defining all the coefficients-covariates multiplication operation here. Note that we are adding the prefix _ to the resulting operation:

@pre begin
    _var1 = @. var1 * βvar1
    _var2 = @. var2 * βvar2
    _var3 = @. var3 * βvar3
end
Note

We are using the @. macro which tells Julia to vectorize (add the “dot syntax”) to all operations and function calls to the right of it.

7.1.1.4 Statistical modeling of dependent variables with @derived

Our final block, @derived, is used to specify all the assumed distributions of observed variables that are derived from the blocks above. This is where we include our dependent variable: depvar.

Note that depvar is being declared as following a Normal distribution. This is why we use the tilde notation ~. It means (much like the mathematical model notation) that depvar follows a Normal distribution. Since depvar might be a vector of values, we need to broadcast, i.e. vectorize, the operation with the dot . operator:

depvar ~ Normal.(μ, σ)

where μ and σ are the parameters mean and standard deviation that parametrizes the Normal distribution.

If we would do more operations to the right of the ~, we would need to vectorize all of them. This is cumbersome, so we can use the @. macro which tells Julia to apply the . in every operator and function call after it:

depvar ~ @. Normal(μ, σ)
Tip

In this block we can use all variables defined in the previous blocks, in our example the @param, @covariates and @pre blocks.

7.1.2 Define a Subject and Population

Once we have our model defined we have to specify a Subject and a Population.

In Pumas, subjects (i.e. observations) are represented by the Subject type and collections of subjects are represented as vectors of Subjects are defined as Population.

Subjects can be constructed with the Subject() constructor, for example:

Subject()
Subject
  ID: 1

We just constructed a Subject that has ID equal to 1 and no extra information. In order to provide extra information, you would need to pass keyword arguments to the Subject() constructor.

You can use the following keyword arguments inside Subject():

  • id: identifier.
  • observations: a named tuple of the dependent variables.
  • covariates: a named tuple containing the covariates, or nothing.
  • events: a vector of Events.
  • time: a vector of time stamps for the observations.

For this lesson, we will just use the covariates keyword argument.

This is an example of a Subject with ID equal to 42 and with some covariates specified as a named tuple:

Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70))
Subject
  ID: 42
  Covariates: TRT, AUC, WT

Since a Population is just a vector of Subjects, we can use a simple map() function over the IDs present at our dataset. For example:

pop = map(
    i -> Subject(id = i, covariates = (; TRT = df.TRT, AUC = df.AUC, WT = df.WT)),
    1:maximum(df.ID),
)
7.1.2.1 Reading Subjects directly from a DataFrame

Another option is to use the read_pumas() function to read Subjects (and Population) directly from a DataFrame. This is more convenient than the map() with the Subject() constructor inside.

The read_pumas() function accepts as first argument a DataFrame followed by the following keyword options (which are very similar to the Subject() constructor):

  • observations: dependent variables specified by a vector of column names, i.e. [:depvar].
  • covariates: covariates specified by a vector of column names, i.e. [:TRT, :AUC, :WT, ...].
  • id: specifies the ID column of the DataFrame.
  • time: specifies the time column of the DataFrame.
  • event_data: toggles assertions applicable to event data, either true or `false.
Caution

Even if you don’t have a :time column the read_pumas function expects it. So you can pass a column of zeros as Float64:

@rtransform! df :time = 0.0
pop = read_pumas(df; ..., time = :time)

7.1.3 Fit your model

Now that you have a population, you can fit your model!

Model fitting in Pumas has the purpose of estimating parameters and is done by calling the fit() function with the following positional arguments:

  1. Pumas model.
  2. a Population.
  3. a named tuple of the initial parameter values.
  4. an inference algorithm.
7.1.3.1 Initial Parameter’s Values

We already covered model and Population, now let’s talk about initial parameter values. It is the 3rd positional argument inside fit().

You can specify your initial parameters as a named tuple. For instance, if you want to have a certain parameter, β1, as having an initial value as 3.14, you can do so by passing it inside a named tuple in the 3rd positional argument of fit():

fit(model, population, (; β1 = 3.14))

You can also use the helper function init_params() which will recover all the initial parameters we specified inside the model’s @param block.

7.1.3.2 Inference Algorithms

Finally, our last (4th) positional argument is the choice of inference algorithm.

Pumas has the following available inference algorithms:

  • Marginal Likelihood Estimation:

    • NaivePooled(): first order approximation without random-effects.
    • FO(): first-order approximation.
    • FOCE(): first-order conditional estimation with automatic interaction detection.
    • LaplaceI(): second-order Laplace approximation.
  • Bayesian with Markov Chain Monte Carlo (MCMC):

    • BayesMCMC(): MCMC using No-U-Turn Sampler (NUTS).
Note

We can also use a Maximum A Posteriori (MAP) estimation procedure for any marginal likelihood estimation algorithm. You just need to call the MAP() constructor with the desired marginal likelihood algorithm inside, for instance:

fit(model, population, init_params(model), MAP(FOCE()))

7.1.4 Perform inference on the model’s population-level parameters

Once the model is fitted, we can analyze our inference and estimates.

We can see that after the model is fitted, it prints a result with a summary and a table of the parameter estimates.

We can also recover the estimates as a named tuple with coef(). Or as a DataFrame with coeftable().

We can also inspect our estimated coefficients’ standard errors (SE) along with the 95% confidence intervals with the infer() function

7.1.5 Predict from a fitted model or Simulate random observations from a fitted model

Now that we have our model estimates, we can use it to make predictions or simulate random observations.

7.1.5.1 Predictions with predict()

In order to calculate prediction from a fitted model, we need to use the predict() function and passing two positional arguments:

  1. a fitted Pumas model.
  2. either a single Subject or a collection of them with a Population.

For example to make predictions onto a single Subject, we can:

my_subject = Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70))
Subject
  ID: 42
  Covariates: TRT, AUC, WT
Caution

predict() also needs a keyword argument obstimes which represents the observation times that we want to predict the dependent variable. The value must be a vector. Since we used a dummy 0.0 value as :time variable in our read_pumas function, we can pass the same 0.0 but as a vector to the predict() function.

predict(model_fit, my_subject; obstimes = [0.0])

We can also generate prediction for a Population:

predict(model_fit, pop; obstimes = [0.0])
7.1.5.2 Simulations with simobstte()
Caution

Time to event models uses the simobstte function instead of the simobs function we covered in all of the discrete models so far.

We can generate random observations from our fitted Pumas model with the simobstte() function. You need to supply at least three positional arguments:

  1. a non-fitted Pumas model, in our case model.
  2. either a single Subject or a collection of them with a Population.
  3. a named tuple of parameter values.

The 1st and 2nd positional arguments we already covered in our previous examples. Let’s focus on the 3rd positional argument: param. It can be a named tuple with the parameter values. For instance, we can use estimated parameters from a fitted model with coef(model_fit):

simobstte(model, pop, coef(model_fit))
Note

You can also pass a single parameter as a Pumas fitted model to simobstte().

simobstte(model_fit)

7.1.6 Model diagnostics

Finally, our last step is to assess model diagnostics.

7.1.6.1 Assessing model diagnostics

To assess model diagnostics we can use the inspect() function in a fitted Pumas model:

inspect(model_fit)

We can also check for Information Criteria, such as:

  • Akaike Information Criterion (AIC): aic(model_fit)
  • Bayesian Information Criterion (BIC): bic(model_fit)

8 tte_single dataset

The tte_single dataset from PharmaDatasets.jl has just a dependent variable :DV and covariates :TIME and :DOSE.

Since it is a single Time to Event (TTE) it has an :EVID column filled with 3s and 0s.

Tip

:EVID columns describes the event ID. It uses a pattern reminiscent from NONMEM, where 1 specifies a normal event. 3 means it’s a reset event, meaning that the value of the dynamical variable is reset to the amt at the dosing event. If 4, then the dynamical value and time are reset, and then a final dose is given. Defaults to 1. For more about EVIDs see the Dosage Regimens documentation.

First, let’s load the dataset with PharmaDatasets.jl:

using PharmaDatasets

tte_single = dataset("tte_single")
first(tte_single, 5)
5×5 DataFrame
Row ID TIME DV EVID DOSE
Int64 Float64 Int64 Int64 Int64
1 1 0.0 0 3 1
2 1 484.537 1 0 1
3 2 0.0 0 3 1
4 2 246.129 1 0 1
5 3 0.0 0 3 1

8.1 Exploratory Data Analysis

Let’s do some exploratory data analysis and start with loading DataFramesMeta.jl:

using DataFrames
using DataFramesMeta

As you can see every ID has two unique EVIDs:

  • 3 for the reset event, i.e. the start of the dosing event.
  • 1 for a normal event, i.e. when a “failure” happens.

We have 300 unique IDs from 1 to 300:

by_df = @by tte_single :ID :unique_EVID = length(:EVID)
last(by_df, 5)
5×2 DataFrame
Row ID unique_EVID
Int64 Int64
1 296 2
2 297 2
3 298 2
4 299 2
5 300 2

Let’s see some summary statistics with describe():

describe(tte_single)
5×7 DataFrame
Row variable mean min median max nmissing eltype
Symbol Float64 Real Float64 Real Int64 DataType
1 ID 150.5 1 150.5 300 0 Int64
2 TIME 127.492 0.0 2.95488 500.0 0 Float64
3 DV 0.413333 0 0.0 1 0 Int64
4 EVID 1.5 0 1.5 3 0 Int64
5 DOSE 0.5 0 0.5 1 0 Int64

We need to have summary statistics by :EVID. This can be accomplished with a groupby or @by:

@chain tte_single begin
    @by :EVID :mean_failure = mean(:TIME)
    @rsubset :EVID != 3
end
1×2 DataFrame
Row EVID mean_failure
Int64 Float64
1 0 254.984

The mean time of failure is the mean of the variable :TIME only for EVIDs values different than 3: ≈255.

Also we would like to know if the mean time of failure differs for different :DOSE values. :DOSE can take only values 0 and 1:

unique(tte_single.DOSE)
2-element Vector{Int64}:
 1
 0

This can also be accomplished with a groupby or @by:

@chain tte_single begin
    @by [:DOSE, :EVID] :mean_failure = mean(:TIME)
    @rsubset :EVID != 3
    @select $(Not(:EVID))
end
2×2 DataFrame
Row DOSE mean_failure
Int64 Float64
1 0 180.261
2 1 329.707

So, definitely observations that have a value 1 of DOSE have higher failures time than observations with DOSE having a value of 0.

8.2 Visualizations

Now let’s plot some visualizations with AlgebraOfGraphics.jl using CairoMakie.jl as a backend:

First, we need to convert tte_single dataset to a time-series type of dataset. This can be accomplished by keeping only the EVIDs different from 3, then grouping by both :DOSE and :TIME, and counting the occurrences of :TIME:

tte_single_ts = @chain tte_single begin
    @rsubset :EVID != 3
    @by [:DOSE, :TIME] :total_failure = length(:TIME)
    @orderby :TIME :DOSE
end
Note

Please heed that our tte_single dataset is censored. So every observation that has not “failed” until time 500 automatically takes value 500 for the :TIME column. This is a problem for the time series visualization, so we will keep only the observations with :TIME lower than 500 in the tte_single_ts dataset.

sum(tte_single_ts.total_failure)
300
@rsubset! tte_single_ts :TIME < 500;

sum(tte_single_ts.total_failure)
248

We can see that we have a total failure of 248 observations from the total of 300.

We need one last thing for our time series formatted data tte_single_ts. In order to get a plot of the survival function through time, we need to normalize the :total_failure to the interval from 0 to 1 and also calculate the cumulative sum by different :DOSE values:

d0, d1 = groupby(tte_single_ts, :DOSE);

@transform! d0 @astable begin
    sum_total_failure = sum(:total_failure)
    cumulative_failure = cumsum(:total_failure)
    :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end

@transform! d1 @astable begin
    sum_total_failure = sum(:total_failure)
    cumulative_failure = cumsum(:total_failure)
    :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end

tte_single_ts_combined = vcat(d0, d1)

Now we can plot it:

data(tte_single_ts_combined) *
mapping(:TIME, :survival; color = :DOSE => nonnumeric) *
visual(Lines) |> draw

As we can clearly see, :DOSE impacts the survival function.

8.3 Pumas Models

Now let’s create three Pumas AFT models. One for each of the assumed underlying distribution of the survival function:

  • Exponential
  • Weibull
  • Gompertz

8.3.1 Exponential AFT Model

Let’s start with the simplest AFT model: exponential.

We define two parameters in the @params block:

  • λ₁: basal hazard.
  • β: coefficient for covariate :DOSE.

Inside the @pre block we define _λ₀ as the total hazard.

In the @vars block we include our underlying exponential distribution. For the exponential distribution, the estimated hazard λ is simply equal to the total hazard _λ₀.

Now, most important, in the @dynamics block, we defined the cumulative hazard as Λ and the its derivative Λ' equal to the estimated hazard λ.

Finally, in the @derived block, we condition our dependent variable :DV as a TimeToEvent Pumas’ distribution. TimeToEvent takes the estimated hazard λ as the first positional parameter and as the second positional parameter the cumulative hazard Λ.

The dependent variable in a model that uses TimeToEvent should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your DV has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1, and DV = 0 otherwise.

Since DV is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot . operator. We do this with the @. macro.

Here is the full TTE exponential Pumas model:

Note

It is always best to initialize your parameters to something other than 0 (the default of RealDomain). This makes the iterative estimation procedures more numerically robust. Whenever uncertain you can choose a small value close to 0. In our case, I’ve chosen 0.001.

tte_exp_model = @model begin
    @param begin
        λ₁  RealDomain(; lower = 0, init = 0.001)
        β  RealDomain(; init = 0.001)
    end

    @covariates DOSE

    @pre begin
        _λ₀ = λ₁ * exp* DOSE)
    end

    @vars begin
        # exponential
        λ = _λ₀
    end

    @dynamics begin
        # the derivative of cumulative hazard is equal to hazard
        Λ' = λ
    end

    @derived begin
        # Pumas derived TTE function
        DV ~ @. TimeToEvent(λ, Λ)
    end
end
PumasModel
  Parameters: λ₁, β
  Random effects: 
  Covariates: DOSE
  Dynamical system variables: Λ
  Dynamical system type: Matrix exponential
  Derived: DV, λ
  Observed: DV, λ

8.3.2 Weibull AFT Model

Let’s move on to the second AFT model: Weibull.

We now define three parameters, one additional from the exponential model, in the @params block:

  • λ₁: basal hazard.
  • β: coefficient for covariate :DOSE.
  • Κ: the shape parameter of the underlying AFT model Weibull distribution. This is the α parameter in the Weibull plots above in the beginning of the tutorials. Weibull distribution is used extensively by several modeling communities and, because of historical reasons, there are several naming conventions for the parameters that characterize the distribution. We will use K from now, since it is customary in the biomedical modeling community.
Tip

Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard λ inside the @vars

Inside the @pre block we define _λ₀ as the total hazard.

In the @vars block we include our underlying Weibull distribution. For the Weibull distribution, the estimated hazard λ is the probability density function (PDF) of the Weibull distribution condition on the total hazard _λ₀ and also parameter K.

Now, most important, in the @dynamics block, we defined the cumulative hazard as Λ and the its derivative Λ' equal to the estimated hazard λ.

Finally, in the @derived block, we condition our dependent variable :DV as a TimeToEvent Pumas’ distribution. TimeToEvent takes the estimated hazard λ as the first positional parameter and as the second positional parameter the cumulative hazard Λ.

The dependent variable in a model that uses TimeToEvent should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your DV has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1, and DV = 0 otherwise.

Since DV is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot . operator. We do this with the @. macro.

Here is the full TTE Weibull Pumas model:

tte_wei_model = @model begin
    @param begin
        λ₁  RealDomain(; lower = 0, init = 0.001)
        β  RealDomain(; init = 0.001)
        Κ  RealDomain(; lower = 0, init = 0.001)
    end

    @covariates DOSE

    @pre begin
= Κ
        _λ₀ = λ₁ * exp* DOSE)
    end

    @vars begin
        # Weibull
        # 1e-10 for model numerical stability
        λ = _λ₀ ** (_λ₀ * t + 1e-10)^(_Κ - 1)
    end

    @dynamics begin
        # the derivative of cumulative hazard is equal to hazard
        Λ' = λ
    end

    @derived begin
        # Pumas derived TTE function
        DV ~ @. TimeToEvent(λ, Λ)
    end
end
PumasModel
  Parameters: λ₁, β, Κ
  Random effects: 
  Covariates: DOSE
  Dynamical system variables: Λ
  Dynamical system type: Nonlinear ODE
  Derived: DV, λ
  Observed: DV, λ

8.3.3 Gompertz AFT Model

Let’s move on to the third and final AFT model: Gompertz.

We also define three parameters, one additional from the exponential model, in the @params block:

  • λ₁: basal hazard.
  • β: coefficient for covariate :DOSE.
  • Κ: the shape parameter of the underlying AFT model Gompertz distribution.
Tip

Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard λ inside the @vars

Inside the @pre block we define _λ₀ as the total hazard.

In the @vars block we include our underlying Weibull distribution. For the Gompertz distribution, the estimated hazard λ is the probability density function (PDF) of the Gompertz distribution condition on the total hazard _λ₀ and also parameter K.

Now, most important, in the @dynamics block, we defined the cumulative hazard as Λ and the its derivative Λ' equal to the estimated hazard λ.

Finally, in the @derived block, we condition our dependent variable :DV as a TimeToEvent Pumas’ distribution. TimeToEvent takes the estimated hazard λ as the first positional parameter and as the second positional parameter the cumulative hazard Λ.

The dependent variable in a model that uses TimeToEvent should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your DV has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1, and DV = 0 otherwise.

Since DV is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot . operator. We do this with the @. macro.

Here is the full TTE Gompertz Pumas model:

tte_gomp_model = @model begin
    @param begin
        λ₁  RealDomain(; lower = 0, init = 0.001)
        β  RealDomain(; init = 0.001)
        Κ  RealDomain(; lower = 0, init = 0.001)
    end

    @covariates DOSE

    @pre begin
= Κ
        _λ₀ = λ₁ * exp* DOSE)
    end

    @vars begin
        # Gompertz
        λ = _λ₀ * exp(_Κ * t)
    end

    @dynamics begin
        # the derivative of cumulative hazard is equal to hazard
        Λ' = λ
    end

    @derived begin
        # Pumas derived TTE function
        DV ~ @. TimeToEvent(λ, Λ)
    end
end
PumasModel
  Parameters: λ₁, β, Κ
  Random effects: 
  Covariates: DOSE
  Dynamical system variables: Λ
  Dynamical system type: Nonlinear ODE
  Derived: DV, λ
  Observed: DV, λ

8.4 tte_single Population

Before we do any fitting of the models, we need to convert our dataset tte_single into a Population with the read_pumas function.

Since we have different :EVID values, we need to also provide :AMT and :CMT columns.

:EVID column is specified using the evid keyword argument. :AMT column is the dose amount column and is specified using the amt keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an :AMT column filled with 0s. :CMT column is the compartment column and is specified using the cmt keyword argument. Since we do not have any pharmacokinetics in our model and data, we also simply create an :CMT column filled with 1s to denote a single compartment model:

@rtransform! tte_single :AMT = 0 :CMT = 1;
tte_single_pop = read_pumas(
    tte_single;
    observations = [:DV],
    covariates = [:DOSE],
    id = :ID,
    time = :TIME,
    amt = :AMT,
    cmt = :CMT,
    evid = :EVID,
)
Population
  Subjects: 300
  Covariates: DOSE
  Observations: DV

8.5 Estimation with fit

Now that we have Pumas models and a population we can start the estimation with fit. We do not have any mixed-effects, so we will use the NaivePooled() maximum likelihood estimation algorithm.

Let’s fit all three models using the same population:

tte_single_exp_fit =
    fit(tte_exp_model, tte_single_pop, init_params(tte_exp_model), Pumas.NaivePooled());

tte_single_wei_fit =
    fit(tte_wei_model, tte_single_pop, init_params(tte_wei_model), Pumas.NaivePooled());

tte_single_gomp_fit =
    fit(tte_gomp_model, tte_single_pop, init_params(tte_gomp_model), Pumas.NaivePooled());

8.6 Comparing Estimates with compare_estimates

We can use the function compare_estimates from PumasUtilities.jl to compare estimates across different models.

Let’s load PumasUtilities.jl:

using PumasUtilities

The compare_estimates takes a NamedTuple of models:

compare_estimates(;
    Exponential = tte_single_exp_fit,
    Weibull = tte_single_wei_fit,
    Gompertz = tte_single_gomp_fit,
)
3×4 DataFrame
Row parameter Exponential Weibull Gompertz
String Float64? Float64? Float64?
1 λ₁ 0.00532562 0.00496489 0.00375767
2 β -0.92922 -0.811505 -1.08131
3 Κ missing 1.30176 0.00223201

As we can see all three AFT models have similar estimates for hazard λ₁ and DOSE effect on hazard β.

Let’s bring also the confidence intervals and standard error for the estimates of each model.

We can calculate confidence intervals with the infer function, and then collect them into a DataFrame with the coeftable function. We do a little bit of data wrangling to get the comparisons and also exponentiate the estimates to make them interpretable:

df_exponential = coeftable(infer(tte_single_exp_fit))
df_weibull = coeftable(infer(tte_single_wei_fit))
df_gompertz = coeftable(infer(tte_single_gomp_fit))

@rsubset! df_exponential :parameter == "β"
@rtransform! df_exponential :model = "exponential"

@rsubset! df_weibull :parameter == "β"
@rtransform! df_weibull :model = "Weibull"

@rsubset! df_gompertz :parameter == "β"
@rtransform! df_gompertz :model = "Gompertz"

df = vcat(df_exponential, df_weibull, df_gompertz)

@rtransform! df $(
    [:estimate, :ci_lower, :ci_upper] .=>
        (x -> exp.(x)) .=> [:estimate, :ci_lower, :ci_upper]
)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
3×7 DataFrame
Row parameter estimate se relative_se ci_lower ci_upper model
String Float64 Float64 Float64 Float64 Float64 String
1 β 0.394862 0.112017 0.120549 0.317027 0.491806 exponential
2 β 0.444189 0.102203 0.125942 0.363557 0.542704 Weibull
3 β 0.339152 0.134403 0.124297 0.26061 0.441365 Gompertz

Weibull AFT model has the lowest standard error, as display in the column :se, and, consequently, the narrowest confidence interval.

8.6.1 Visual Comparisons

Let’s do some visual comparisons between the models. We are going to use the estimated hazard and DOSE covariate effect to plot different estimated survival functions across all three models using color to differentiate between DOSE equal 0 or 1.

This means a line plot with:

  • t in the x-axis
  • estimated hazard in the y-axis
  • color is DOSE

To get the estimated hazard as a function of time t we are going to use the formula in the @vars block in each model while using the estimated parameters from each model in the calculation.

First, some helper functions:

function hazard_exponential(λ₁, β, DOSE, T)
    λ₀ = λ₁ * exp* DOSE) # estimated hazard
    λ = [λ₀ for t = 1:T]   # cumulative hazard
    return 1 .- cumsum(λ)
end;

function hazard_weibull(λ₁, β, Κ, DOSE, T)
    λ₀ = λ₁ * exp* DOSE)                              # estimated hazard
    λ = [λ₀ * Κ * (λ₀ * t + 1e-10)^- 1) for t = 1:T] # cumulative hazard
    return 1 .- cumsum(λ)
end;

function hazard_gompertz(λ₁, β, Κ, DOSE, T)
    λ₀ = λ₁ * exp* DOSE)            # estimated hazard
    λ = [λ₀ * exp* t) for t = 1:T] # cumulative hazard
    return 1 .- cumsum(λ)
end;

These functions will return a length-\(T\) vector of the survival distribution. That is, 1 minus the cumulative hazard until time \(T\) depending on the DOSE value.

Now let’s plot it:

t = 1:200

# exponential estimates
λ_exponential_dose_0 = hazard_exponential(
    0.00532562, # estimated λ₁
    -0.92922,   # estimated β
    0,          # DOSE value
    length(t),  # T
)
λ_exponential_dose_1 = hazard_exponential(
    0.00532562, # estimated λ₁
    -0.92922,   # estimated β
    1,          # DOSE value
    length(t),  # T
)

# Weibull estimates
λ_weibull_dose_0 = hazard_weibull(
    0.00496489, # estimated λ₁
    -0.811505,  # estimated β
    1.30176,    # Κ 
    0,          # DOSE value
    length(t),  # T
)
λ_weibull_dose_1 = hazard_weibull(
    0.00496489, # estimated λ₁
    -0.811505,  # estimated β
    1.30176,    # Κ 
    1,          # DOSE value
    length(t),  # T
)

# Gompertz estimates
λ_gompertz_dose_0 = hazard_gompertz(
    0.00375767, # estimated λ₁
    -1.08131,   # estimated β
    0.00223201, # Κ 
    0,          # DOSE value
    length(t),  # T
)
λ_gompertz_dose_1 = hazard_gompertz(
    0.00375767, # estimated λ₁
    -1.08131,   # estimated β
    0.00223201, # Κ 
    1,          # DOSE value
    length(t),  # T
)

plt =
    data(
    # data in a long table format
        (;
        t = repeat(t, 6),             # 2 DOSES ⋅ 3 AFT models
        λs = vcat(
            # exponential estimates
            λ_exponential_dose_0,
            λ_exponential_dose_1,
            # Weibull estimates
            λ_weibull_dose_0,
            λ_weibull_dose_1,
            # Gompertz estimates
            λ_gompertz_dose_0,
            λ_gompertz_dose_1,
        ),
        DOSE = repeat(
            [0, 1];
            inner = length(t), # repeat 0 500x then 1 500x
            outer = 3,          # 3 AFT models
        ),
        MODEL = repeat(["exponential", "Weibull", "Gompertz"]; inner = 2 * length(t)),    # repeat each model 1_000x
    )) *
    mapping(:t => L"t", :λs => L"S(t)"; color = :DOSE => nonnumeric, row = :MODEL) *
    visual(Lines)
draw(plt; axis = (; yticks = 0.0:0.25:1.0))

The models differ little in the general estimation of the survival function across different DOSE values.

9 tte_repeated dataset

The tte_repeated dataset from PharmaDatasets.jl has just a dependent variable :DV and covariate :TIME.

This type of data is common when the time to event can occurs several times. For example, patient death is generally a single TTE data. Whereas, patient experiencing vomiting or being afflicted with cancer can occur more than once, then it is a repeated TTE data.

tte_repeated = dataset("tte_repeated")
first(tte_repeated, 5)
5×3 DataFrame
Row ID DV TIME
Int64 Int64 Float64
1 1 0 288.0
2 2 1 22.6626
3 2 1 29.3927
4 2 0 288.0
5 3 0 288.0

We can definitely see that we have 316 repeated observations from 120 IDs, from 1 to 120.

Note

tte_repeated does not have an :EVID column like the tte_single dataset. Hence we need to add an evid = 3 row for each measurement row evid = 0.

tte_repeated[!, :TAD] =
    combine(i -> [first(i.TIME); diff(i.TIME)], groupby(tte_repeated, [:ID])).x1
# Add EVID=3 events with TAD=1e-10 to reset the integration right after each observation
evd = DataFrame(;
    ID = tte_repeated.ID,
    AMT = missing,
    DV = missing,
    TIME = tte_repeated.TIME,
    TAD = 1e-10,
    EVID = 3,
)
tte_repeated[!, :EVID] .= 0
tte_repeated[!, :AMT] .= 0
tte_repeated_evid = vcat(tte_repeated, evd)
sort!(tte_repeated_evid, [:ID, :TIME, :EVID])

Also, we have, on average, 2.6 events per ID:

@chain tte_repeated_evid begin
    @rsubset :EVID == 0
    @by :ID :freq_DV = length(:DV)
    @combine :mean_freq_DV = mean(:freq_DV)
end
1×1 DataFrame
Row mean_freq_DV
Float64
1 2.63333

9.1 Pumas Models

Since we do not have any covariates besides :TIME our previous defined Pumas models would need to be rewritten without DOSE.

I will show an example with the exponential AFT model. The code for Weibull and Gompertz should be the same as above but without the β and DOSE.

tte_repeated_exp_model = @model begin
    @param begin
        λ₁  RealDomain(; lower = 0, init = 0.001)
    end

    @pre begin
        _λ₀ = λ₁
    end

    @vars begin
        # exponential
        λ = _λ₀
    end

    @dynamics begin
        # the derivative of cumulative hazard is equal to hazard
        Λ' = λ
    end

    @derived begin
        # Pumas derived TTE function
        DV ~ @. TimeToEvent(λ, Λ)
    end
end
PumasModel
  Parameters: λ₁
  Random effects: 
  Covariates: 
  Dynamical system variables: Λ
  Dynamical system type: Matrix exponential
  Derived: DV, λ
  Observed: DV, λ

9.2 tte_repeated Population

Like we did for tte_single, we need to convert our dataset tte_repeated into a Population with the read_pumas function.

Since we have a dependent variable :DV, we need to also provide an :AMT column.

:AMT column is the dose amount column and is specified using the amt keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an :AMT column filled with 0s:

@rtransform! tte_repeated_evid :AMT = 0;
tte_repeated_pop = read_pumas(
    tte_repeated_evid;
    observations = [:DV],
    id = :ID,
    time = :TAD,
    evid = :EVID,
    amt = :AMT,
)

9.3 Estimation with fit

Now that we have a Pumas model and a population we can start the estimation with fit. We do not have any mixed-effects, so we will use the NaivePooled() maximum likelihood estimation algorithm.

Let’s fit our tte_repeated_exp_model using the population we just created:

tte_repeated_exp_fit = fit(
    tte_repeated_exp_model,
    tte_repeated_pop,
    init_params(tte_repeated_exp_model),
    Pumas.NaivePooled(),
)
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     1.388480e+03     1.614400e+02
 * time: 0.00010800361633300781
     1     1.251864e+03     1.020562e+02
 * time: 0.002665996551513672
     2     1.212583e+03     3.131723e+01
 * time: 0.006311178207397461
     3     1.210335e+03     1.514137e+01
 * time: 0.008748054504394531
     4     1.209782e+03     1.285035e+00
 * time: 0.011193037033081055
     5     1.209778e+03     4.741926e-02
 * time: 0.013631105422973633
     6     1.209778e+03     1.561555e-04
 * time: 0.01604008674621582
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:              NaivePooled
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -1209.7782
Number of subjects:                            120
Number of parameters:         Fixed      Optimized
                                  0              1
Observation records:         Active        Missing
    DV:                         316              0
    Total:                      316              0

------------------
       Estimate
------------------
λ₁      0.0056713
------------------

10 Conclusion

I hope you enjoyed this tutorial on time to event modeling. Whenever you are dealing with survival data, or any other of time to event data, you can use any of the techniques described here. Also, don’t forget to check the recommended textbooks and resources in the Tip green box at the start of this tutorial.