using Pumas
Time To Event
This lesson is really comprehensive and has a wide audience in mind. We recommend the reader to briefly navigate the Table of Contents and choose which topics to dive in. If you are in doubt feel free to read it from top to bottom.
In this lecture we’ll explore time to event (TTE) models.
First, let’s define what are TTE models.
There are several resources available for a deep dive into time to event and survival models. Specifically, we recommend:
- Collett, D. (2015). Modelling survival data in medical research. Chapman and Hall/CRC.
- Smith, P. J. (2002). Analysis of failure and survival data. Chapman and Hall/CRC.
- Ibrahim, J. G., Chen, M. H., Sinha, D., Ibrahim, J. G., & Chen, M. H. (2001). Bayesian survival analysis. New York: Springer.
1 What are Time to Event Models?
Time to event models, also called survival analysis, involves lifetime distributions. Despite having applications in engineering, social sciences, computer sciences and so on, we will focus on TTE models applied to health, i.e. lifetime distributions of biological systems.
In TTE models, we are generally interested in lifetime distributions. This can be analyzed as different populations, such as those under different treatment arms of a clinical trial. Or as covariates effects which are common in observational or epidemiological studies.
Most important in a TTE model is the survival function.
2 Survival Function
To understand survival function, first we need to learn what are survival random variables. Survival random variables is simply a random variable whose outcomes are constrained to the positive domain:
\[T \in [0, \infty)\]
Then, the survival function is one minus the cumulative distribution function of the survival random variable \(T\):
\[\begin{aligned} F(t) &= P(T \leq t) = \int_0^t f(u)du \\ S(t) &= P(T > t) = 1 - F(t) = \int_t^\infty f(u)du \end{aligned}\]
where:
- \(F(t)\) is the cumulative distribution function of the survival random variable \(T\) at value \(T=t\).
- \(S(t)\) is the survival function of the survival random variable \(T\) at value \(T=t\).
Furthermore, \(S(t)\) is a monotonic decreasing function which tends to 0 as \(t \to \infty\). This is the general shape of a survival function:
using AlgebraOfGraphics
using CairoMakie
= 0:0.1:5.0
x survival(t) = exp(-t)
=
plt data((; x, t = survival.(x))) *
mapping(:x => "time", :t => "Survival Function") *
visual(Lines; linewidth = 3)
draw(plt; axis = (; xticks = 0:5))
If you are curious why the exp
inside the survival function in the plot code, it is because \(S(y)\) is a cumulative function, hence by the fundamental theorem of calculus:
\[s(t) = \frac{d}{dt}S(t) = \frac{d}{dt}\Big( 1-F(t) \Big) = -f(t)\]
where \(s(t)\) is the derivative of the survival function \(S(t)\) and \(f(t)\) is the derivative of the CDF \(F(t)\).
If we exponentiate both left and right ends, we get:
\[S(t) = e^{-f(t)}\]
As you can see, as times goes by, survival function declines until it reaches almost zero at high time values.
2.1 Probability Distributions
Like all other phenomena we saw in these tutorials, we can use a distribution to model survival functions. There is a wide list of candidate distributions. However, most of the time these distributions used are:
- Exponential: parametrized by \(\lambda\), which is called the rate.
- Weibull: parametrized by \(\alpha\) and \(\theta\), which are called shape and scale, respectively.
Let us show first the exponential distribution for different values of \(\alpha\):
using Distributions
= 0:0.1:10.0
x = [2, 5, 10]
λs = [Exponential(λ) for λ in λs]
dists = mapreduce(d -> pdf.(d, x), vcat, dists)
pdfs =
plt data((; λs = repeat(λs; inner = length(x)), x = repeat(x; outer = length(λs)), pdfs)) *
mapping(
:x => "outcome",
:pdfs => "PDF";
= :λs => nonnumeric => L"\mathrm{parameter}~\lambda",
color = :λs => nonnumeric,
col *
) visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))
As you can see the exponential distribution has the characteristic of having an exponential decay, in which the parameter \(\lambda\) dictates the rate of the decay. Higher values of \(\lambda\) indicate slower decay rates and moves the density to higher outcome values. Whereas lower values of \(\lambda\) have higher decay rates, moving the density of the distribution towards lower outcome values.
Note that we cannot change the general form of the exponential distribution: it will always be a monotonic exponential decay curve. This is why the Weibull distribution is also used to model survival functions. Take a look at some different values of \(\theta\) (scale parameter) while fixing the value of the shape parameter \(\alpha\) to 3 for Weibull distribution:
= 0:0.1:10.0
x = [2, 3, 5]
θs = 3
α = [Weibull(α, θ) for θ in θs]
dists = mapreduce(d -> pdf.(d, x), vcat, dists)
pdfs =
plt data((; θs = repeat(θs; inner = length(x)), x = repeat(x; outer = length(θs)), pdfs)) *
mapping(
:x => "outcome",
:pdfs => "PDF";
= :θs => nonnumeric => L"\mathrm{parameter}~\theta",
color = :θs => nonnumeric,
col *
) visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))
3 Censoring
We need to cover one last concept before we dive into how to model TTEs. Censoring occurs when we are unable to observe the dependent variable in a survival analysis.
For example, in a cancer survival clinical trial that lasts for 10 years after initial detection, we only have survival functions defined up to the total time of the trial. After the trial is finished, patients may have other outcomes, but we cannot observe them: they are censored.
We won’t be covering censoring in these tutorials. But it is important for you to know the concept and if you have censored observations in your analysis, you definitely should take them into account.
We will describe three possible ways to analyze survival data:
- Kaplan-Meyer Estimator: a non-parametric procedure.
- Cox Proportional Hazards Model: a semi-parametric procedure.
- Accelerated Failure Time Model: a parametric procedure.
Below you’ll find a table that summarizes the main characteristics of each TTE analyzes approaches:
Model | Characteristics | Example |
---|---|---|
Non-Parametric | Distribution of survival times is unknown; Survival Functions aren’t smooth; Difficult to incorporate complex covariates | Kaplan-Meier estimates |
Semi-Parametric | Decomposes the hazard into a non-parametric baseline and a parametric covariate vector; Shape of baseline hazard is arbitrary; Allows for time-varying covariates | Cox-Proportial Hazards |
Parametric | Distribution is assumed to be known and continuous; Hazard function completely specified; High degree of model flexibility | Exponential, Weibull, Log-Normal models |
4 Kaplan-Meyer (K-M) Estimator
The Kaplan-Meyer Estimator is a non-parametric statistic used to estimate the survival function \(S(t)\).
Non-parametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions.
Non-parametric statistics is based on either being distribution-free or having a specified distribution but with the distribution’s parameters unspecified.
The basic idea is to estimate the survival function using the only assumption that the observations are independent. In other words, this means that what happens to observation \(a\) does not impact what happens to observation \(b\). They are only governed by the survival function.
We begin by constructing ordered time intervals for the survival function. We give these time intervals the index \(k\) and they have the following property:
\[t_1 > \ldots > t_k > t_{k+1}\]
and since the survival function is a monotonic decreasing function, this property also holds for the survival function:
\[S(t_1) > \ldots > S(t_k) > S(t_{k+1})\]
Then, the estimated survivor function at any time, \(t\), in the \(k\)th constructed time interval, will be the estimated probability of surviving beyond \(t_k\). To be more precise, this is the probability of surviving through the interval from t(k) to t(k+1), and all preceding intervals.
This leads us to the K-M Estimate of the survival function:
\[\hat{S}(t) = \prod^k_{j=1} \left( \frac{n_j - d_j}{n_j} \right)\]
where:
- \(n_j\) is the number of subjects who are alive just before time \(t_j\), including those who are about to die at this time.
- \(d_j\) is the number of subjects who die at time \(t)j\).
Since we discretized continuous time into discrete intervals, the K-M estimate of the survival function becomes a monotonic decreasing step-function. Here is a plot for an estimated K-M survival function using Survival.jl
available in Pumas’ sysimage and in any Pumas App:
using Survival
= [2, 2, 3, 5, 5, 7, 9, 16, 16, 18]
t = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0]
s = fit(KaplanMeier, t, s)
survival_fit =
plt data((; t = survival_fit.events.time, s_t = survival_fit.survival)) *
mapping(:t => L"t", :s_t => L"S(t)") *
visual(Stairs; step = :post, linewidth = 3)
draw(
plt;= (; xticks = 1:18, yticks = 0.0:0.1:1.0, limits = ((0, nothing), (0, nothing))),
axis )
The main drawback from K-M estimates is the difficulty to incorporate covariates. This is addressed by the Cox’s Proportional Hazards model, which is a semi-parametric approach.
5 Cox’s Proportional Hazards (PH) Model
Often when we are analyzing survival data, we have individual data that is can shed light into the relationship between individual characteristics and survival rates. These are known as covariates or explanatory variables. Common examples are: physiological variables such as hemoglobin and heart rate; and lifestyle factors such as smoking, diet and exercise.
It’s difficult to incorporate covariate effects into the survival analysis when using non-parametric approaches such as the K-M Estimator. Hence, we need different approaches for that.
Semi-parametric approaches are statistical models that have parametric and nonparametric components. The most used semi-parametric model is Cox’s Proportional Hazards (PH) model.
PH has the following definition:
\[h(t) = h_0(t) \cdot e^{\left( \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \right)}\]
where:
- \(h(t)\) is the hazard function at time \(t\)
- \(h_0(t)\) is the baseline hazard at time \(t\)
- \(x_i\) is the \(i\)th covariate
- \(\beta_i\) the coefficient for the covariate \(x_i\)
We are now using hazards instead of survival function. The hazard function is widely used to express the risk or hazard of an event such as death occurring at some time \(t\). Also some authors and literature might use \(\lambda(t)\) instead of \(h(t)\) to denote the hazard function.
Notice that PH is called proportional hazards because we can rewrite the above equation as:
\[\ln \frac{h(t)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]
where \(\ln\) is the natural logarithm function. So the linear combination of coefficients \(\beta_i\) and covariates \(x_i\) is equal to the logarithm of the hazard function divided by the baseline hazard. Hence, the linear combination is the logarithm of a proportional hazard with respect to a baseline hazard.
We say PH is semi-parametric because there is a non-parametric component since no particular form of probability distribution is assumed for the survival times. And also has a parametric component because we can estimate the coefficients \(\beta_i\).
5.1 How to interpret coefficients in a PH model
For the interpretation of the coefficients in a PH model note that:
- \(\beta\)s are in log-scale.
- \(\beta\)s represents the proportional hazard.
So, if we exponentiate \(\beta\)s we get back something similar to an odds ratio that we saw already in the Logistic Regression tutorial. But now representing the proportional hazard compared to the baseline hazard.
6 Accelerated Failure Time (AFT) Model
When you combine the covariates effects (\(\beta_i\)) while making assumptions about the underlying distribution of the survival times we get a full parametric approach. One of the most used parametric approach is the Accelerated Failure Time (AFT) model.
In an AFT model, the covariates are assumed to act directly on lifetime, just the same as in the PH model. Also in an AFT model we can model the underlying survival distribution as a known distribution but with unknown parameters to be estimated from data.
The most used distributions are: Exponential, Weibull, and Gompertz.
7 Pumas modeling
Let’s create some Pumas models for Time to Event models. First, let’s revisit the Pumas’ workflow.
7.1 Pumas’ Workflow
Pumas’ workflow
- Define a model. It can be a
PumasModel
or aPumasEMModel
. - Define a
Subject
andPopulation
. - Fit your model.
- Perform inference on the model’s population-level parameters (also known as fixed effects).
- Predict from a fitted model using the empirical Bayes estimate or Simulate random observations from a fitted model using the sampling distribution (also known as likelihood).
- Model diagnostics and visual predictive checks.
Pumas modeling is a highly iterative process. Eventually you’ll probably go back and forth in the stages of workflow. This is natural and expected in Pumas’ modeling. Just remember to have a good code organization and version control for all of the workflow iterations.
7.1.1 Defining a model in Pumas
To define a model, you start with the @model
macro and a begin .. end
block of code:
= @model begin end my_model
PumasModel
Parameters:
Random effects:
Covariates:
Dynamical system variables:
Dynamical system type: No dynamical model
Derived:
Observed:
This creates an empty model. Now we need to populate it with model components. These are additional macros that we can include inside the @model
definition. We won’t be covering all the possible macros that we can use in this lesson, but here is the full list:
@param
, fixed effects specifications.@random
, random effects specifications.@covariates
, covariate names.@pre
, pre-processing variables for the dynamic system and statistical specification.@dosecontrol
, specification of any dose control parameters present in the model.@vars
, shorthand notation.@init
, initial conditions for the dynamic system.@dynamics
, dynamics of the model.@derived
, statistical modeling of dependent variables.@observed
, model information to be stored in the model solution.
7.1.1.1 Model parameter with @param
The model parameters go inside the @param
block. The syntax is the following:
@param begin
∈ Domain(...)
ParameterName ...
end
To use the “in” (∈
) operator in Julia, you can either replace ∈
for in
or type \in <TAB>
for the Unicode character.
By using the begin ... end
block we can specify one parameter per line. We will just name the parameters the same name as it’s respective covariate, while also adding the β
prefix to make clear that this is a coefficient.
We will just name the parameters the same name as it’s respective covariate, while also adding the β
prefix to make clear that this is a coefficient.
RealDomain
for scalar parametersVectorDomain
for vectorsPDiagDomain
for positive definite matrices with diagonal structurePSDDomain
for general positive semi-definite matrices
If you don’t specify any arguments inside the domain constructor it will either error (for some domains that have required arguments) or will use the defaults. In the case of the RealDomain()
without arguments it just uses the following arguments:
RealDomain(; lower = -∞, upper = ∞, init = 0)
7.1.1.2 Model covariates with @covariates
We also have to specify our model covariates using the macro @covariates
. This can be done using a one-liner syntax, without begin ... end
block:
@covariates Group var1 var2 var3 # and so on
Or using the begin ... end
block and specifying one covariate per line. This is what we will do, since our model has several covariates, which makes easy to visualize them and also to comment out a covariate to perform iterative model exploration:
@covariates begin
var1
var2
var3...
end
7.1.1.3 Pre-processing variables with @pre
We can specify all the necessary variable and statistical pre-processing with the @pre
macro.
The @pre
block is traditionally used to specify the inputs of the Ordinary Differential Equations (ODE) system used for non-linear mixed-effects PK/PD models (NLME).
But we will use @pre
to specify our variable transformations instead.:::
Here, we are defining all the coefficients-covariates multiplication operation here. Note that we are adding the prefix _
to the resulting operation:
@pre begin
= @. var1 * βvar1
_var1 = @. var2 * βvar2
_var2 = @. var3 * βvar3
_var3 end
We are using the @.
macro which tells Julia to vectorize (add the “dot syntax”) to all operations and function calls to the right of it.
7.1.1.4 Statistical modeling of dependent variables with @derived
Our final block, @derived
, is used to specify all the assumed distributions of observed variables that are derived from the blocks above. This is where we include our dependent variable: depvar
.
Note that depvar
is being declared as following a Normal
distribution. This is why we use the tilde notation ~
. It means (much like the mathematical model notation) that depvar
follows a Normal
distribution. Since depvar
might be a vector of values, we need to broadcast, i.e. vectorize, the operation with the dot .
operator:
~ Normal.(μ, σ) depvar
where μ
and σ
are the parameters mean and standard deviation that parametrizes the Normal distribution.
If we would do more operations to the right of the ~
, we would need to vectorize all of them. This is cumbersome, so we can use the @.
macro which tells Julia to apply the .
in every operator and function call after it:
~ @. Normal(μ, σ) depvar
In this block we can use all variables defined in the previous blocks, in our example the @param
, @covariates
and @pre
blocks.
7.1.2 Define a Subject
and Population
Once we have our model defined we have to specify a Subject
and a Population
.
In Pumas, subjects (i.e. observations) are represented by the Subject
type and collections of subjects are represented as vectors of Subject
s are defined as Population
.
Subjects
can be constructed with the Subject()
constructor, for example:
Subject()
Subject
ID: 1
We just constructed a Subject
that has ID
equal to 1
and no extra information. In order to provide extra information, you would need to pass keyword arguments to the Subject()
constructor.
You can use the following keyword arguments inside Subject()
:
id
: identifier.observations
: a named tuple of the dependent variables.covariates
: a named tuple containing the covariates, ornothing
.events
: a vector ofEvent
s.time
: a vector of time stamps for the observations.
For this lesson, we will just use the covariates
keyword argument.
This is an example of a Subject
with ID
equal to 42
and with some covariates
specified as a named tuple:
Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70))
Subject
ID: 42
Covariates: TRT, AUC, WT
Since a Population
is just a vector of Subjects
, we can use a simple map()
function over the ID
s present at our dataset. For example:
= map(
pop -> Subject(id = i, covariates = (; TRT = df.TRT, AUC = df.AUC, WT = df.WT)),
i 1:maximum(df.ID),
)
7.1.2.1 Reading Subject
s directly from a DataFrame
Another option is to use the read_pumas()
function to read Subject
s (and Population
) directly from a DataFrame
. This is more convenient than the map()
with the Subject()
constructor inside.
The read_pumas()
function accepts as first argument a DataFrame
followed by the following keyword options (which are very similar to the Subject()
constructor):
observations
: dependent variables specified by a vector of column names, i.e.[:depvar]
.covariates
: covariates specified by a vector of column names, i.e.[:TRT, :AUC, :WT, ...]
.id
: specifies theID
column of theDataFrame
.time
: specifies thetime
column of theDataFrame
.event_data
: toggles assertions applicable to event data, eithertrue
or `false.
Even if you don’t have a :time
column the read_pumas
function expects it. So you can pass a column of zeros as Float64
:
@rtransform! df :time = 0.0
= read_pumas(df; ..., time = :time) pop
7.1.3 Fit your model
Now that you have a population, you can fit your model!
Model fitting in Pumas has the purpose of estimating parameters and is done by calling the fit()
function with the following positional arguments:
- Pumas model.
- a
Population
. - a named tuple of the initial parameter values.
- an inference algorithm.
7.1.3.1 Initial Parameter’s Values
We already covered model and Population
, now let’s talk about initial parameter values. It is the 3rd positional argument inside fit()
.
You can specify your initial parameters as a named tuple. For instance, if you want to have a certain parameter, β1
, as having an initial value as 3.14
, you can do so by passing it inside a named tuple in the 3rd positional argument of fit()
:
fit(model, population, (; β1 = 3.14))
You can also use the helper function init_params()
which will recover all the initial parameters we specified inside the model’s @param
block.
7.1.3.2 Inference Algorithms
Finally, our last (4th) positional argument is the choice of inference algorithm.
Pumas has the following available inference algorithms:
Marginal Likelihood Estimation:
NaivePooled()
: first order approximation without random-effects.FO()
: first-order approximation.FOCE()
: first-order conditional estimation with automatic interaction detection.LaplaceI()
: second-order Laplace approximation.
Bayesian with Markov Chain Monte Carlo (MCMC):
BayesMCMC()
: MCMC using No-U-Turn Sampler (NUTS).
We can also use a Maximum A Posteriori (MAP) estimation procedure for any marginal likelihood estimation algorithm. You just need to call the MAP()
constructor with the desired marginal likelihood algorithm inside, for instance:
fit(model, population, init_params(model), MAP(FOCE()))
7.1.4 Perform inference on the model’s population-level parameters
Once the model is fitted, we can analyze our inference and estimates.
We can see that after the model is fitted, it prints a result with a summary and a table of the parameter estimates.
We can also recover the estimates as a named tuple with coef()
. Or as a DataFrame
with coeftable()
.
We can also inspect our estimated coefficients’ standard errors (SE
) along with the 95% confidence intervals with the infer()
function
7.1.5 Predict from a fitted model or Simulate random observations from a fitted model
Now that we have our model estimates, we can use it to make predictions or simulate random observations.
7.1.5.1 Predictions with predict()
In order to calculate prediction from a fitted model, we need to use the predict()
function and passing two positional arguments:
- a fitted Pumas model.
- either a single
Subject
or a collection of them with aPopulation
.
For example to make predictions onto a single Subject
, we can:
= Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70)) my_subject
Subject
ID: 42
Covariates: TRT, AUC, WT
predict()
also needs a keyword argument obstimes
which represents the observation times that we want to predict the dependent variable. The value must be a vector. Since we used a dummy 0.0
value as :time
variable in our read_pumas
function, we can pass the same 0.0
but as a vector to the predict()
function.
predict(model_fit, my_subject; obstimes = [0.0])
We can also generate prediction for a Population
:
predict(model_fit, pop; obstimes = [0.0])
7.1.5.2 Simulations with simobstte()
Time to event models uses the simobstte
function instead of the simobs
function we covered in all of the discrete models so far.
We can generate random observations from our fitted Pumas model with the simobstte()
function. You need to supply at least three positional arguments:
- a non-fitted Pumas model, in our case
model
. - either a single
Subject
or a collection of them with aPopulation
. - a named tuple of parameter values.
The 1st and 2nd positional arguments we already covered in our previous examples. Let’s focus on the 3rd positional argument: param
. It can be a named tuple with the parameter values. For instance, we can use estimated parameters from a fitted model with coef(model_fit)
:
simobstte(model, pop, coef(model_fit))
You can also pass a single parameter as a Pumas fitted model to simobstte()
.
simobstte(model_fit)
7.1.6 Model diagnostics
Finally, our last step is to assess model diagnostics.
7.1.6.1 Assessing model diagnostics
To assess model diagnostics we can use the inspect()
function in a fitted Pumas model:
inspect(model_fit)
We can also check for Information Criteria, such as:
- Akaike Information Criterion (AIC):
aic(model_fit)
- Bayesian Information Criterion (BIC):
bic(model_fit)
8 tte_single
dataset
The tte_single
dataset from PharmaDatasets.jl
has just a dependent variable :DV
and covariates :TIME
and :DOSE
.
Since it is a single Time to Event (TTE) it has an :EVID
column filled with 3
s and 0
s.
:EVID
columns describes the event ID. It uses a pattern reminiscent from NONMEM, where 1
specifies a normal event. 3
means it’s a reset event, meaning that the value of the dynamical variable is reset to the amt
at the dosing event. If 4
, then the dynamical value and time are reset, and then a final dose is given. Defaults to 1
. For more about EVID
s see the Dosage Regimens documentation.
First, let’s load the dataset with PharmaDatasets.jl
:
using PharmaDatasets
= dataset("tte_single")
tte_single first(tte_single, 5)
Row | ID | TIME | DV | EVID | DOSE |
---|---|---|---|---|---|
Int64 | Float64 | Int64 | Int64 | Int64 | |
1 | 1 | 0.0 | 0 | 3 | 1 |
2 | 1 | 484.537 | 1 | 0 | 1 |
3 | 2 | 0.0 | 0 | 3 | 1 |
4 | 2 | 246.129 | 1 | 0 | 1 |
5 | 3 | 0.0 | 0 | 3 | 1 |
8.1 Exploratory Data Analysis
Let’s do some exploratory data analysis and start with loading DataFramesMeta.jl
:
using DataFrames
using DataFramesMeta
As you can see every ID
has two unique EVID
s:
3
for the reset event, i.e. the start of the dosing event.1
for a normal event, i.e. when a “failure” happens.
We have 300 unique ID
s from 1
to 300
:
= @by tte_single :ID :unique_EVID = length(:EVID)
by_df last(by_df, 5)
Row | ID | unique_EVID |
---|---|---|
Int64 | Int64 | |
1 | 296 | 2 |
2 | 297 | 2 |
3 | 298 | 2 |
4 | 299 | 2 |
5 | 300 | 2 |
Let’s see some summary statistics with describe()
:
describe(tte_single)
Row | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|
Symbol | Float64 | Real | Float64 | Real | Int64 | DataType | |
1 | ID | 150.5 | 1 | 150.5 | 300 | 0 | Int64 |
2 | TIME | 127.492 | 0.0 | 2.95488 | 500.0 | 0 | Float64 |
3 | DV | 0.413333 | 0 | 0.0 | 1 | 0 | Int64 |
4 | EVID | 1.5 | 0 | 1.5 | 3 | 0 | Int64 |
5 | DOSE | 0.5 | 0 | 0.5 | 1 | 0 | Int64 |
We need to have summary statistics by :EVID
. This can be accomplished with a groupby
or @by
:
@chain tte_single begin
@by :EVID :mean_failure = mean(:TIME)
@rsubset :EVID != 3
end
Row | EVID | mean_failure |
---|---|---|
Int64 | Float64 | |
1 | 0 | 254.984 |
The mean time of failure is the mean of the variable :TIME
only for EVID
s values different than 3: ≈255.
Also we would like to know if the mean time of failure differs for different :DOSE
values. :DOSE
can take only values 0
and 1
:
unique(tte_single.DOSE)
2-element Vector{Int64}:
1
0
This can also be accomplished with a groupby
or @by
:
@chain tte_single begin
@by [:DOSE, :EVID] :mean_failure = mean(:TIME)
@rsubset :EVID != 3
@select $(Not(:EVID))
end
Row | DOSE | mean_failure |
---|---|---|
Int64 | Float64 | |
1 | 0 | 180.261 |
2 | 1 | 329.707 |
So, definitely observations that have a value 1
of DOSE
have higher failures time than observations with DOSE
having a value of 0
.
8.2 Visualizations
Now let’s plot some visualizations with AlgebraOfGraphics.jl
using CairoMakie.jl
as a backend:
First, we need to convert tte_single
dataset to a time-series type of dataset. This can be accomplished by keeping only the EVID
s different from 3, then grouping by both :DOSE
and :TIME
, and counting the occurrences of :TIME
:
= @chain tte_single begin
tte_single_ts @rsubset :EVID != 3
@by [:DOSE, :TIME] :total_failure = length(:TIME)
@orderby :TIME :DOSE
end
Please heed that our tte_single
dataset is censored. So every observation that has not “failed” until time 500 automatically takes value 500
for the :TIME
column. This is a problem for the time series visualization, so we will keep only the observations with :TIME
lower than 500 in the tte_single_ts
dataset.
sum(tte_single_ts.total_failure)
300
@rsubset! tte_single_ts :TIME < 500;
sum(tte_single_ts.total_failure)
248
We can see that we have a total failure of 248 observations from the total of 300.
We need one last thing for our time series formatted data tte_single_ts
. In order to get a plot of the survival function through time, we need to normalize the :total_failure
to the interval from 0
to 1
and also calculate the cumulative sum by different :DOSE
values:
= groupby(tte_single_ts, :DOSE);
d0, d1
@transform! d0 @astable begin
= sum(:total_failure)
sum_total_failure = cumsum(:total_failure)
cumulative_failure :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end
@transform! d1 @astable begin
= sum(:total_failure)
sum_total_failure = cumsum(:total_failure)
cumulative_failure :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end
= vcat(d0, d1) tte_single_ts_combined
Now we can plot it:
data(tte_single_ts_combined) *
mapping(:TIME, :survival; color = :DOSE => nonnumeric) *
visual(Lines) |> draw
As we can clearly see, :DOSE
impacts the survival function.
8.3 Pumas Models
Now let’s create three Pumas AFT models. One for each of the assumed underlying distribution of the survival function:
- Exponential
- Weibull
- Gompertz
8.3.1 Exponential AFT Model
Let’s start with the simplest AFT model: exponential.
We define two parameters in the @params
block:
λ₁
: basal hazard.β
: coefficient for covariate:DOSE
.
Inside the @pre
block we define _λ₀
as the total hazard.
In the @vars
block we include our underlying exponential distribution. For the exponential distribution, the estimated hazard λ
is simply equal to the total hazard _λ₀
.
Now, most important, in the @dynamics
block, we defined the cumulative hazard as Λ
and the its derivative Λ'
equal to the estimated hazard λ
.
Finally, in the @derived
block, we condition our dependent variable :DV
as a TimeToEvent
Pumas’ distribution. TimeToEvent
takes the estimated hazard λ
as the first positional parameter and as the second positional parameter the cumulative hazard Λ
.
The dependent variable in a model that uses TimeToEvent
should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.
So, this means that your DV
has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1
, and DV = 0
otherwise.
Since DV
is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot .
operator. We do this with the @.
macro.
Here is the full TTE exponential Pumas model:
It is always best to initialize your parameters to something other than 0
(the default of RealDomain
). This makes the iterative estimation procedures more numerically robust. Whenever uncertain you can choose a small value close to 0
. In our case, I’ve chosen 0.001
.
= @model begin
tte_exp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β end
@covariates DOSE
@pre begin
= λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# exponential
= _λ₀
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
PumasModel
Parameters: λ₁, β
Random effects:
Covariates: DOSE
Dynamical system variables: Λ
Dynamical system type: Matrix exponential
Derived: DV, λ
Observed: DV, λ
8.3.2 Weibull AFT Model
Let’s move on to the second AFT model: Weibull.
We now define three parameters, one additional from the exponential model, in the @params
block:
λ₁
: basal hazard.β
: coefficient for covariate:DOSE
.Κ
: the shape parameter of the underlying AFT model Weibull distribution. This is theα
parameter in the Weibull plots above in the beginning of the tutorials. Weibull distribution is used extensively by several modeling communities and, because of historical reasons, there are several naming conventions for the parameters that characterize the distribution. We will useK
from now, since it is customary in the biomedical modeling community.
Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard λ
inside the @vars
Inside the @pre
block we define _λ₀
as the total hazard.
In the @vars
block we include our underlying Weibull distribution. For the Weibull distribution, the estimated hazard λ
is the probability density function (PDF) of the Weibull distribution condition on the total hazard _λ₀
and also parameter K
.
Now, most important, in the @dynamics
block, we defined the cumulative hazard as Λ
and the its derivative Λ'
equal to the estimated hazard λ
.
Finally, in the @derived
block, we condition our dependent variable :DV
as a TimeToEvent
Pumas’ distribution. TimeToEvent
takes the estimated hazard λ
as the first positional parameter and as the second positional parameter the cumulative hazard Λ
.
The dependent variable in a model that uses TimeToEvent
should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.
So, this means that your DV
has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1
, and DV = 0
otherwise.
Since DV
is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot .
operator. We do this with the @.
macro.
Here is the full TTE Weibull Pumas model:
= @model begin
tte_wei_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β ∈ RealDomain(; lower = 0, init = 0.001)
Κ end
@covariates DOSE
@pre begin
= Κ
_Κ = λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# Weibull
# 1e-10 for model numerical stability
= _λ₀ * _Κ * (_λ₀ * t + 1e-10)^(_Κ - 1)
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
PumasModel
Parameters: λ₁, β, Κ
Random effects:
Covariates: DOSE
Dynamical system variables: Λ
Dynamical system type: Nonlinear ODE
Derived: DV, λ
Observed: DV, λ
8.3.3 Gompertz AFT Model
Let’s move on to the third and final AFT model: Gompertz.
We also define three parameters, one additional from the exponential model, in the @params
block:
λ₁
: basal hazard.β
: coefficient for covariate:DOSE
.Κ
: the shape parameter of the underlying AFT model Gompertz distribution.
Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard λ
inside the @vars
Inside the @pre
block we define _λ₀
as the total hazard.
In the @vars
block we include our underlying Weibull distribution. For the Gompertz distribution, the estimated hazard λ
is the probability density function (PDF) of the Gompertz distribution condition on the total hazard _λ₀
and also parameter K
.
Now, most important, in the @dynamics
block, we defined the cumulative hazard as Λ
and the its derivative Λ'
equal to the estimated hazard λ
.
Finally, in the @derived
block, we condition our dependent variable :DV
as a TimeToEvent
Pumas’ distribution. TimeToEvent
takes the estimated hazard λ
as the first positional parameter and as the second positional parameter the cumulative hazard Λ
.
The dependent variable in a model that uses TimeToEvent
should be a censoring variable that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.
So, this means that your DV
has to take only values of 1 and 0. If you observe a “failure” event until the end of the data collection, you should make DV = 1
, and DV = 0
otherwise.
Since DV
is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot .
operator. We do this with the @.
macro.
Here is the full TTE Gompertz Pumas model:
= @model begin
tte_gomp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β ∈ RealDomain(; lower = 0, init = 0.001)
Κ end
@covariates DOSE
@pre begin
= Κ
_Κ = λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# Gompertz
= _λ₀ * exp(_Κ * t)
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
PumasModel
Parameters: λ₁, β, Κ
Random effects:
Covariates: DOSE
Dynamical system variables: Λ
Dynamical system type: Nonlinear ODE
Derived: DV, λ
Observed: DV, λ
8.4 tte_single
Population
Before we do any fitting of the models, we need to convert our dataset tte_single
into a Population
with the read_pumas
function.
Since we have different :EVID
values, we need to also provide :AMT
and :CMT
columns.
:EVID
column is specified using the evid
keyword argument. :AMT
column is the dose amount column and is specified using the amt
keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an :AMT
column filled with 0
s. :CMT
column is the compartment column and is specified using the cmt
keyword argument. Since we do not have any pharmacokinetics in our model and data, we also simply create an :CMT
column filled with 1
s to denote a single compartment model:
@rtransform! tte_single :AMT = 0 :CMT = 1;
= read_pumas(
tte_single_pop
tte_single;= [:DV],
observations = [:DOSE],
covariates = :ID,
id = :TIME,
time = :AMT,
amt = :CMT,
cmt = :EVID,
evid )
Population
Subjects: 300
Covariates: DOSE
Observations: DV
8.5 Estimation with fit
Now that we have Pumas models and a population we can start the estimation with fit
. We do not have any mixed-effects, so we will use the NaivePooled()
maximum likelihood estimation algorithm.
Let’s fit all three models using the same population:
=
tte_single_exp_fit fit(tte_exp_model, tte_single_pop, init_params(tte_exp_model), Pumas.NaivePooled());
=
tte_single_wei_fit fit(tte_wei_model, tte_single_pop, init_params(tte_wei_model), Pumas.NaivePooled());
=
tte_single_gomp_fit fit(tte_gomp_model, tte_single_pop, init_params(tte_gomp_model), Pumas.NaivePooled());
8.6 Comparing Estimates with compare_estimates
We can use the function compare_estimates
from PumasUtilities.jl
to compare estimates across different models.
Let’s load PumasUtilities.jl
:
using PumasUtilities
The compare_estimates
takes a NamedTuple
of models:
compare_estimates(;
= tte_single_exp_fit,
Exponential = tte_single_wei_fit,
Weibull = tte_single_gomp_fit,
Gompertz )
Row | parameter | Exponential | Weibull | Gompertz |
---|---|---|---|---|
String | Float64? | Float64? | Float64? | |
1 | λ₁ | 0.00532562 | 0.00496489 | 0.00375767 |
2 | β | -0.92922 | -0.811505 | -1.08131 |
3 | Κ | missing | 1.30176 | 0.00223201 |
As we can see all three AFT models have similar estimates for hazard λ₁
and DOSE
effect on hazard β
.
Let’s bring also the confidence intervals and standard error for the estimates of each model.
We can calculate confidence intervals with the infer
function, and then collect them into a DataFrame
with the coeftable
function. We do a little bit of data wrangling to get the comparisons and also exponentiate the estimates to make them interpretable:
= coeftable(infer(tte_single_exp_fit))
df_exponential = coeftable(infer(tte_single_wei_fit))
df_weibull = coeftable(infer(tte_single_gomp_fit))
df_gompertz
@rsubset! df_exponential :parameter == "β"
@rtransform! df_exponential :model = "exponential"
@rsubset! df_weibull :parameter == "β"
@rtransform! df_weibull :model = "Weibull"
@rsubset! df_gompertz :parameter == "β"
@rtransform! df_gompertz :model = "Gompertz"
= vcat(df_exponential, df_weibull, df_gompertz)
df
@rtransform! df $(
:estimate, :ci_lower, :ci_upper] .=>
[-> exp.(x)) .=> [:estimate, :ci_lower, :ci_upper]
(x )
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Row | parameter | estimate | se | relative_se | ci_lower | ci_upper | model |
---|---|---|---|---|---|---|---|
String | Float64 | Float64 | Float64 | Float64 | Float64 | String | |
1 | β | 0.394862 | 0.112017 | 0.120549 | 0.317027 | 0.491806 | exponential |
2 | β | 0.444189 | 0.102203 | 0.125942 | 0.363557 | 0.542704 | Weibull |
3 | β | 0.339152 | 0.134403 | 0.124297 | 0.26061 | 0.441365 | Gompertz |
Weibull AFT model has the lowest standard error, as display in the column :se
, and, consequently, the narrowest confidence interval.
8.6.1 Visual Comparisons
Let’s do some visual comparisons between the models. We are going to use the estimated hazard and DOSE
covariate effect to plot different estimated survival functions across all three models using color to differentiate between DOSE
equal 0 or 1.
This means a line plot with:
t
in the x-axis- estimated hazard in the y-axis
- color is
DOSE
To get the estimated hazard as a function of time t
we are going to use the formula in the @vars
block in each model while using the estimated parameters from each model in the calculation.
First, some helper functions:
function hazard_exponential(λ₁, β, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
function hazard_weibull(λ₁, β, Κ, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ * Κ * (λ₀ * t + 1e-10)^(Κ - 1) for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
function hazard_gompertz(λ₁, β, Κ, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ * exp(Κ * t) for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
These functions will return a length-\(T\) vector of the survival distribution. That is, 1 minus the cumulative hazard until time \(T\) depending on the DOSE
value.
Now let’s plot it:
= 1:200
t
# exponential estimates
= hazard_exponential(
λ_exponential_dose_0 0.00532562, # estimated λ₁
-0.92922, # estimated β
0, # DOSE value
length(t), # T
)= hazard_exponential(
λ_exponential_dose_1 0.00532562, # estimated λ₁
-0.92922, # estimated β
1, # DOSE value
length(t), # T
)
# Weibull estimates
= hazard_weibull(
λ_weibull_dose_0 0.00496489, # estimated λ₁
-0.811505, # estimated β
1.30176, # Κ
0, # DOSE value
length(t), # T
)= hazard_weibull(
λ_weibull_dose_1 0.00496489, # estimated λ₁
-0.811505, # estimated β
1.30176, # Κ
1, # DOSE value
length(t), # T
)
# Gompertz estimates
= hazard_gompertz(
λ_gompertz_dose_0 0.00375767, # estimated λ₁
-1.08131, # estimated β
0.00223201, # Κ
0, # DOSE value
length(t), # T
)= hazard_gompertz(
λ_gompertz_dose_1 0.00375767, # estimated λ₁
-1.08131, # estimated β
0.00223201, # Κ
1, # DOSE value
length(t), # T
)
=
plt data(
# data in a long table format
(;= repeat(t, 6), # 2 DOSES ⋅ 3 AFT models
t = vcat(
λs # exponential estimates
λ_exponential_dose_0,
λ_exponential_dose_1,# Weibull estimates
λ_weibull_dose_0,
λ_weibull_dose_1,# Gompertz estimates
λ_gompertz_dose_0,
λ_gompertz_dose_1,
),= repeat(
DOSE 0, 1];
[= length(t), # repeat 0 500x then 1 500x
inner = 3, # 3 AFT models
outer
),= repeat(["exponential", "Weibull", "Gompertz"]; inner = 2 * length(t)), # repeat each model 1_000x
MODEL *
)) mapping(:t => L"t", :λs => L"S(t)"; color = :DOSE => nonnumeric, row = :MODEL) *
visual(Lines)
draw(plt; axis = (; yticks = 0.0:0.25:1.0))
The models differ little in the general estimation of the survival function across different DOSE
values.
9 tte_repeated
dataset
The tte_repeated
dataset from PharmaDatasets.jl
has just a dependent variable :DV
and covariate :TIME
.
This type of data is common when the time to event can occurs several times. For example, patient death is generally a single TTE data. Whereas, patient experiencing vomiting or being afflicted with cancer can occur more than once, then it is a repeated TTE data.
= dataset("tte_repeated")
tte_repeated first(tte_repeated, 5)
Row | ID | DV | TIME |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 0 | 288.0 |
2 | 2 | 1 | 22.6626 |
3 | 2 | 1 | 29.3927 |
4 | 2 | 0 | 288.0 |
5 | 3 | 0 | 288.0 |
We can definitely see that we have 316 repeated observations from 120 IDs, from 1 to 120.
tte_repeated
does not have an :EVID
column like the tte_single
dataset. Hence we need to add an evid = 3
row for each measurement row evid = 0
.
:TAD] =
tte_repeated[!, combine(i -> [first(i.TIME); diff(i.TIME)], groupby(tte_repeated, [:ID])).x1
# Add EVID=3 events with TAD=1e-10 to reset the integration right after each observation
= DataFrame(;
evd = tte_repeated.ID,
ID = missing,
AMT = missing,
DV = tte_repeated.TIME,
TIME = 1e-10,
TAD = 3,
EVID
):EVID] .= 0
tte_repeated[!, :AMT] .= 0
tte_repeated[!, = vcat(tte_repeated, evd)
tte_repeated_evid sort!(tte_repeated_evid, [:ID, :TIME, :EVID])
Also, we have, on average, 2.6 events per ID:
@chain tte_repeated_evid begin
@rsubset :EVID == 0
@by :ID :freq_DV = length(:DV)
@combine :mean_freq_DV = mean(:freq_DV)
end
Row | mean_freq_DV |
---|---|
Float64 | |
1 | 2.63333 |
9.1 Pumas Models
Since we do not have any covariates besides :TIME
our previous defined Pumas models would need to be rewritten without DOSE
.
I will show an example with the exponential AFT model. The code for Weibull and Gompertz should be the same as above but without the β
and DOSE
.
= @model begin
tte_repeated_exp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ end
@pre begin
= λ₁
_λ₀ end
@vars begin
# exponential
= _λ₀
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
PumasModel
Parameters: λ₁
Random effects:
Covariates:
Dynamical system variables: Λ
Dynamical system type: Matrix exponential
Derived: DV, λ
Observed: DV, λ
9.2 tte_repeated
Population
Like we did for tte_single
, we need to convert our dataset tte_repeated
into a Population
with the read_pumas
function.
Since we have a dependent variable :DV
, we need to also provide an :AMT
column.
:AMT
column is the dose amount column and is specified using the amt
keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an :AMT
column filled with 0
s:
@rtransform! tte_repeated_evid :AMT = 0;
= read_pumas(
tte_repeated_pop
tte_repeated_evid;= [:DV],
observations = :ID,
id = :TAD,
time = :EVID,
evid = :AMT,
amt )
9.3 Estimation with fit
Now that we have a Pumas model and a population we can start the estimation with fit
. We do not have any mixed-effects, so we will use the NaivePooled()
maximum likelihood estimation algorithm.
Let’s fit our tte_repeated_exp_model
using the population we just created:
= fit(
tte_repeated_exp_fit
tte_repeated_exp_model,
tte_repeated_pop,init_params(tte_repeated_exp_model),
NaivePooled(),
Pumas. )
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 1.388480e+03 1.614400e+02
* time: 0.00010800361633300781
1 1.251864e+03 1.020562e+02
* time: 0.002665996551513672
2 1.212583e+03 3.131723e+01
* time: 0.006311178207397461
3 1.210335e+03 1.514137e+01
* time: 0.008748054504394531
4 1.209782e+03 1.285035e+00
* time: 0.011193037033081055
5 1.209778e+03 4.741926e-02
* time: 0.013631105422973633
6 1.209778e+03 1.561555e-04
* time: 0.01604008674621582
FittedPumasModel
Successful minimization: true
Likelihood approximation: NaivePooled
Likelihood Optimizer: BFGS
Dynamical system type: Matrix exponential
Log-likelihood value: -1209.7782
Number of subjects: 120
Number of parameters: Fixed Optimized
0 1
Observation records: Active Missing
DV: 316 0
Total: 316 0
------------------
Estimate
------------------
λ₁ 0.0056713
------------------
10 Conclusion
I hope you enjoyed this tutorial on time to event modeling. Whenever you are dealing with survival data, or any other of time to event data, you can use any of the techniques described here. Also, don’t forget to check the recommended textbooks and resources in the Tip green box at the start of this tutorial.