`using Pumas`

# Time To Event

This lesson is really comprehensive and has a wide audience in mind. We recommend the reader to briefly navigate the Table of Contents and choose which topics to dive in. If you are in doubt feel free to read it from top to bottom.

In this lecture we’ll explore **t**ime **t**o **e**vent (TTE) models.

First, let’s define what are TTE models.

There are several resources available for a deep dive into time to event and survival models. Specifically, we recommend:

- Collett, D. (2015).
*Modelling survival data in medical research*. Chapman and Hall/CRC. - Smith, P. J. (2002).
*Analysis of failure and survival data*. Chapman and Hall/CRC. - Ibrahim, J. G., Chen, M. H., Sinha, D., Ibrahim, J. G., & Chen, M. H. (2001).
*Bayesian survival analysis*. New York: Springer.

## 1 What are Time to Event Models?

Time to event models, also called **survival analysis**, involves lifetime distributions. Despite having applications in engineering, social sciences, computer sciences and so on, we will focus on TTE models applied to health, i.e. lifetime distributions of biological systems.

In TTE models, we are generally interested in **lifetime distributions**. This can be analyzed as different populations, such as those under different treatment arms of a clinical trial. Or as covariates effects which are common in observational or epidemiological studies.

Most important in a TTE model is the survival function.

## 2 Survival Function

To understand survival function, first we need to learn what are **survival random variables**. **Survival random variables is simply a random variable whose outcomes are constrained to the positive domain**:

\[T \in [0, \infty)\]

Then, the **survival function is one minus the cumulative distribution function of the survival random variable \(T\)**:

\[\begin{aligned} F(t) &= P(T \leq t) = \int_0^t f(u)du \\ S(t) &= P(T > t) = 1 - F(t) = \int_t^\infty f(u)du \end{aligned}\]

where:

- \(F(t)\) is the
**c**umulative**d**istribution**f**unction of the survival random variable \(T\) at value \(T=t\). - \(S(t)\) is the survival function of the survival random variable \(T\) at value \(T=t\).

Furthermore, \(S(t)\) is a monotonic decreasing function which tends to 0 as \(t \to \infty\). This is the general shape of a survival function:

```
using AlgebraOfGraphics
using CairoMakie
```

```
= 0:0.1:5.0
x survival(t) = exp(-t)
=
plt data((; x, t = survival.(x))) *
mapping(:x => "time", :t => "Survival Function") *
visual(Lines; linewidth = 3)
draw(plt; axis = (; xticks = 0:5))
```

If you are curious why the `exp`

inside the survival function in the plot code, it is because \(S(y)\) is a cumulative function, hence by the fundamental theorem of calculus:

\[s(t) = \frac{d}{dt}S(t) = \frac{d}{dt}\Big( 1-F(t) \Big) = -f(t)\]

where \(s(t)\) is the derivative of the survival function \(S(t)\) and \(f(t)\) is the derivative of the CDF \(F(t)\).

If we exponentiate both left and right ends, we get:

\[S(t) = e^{-f(t)}\]

As you can see, as times goes by, survival function declines until it reaches almost zero at high time values.

### 2.1 Probability Distributions

Like all other phenomena we saw in these tutorials, we can use a **distribution to model survival functions**. There is a wide list of candidate distributions. However, most of the time these distributions used are:

**Exponential**: parametrized by \(\lambda\), which is called the rate.**Weibull**: parametrized by \(\alpha\) and \(\theta\), which are called shape and scale, respectively.

Let us show first the exponential distribution for different values of \(\alpha\):

`using Distributions`

```
= 0:0.1:10.0
x = [2, 5, 10]
λs = [Exponential(λ) for λ in λs]
dists = mapreduce(d -> pdf.(d, x), vcat, dists)
pdfs =
plt data((; λs = repeat(λs; inner = length(x)), x = repeat(x; outer = length(λs)), pdfs)) *
mapping(
:x => "outcome",
:pdfs => "PDF";
= :λs => nonnumeric => L"\mathrm{parameter}~\lambda",
color = :λs => nonnumeric,
col *
) visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))
```

As you can see the **exponential distribution has the characteristic of having an exponential decay**, in which the parameter **\(\lambda\) dictates the rate of the decay**. Higher values of \(\lambda\) indicate slower decay rates and moves the density to higher outcome values. Whereas lower values of \(\lambda\) have higher decay rates, moving the density of the distribution towards lower outcome values.

Note that we cannot change the general form of the exponential distribution: it will always be a **monotonic exponential decay curve**. This is why the **Weibull distribution** is also used to model survival functions. Take a look at some different values of \(\theta\) (scale parameter) while fixing the value of the shape parameter \(\alpha\) to 3 for Weibull distribution:

```
= 0:0.1:10.0
x = [2, 3, 5]
θs = 3
α = [Weibull(α, θ) for θ in θs]
dists = mapreduce(d -> pdf.(d, x), vcat, dists)
pdfs =
plt data((; θs = repeat(θs; inner = length(x)), x = repeat(x; outer = length(θs)), pdfs)) *
mapping(
:x => "outcome",
:pdfs => "PDF";
= :θs => nonnumeric => L"\mathrm{parameter}~\theta",
color = :θs => nonnumeric,
col *
) visual(Lines)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))
```

## 3 Censoring

We need to cover one last concept before we dive into how to model TTEs. **Censoring occurs when we are unable to observe the dependent variable in a survival analysis**.

For example, in a cancer survival clinical trial that lasts for 10 years after initial detection, we only have survival functions defined up to the total time of the trial. After the trial is finished, patients may have other outcomes, but we cannot observe them: they are **censored**.

We won’t be covering censoring in these tutorials. But it is important for you to know the concept and if you have censored observations in your analysis, you definitely should take them into account.

We will describe three possible ways to analyze survival data:

**Kaplan-Meyer Estimator**: a*non-parametric*procedure.**Cox Proportional Hazards Model**: a*semi-parametric*procedure.**Accelerated Failure Time Model**: a*parametric*procedure.

Below you’ll find a table that summarizes the main characteristics of each TTE analyzes approaches:

Model | Characteristics | Example |
---|---|---|

Non-Parametric |
Distribution of survival times is unknown; Survival Functions aren’t smooth; Difficult to incorporate complex covariates | Kaplan-Meier estimates |

Semi-Parametric |
Decomposes the hazard into a non-parametric baseline and a parametric covariate vector; Shape of baseline hazard is arbitrary; Allows for time-varying covariates | Cox-Proportial Hazards |

Parametric |
Distribution is assumed to be known and continuous; Hazard function completely specified; High degree of model flexibility | Exponential, Weibull, Log-Normal models |

## 4 Kaplan-Meyer (K-M) Estimator

The **Kaplan-Meyer Estimator is a non-parametric statistic used to estimate the survival function \(S(t)\)**.

Non-parametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions.

Non-parametric statistics is based on either being distribution-free or having a specified distribution but with the distribution’s parameters unspecified.

The basic idea is to estimate the survival function using the only assumption that the observations are independent. In other words, this means that what happens to observation \(a\) does not impact what happens to observation \(b\). They are only governed by the survival function.

We begin by constructing ordered time intervals for the survival function. We give these time intervals the index \(k\) and they have the following property:

\[t_1 > \ldots > t_k > t_{k+1}\]

and since the survival function is a monotonic decreasing function, this property also holds for the survival function:

\[S(t_1) > \ldots > S(t_k) > S(t_{k+1})\]

Then, the estimated survivor function at any time, \(t\), in the \(k\)th constructed time interval, will be the estimated probability of surviving beyond \(t_k\). To be more precise, this is the probability of surviving through the interval from t(k) to t(k+1), and all preceding intervals.

This leads us to the **K-M Estimate of the survival function**:

\[\hat{S}(t) = \prod^k_{j=1} \left( \frac{n_j - d_j}{n_j} \right)\]

where:

- \(n_j\) is the number of subjects who are alive just before time \(t_j\), including those who are about to die at this time.
- \(d_j\) is the number of subjects who die at time \(t)j\).

Since we discretized continuous time into discrete intervals, the K-M estimate of the survival function becomes a monotonic decreasing *step-function*. Here is a plot for an estimated K-M survival function using `Survival.jl`

available in Pumas’ sysimage and in any Pumas App:

`using Survival`

```
= [2, 2, 3, 5, 5, 7, 9, 16, 16, 18]
t = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0]
s = fit(KaplanMeier, t, s)
survival_fit =
plt data((; t = survival_fit.times, s_t = survival_fit.survival)) *
mapping(:t => L"t", :s_t => L"S(t)") *
visual(Stairs; step = :post, linewidth = 3)
draw(
plt;= (; xticks = 1:18, yticks = 0.0:0.1:1.0, limits = ((0, nothing), (0, nothing))),
axis )
```

The main **drawback** from K-M estimates is the difficulty to incorporate covariates. This is addressed by the Cox’s Proportional Hazards model, which is a semi-parametric approach.

## 5 Cox’s Proportional Hazards (PH) Model

Often when we are analyzing survival data, we have individual data that is can shed light into the relationship between individual characteristics and survival rates. These are known as **covariates** or **explanatory variables**. Common examples are: physiological variables such as hemoglobin and heart rate; and lifestyle factors such as smoking, diet and exercise.

It’s difficult to incorporate **covariate effects into the survival analysis** when using non-parametric approaches such as the K-M Estimator. Hence, we need different approaches for that.

**Semi-parametric approaches are statistical models that have parametric and nonparametric components**. The most used semi-parametric model is **Cox’s Proportional Hazards (PH) model**.

PH has the following definition:

\[h(t) = h_0(t) \cdot e^{\left( \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \right)}\]

where:

- \(h(t)\) is the hazard function at time \(t\)
- \(h_0(t)\) is the
*baseline*hazard at time \(t\) - \(x_i\) is the \(i\)th covariate
- \(\beta_i\) the coefficient for the covariate \(x_i\)

We are now using hazards instead of survival function. The hazard function is widely used to express the risk or hazard of an event such as death occurring at some time \(t\). Also some authors and literature might use \(\lambda(t)\) instead of \(h(t)\) to denote the hazard function.

Notice that PH is called ** proportional hazards** because we can rewrite the above equation as:

\[\ln \frac{h(t)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]

where \(\ln\) is the natural logarithm function. So the linear combination of coefficients \(\beta_i\) and covariates \(x_i\) is equal to the logarithm of the hazard function divided by the baseline hazard. Hence, the linear combination is the logarithm of a proportional hazard with respect to a baseline hazard.

We say PH is **semi-parametric** because there is a ** non-parametric component** since no particular form of probability distribution is assumed for the survival times. And also has a

**because we can estimate the coefficients \(\beta_i\).**

*parametric*component### 5.1 How to interpret coefficients in a PH model

For the **interpretation** of the coefficients in a PH model note that:

- \(\beta\)s are in
**log-scale**. - \(\beta\)s represents the
.*proportional*hazard

So, if we exponentiate \(\beta\)s we get back something similar to an odds ratio that we saw already in the Logistic Regression tutorial. But now representing the proportional hazard compared to the baseline hazard.

## 6 Accelerated Failure Time (AFT) Model

When you combine the covariates effects (\(\beta_i\)) while making assumptions about the underlying distribution of the survival times we get a ** full parametric approach**. One of the most used parametric approach is the

**Accelerated Failure Time (AFT) model**.

In an AFT model, the covariates are assumed to act directly on lifetime, just the same as in the PH model. Also in an AFT model we can model the underlying survival distribution as a **known distribution but with unknown parameters to be estimated from data**.

The most used distributions are: **Exponential**, **Weibull**, and **Gompertz**.

## 7 Pumas modeling

Let’s create some **Pumas models** for **Time to Event models**. First, let’s revisit the Pumas’ workflow.

### 7.1 Pumas’ Workflow

Pumas’ workflow

**Define a model**. It can be a`PumasModel`

or a`PumasEMModel`

.**Define a**.`Subject`

and`Population`

**Fit your model**.**Perform inference on the model’s population-level parameters**(also known as*fixed*effects).**Predict from a fitted model**using the empirical Bayes estimate or**Simulate random observations from a fitted model**using the sampling distribution (also known as*likelihood*).**Model diagnostics**and**visual predictive checks**.

**Pumas modeling is a highly iterative process**. Eventually you’ll probably go back and forth in the stages of workflow. This is natural and expected in Pumas’ modeling. Just remember to have a good code organization and version control for all of the workflow iterations.

#### 7.1.1 Defining a model in Pumas

To define a model, you start with the `@model`

macro and a `begin .. end`

block of code:

`= @model begin end my_model `

```
PumasModel
Parameters:
Random effects:
Covariates:
Dynamical variables:
Derived:
Observed:
```

This creates an **empty model**. Now we need to populate it with model components. These are additional macros that we can include inside the `@model`

definition. We won’t be covering all the possible macros that we can use in this lesson, but here is the full list:

, fixed effects specifications.`@param`

, random effects specifications.`@random`

, covariate names.`@covariates`

, pre-processing variables for the dynamic system and statistical specification.`@pre`

, specification of any dose control parameters present in the model.`@dosecontrol`

, shorthand notation.`@vars`

, initial conditions for the dynamic system.`@init`

, dynamics of the model.`@dynamics`

, statistical modeling of dependent variables.`@derived`

, model information to be stored in the model solution.`@observed`

##### 7.1.1.1 Model parameter with `@param`

The model parameters go inside the `@param`

block. The syntax is the following:

```
@param begin
∈ Domain(...)
ParameterName ...
end
```

To use the “in” (`∈`

) operator in Julia, you can either replace `∈`

for `in`

or type `\in <TAB>`

for the Unicode character.

By using the `begin ... end`

block we can specify one parameter per line. We will just name the parameters the same name as it’s respective covariate, while also adding the `β`

prefix to make clear that this is a coefficient.

We will just name the parameters the same name as it’s respective covariate, while also adding the `β`

prefix to make clear that this is a coefficient.

for scalar parameters`RealDomain`

for vectors`VectorDomain`

for positive definite matrices with diagonal structure`PDiagDomain`

for general positive semi-definite matrices`PSDDomain`

If you don’t specify any arguments inside the domain constructor it will either error (for some domains that have *required* arguments) or will use the defaults. In the case of the `RealDomain()`

without arguments it just uses the following arguments:

`RealDomain(; lower = -∞, upper = ∞, init = 0)`

##### 7.1.1.2 Model covariates with @covariates

We also have to specify our **model covariates using the macro @covariates**. This can be done using a one-liner syntax, without

`begin ... end`

block:`@covariates Group var1 var2 var3 # and so on`

Or using the `begin ... end`

block and specifying one covariate per line. This is what we will do, since our model has several covariates, which makes easy to visualize them and also to comment out a covariate to perform iterative model exploration:

```
@covariates begin
var1
var2
var3...
end
```

##### 7.1.1.3 Pre-processing variables with `@pre`

We can specify all the **necessary variable and statistical pre-processing with the @pre macro**.

:::{.callout-note} The `@pre`

block is traditionally used to specify the inputs of the **O**rdinary **D**ifferential **E**quations (ODE) system used for **n**on-**l**inear **m**ixed-**e**ffects PK/PD models (NLME).

But we will use `@pre`

to specify our variable transformations instead.:::

Here, we are defining all the coefficients-covariates multiplication operation here. Note that we are adding the prefix `_`

to the resulting operation:

```
@pre begin
= @. var1 * βvar1
_var1 = @. var2 * βvar2
_var2 = @. var3 * βvar3
_var3 end
```

We are using the `@.`

macro which tells Julia to vectorize (add the “dot syntax”) to all operations and function calls to the right of it.

##### 7.1.1.4 Statistical modeling of dependent variables with `@derived`

Our final block, ** @derived**, is used to

**specify all the assumed distributions of observed variables that are derived from the blocks above**. This is where we include our

**dependent variable:**.

`depvar`

Note that `depvar`

is being declared as following a `Normal`

distribution. This is why we use the tilde notation `~`

. It means (much like the mathematical model notation) that `depvar`

follows a `Normal`

distribution. Since `depvar`

might be a vector of values, we need to broadcast, i.e. vectorize, the operation with the dot `.`

operator:

`~ Normal.(μ, σ) depvar `

where `μ`

and `σ`

are the parameters mean and standard deviation that parametrizes the Normal distribution.

If we would do more operations to the right of the `~`

, we would need to vectorize all of them. This is cumbersome, so we can use the `@.`

macro which tells Julia to apply the `.`

in every operator and function call after it:

`~ @. Normal(μ, σ) depvar `

In this block we can use all variables defined in the previous blocks, in our example the `@param`

, `@covariates`

and `@pre`

blocks.

#### 7.1.2 Define a `Subject`

and `Population`

Once we have our model defined we have to specify a `Subject`

and a `Population`

.

In Pumas, subjects (i.e. observations) are represented by the `Subject`

type and collections of subjects are represented as vectors of `Subject`

s are defined as `Population`

.

** Subjects** can be constructed with the

`Subject()`

constructor, for example:`Subject()`

```
Subject
ID: 1
```

We just constructed a `Subject`

that has `ID`

equal to `1`

and no extra information. In order to provide extra information, you would need to pass **keyword arguments** to the `Subject()`

constructor.

You can use the following keyword arguments inside `Subject()`

:

: identifier.`id`

: a named tuple of the dependent variables.`observations`

: a named tuple containing the covariates, or`covariates`

`nothing`

.: a vector of`events`

`Event`

s.: a vector of time stamps for the observations.`time`

For this lesson, we will just use the `covariates`

keyword argument.

This is an example of a `Subject`

with `ID`

equal to `42`

and with some `covariates`

specified as a named tuple:

`Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70))`

```
Subject
ID: 42
Covariates: TRT, AUC, WT
```

**Since a Population is just a vector of Subjects**, we can use a simple

`map()`

function over the `ID`

s present at our dataset. For example:```
= map(
pop -> Subject(id = i, covariates = (; TRT = df.TRT, AUC = df.AUC, WT = df.WT)),
i 1:maximum(df.ID),
)
```

##### 7.1.2.1 Reading `Subject`

s directly from a `DataFrame`

Another option is to use the ** read_pumas()** function to read

`Subject`

s (and `Population`

) directly from a `DataFrame`

. This is more convenient than the `map()`

with the `Subject()`

constructor inside.The `read_pumas()`

function accepts as first argument a `DataFrame`

followed by the following keyword options (which are very similar to the `Subject()`

constructor):

: dependent variables specified by a vector of column names, i.e.`observations`

`[:depvar]`

.: covariates specified by a vector of column names, i.e.`covariates`

`[:TRT, :AUC, :WT, ...]`

.: specifies the`id`

`ID`

column of the`DataFrame`

.: specifies the`time`

`time`

column of the`DataFrame`

.: toggles assertions applicable to event data, either`event_data`

`true`

or `false.

Even if you don’t have a `:time`

column the `read_pumas`

function expects it. So you can pass a column of zeros as `Float64`

:

```
@rtransform! df :time = 0.0
= read_pumas(df; ..., time = :time) pop
```

#### 7.1.3 Fit your model

Now that you have a population, you can fit your model!

Model fitting in Pumas has the purpose of **estimating parameters** and is done by calling the ** fit() function** with the following positional arguments:

- Pumas
**model**. - a
.`Population`

- a named tuple of the
**initial parameter values**. - an
**inference algorithm**.

##### 7.1.3.1 Initial Parameter’s Values

We already covered model and `Population`

, now let’s talk about **initial parameter values**. It is the **3rd positional argument inside fit()**.

You can specify your initial parameters as a named tuple. For instance, if you want to have a certain parameter, `β1`

, as having an initial value as `3.14`

, you can do so by passing it inside a named tuple in the 3rd positional argument of `fit()`

:

`fit(model, population, (; β1 = 3.14))`

You can also use the helper function ** init_params()** which will recover all the initial parameters we specified inside the model’s

`@param`

block.##### 7.1.3.2 Inference Algorithms

Finally, our **last (4th) positional argument is the choice of inference algorithm**.

Pumas has the following available inference algorithms:

**Marginal Likelihood Estimation**:: first order approximation without random-effects.`NaivePooled()`

: first-order approximation.`FO()`

: first-order conditional estimation with automatic interaction detection.`FOCE()`

: second-order Laplace approximation.`LaplaceI()`

**Bayesian with Markov Chain Monte Carlo (MCMC)**:: MCMC using`BayesMCMC()`

**N**o-**U**-**T**urn**S**ampler (NUTS).

We can also use a **M**aximum **A** **P**osteriori (MAP) estimation procedure for any marginal likelihood estimation algorithm. You just need to call the `MAP()`

constructor with the desired marginal likelihood algorithm inside, for instance:

`fit(model, population, init_params(model), MAP(FOCE()))`

#### 7.1.4 Perform inference on the model’s population-level parameters

Once the model is fitted, we can analyze our inference and estimates.

We can see that after the model is fitted, it prints a result with a summary and a table of the parameter estimates.

We can also recover the estimates as a named tuple with ** coef()**. Or as a

`DataFrame`

with **.**

`coeftable()`

We can also inspect our estimated coefficients’ **standard errors ( SE)** along with the

**95% confidence intervals**with the

**function**

`infer()`

#### 7.1.5 Predict from a fitted model or Simulate random observations from a fitted model

Now that we have our model estimates, we can use it to make predictions or simulate random observations.

##### 7.1.5.1 Predictions with `predict()`

In order to calculate prediction from a fitted model, we need to use the `predict()`

function and passing **two positional arguments**:

- a fitted Pumas model.
- either a single
`Subject`

or a collection of them with a`Population`

.

For example to make predictions onto a single `Subject`

, we can:

`= Subject(id = 42, covariates = (; TRT = 1, AUC = 100.0, WT = 70)) my_subject `

```
Subject
ID: 42
Covariates: TRT, AUC, WT
```

`predict()`

also needs a **keyword argument obstimes** which represents the observation times that we want to predict the dependent variable. The value must be a vector. Since we used a dummy

`0.0`

value as `:time`

variable in our `read_pumas`

function, we can pass the same `0.0`

but as a vector to the `predict()`

function.`predict(model_fit, my_subject; obstimes = [0.0])`

We can also generate prediction for a `Population`

:

`predict(model_fit, pop; obstimes = [0.0])`

##### 7.1.5.2 Simulations with `simobstte()`

Time to event models uses the `simobstte`

function instead of the `simobs`

function we covered in all of the discrete models so far.

We can generate random observations from our fitted Pumas model with the `simobstte()`

function. You need to supply at least **three positional arguments**:

- a
*non-fitted*Pumas model, in our case`model`

. - either a single
`Subject`

or a collection of them with a`Population`

. - a named tuple of parameter values.

The 1st and 2nd positional arguments we already covered in our previous examples. Let’s focus on the **3rd positional argument: param**. It can be a named tuple with the parameter values. For instance, we can use estimated parameters from a fitted model with

`coef(model_fit)`

:`simobstte(model, pop, coef(model_fit))`

You can also pass a single parameter as a Pumas *fitted* model to `simobstte()`

.

`simobstte(model_fit)`

#### 7.1.6 Model diagnostics

Finally, our last step is to **assess model diagnostics**.

##### 7.1.6.1 Assessing model diagnostics

To assess model diagnostics we can use the `inspect()`

function in a fitted Pumas model:

`inspect(model_fit)`

We can also check for **I**nformation **C**riteria, such as:

**A**kaike**I**nformation**C**riterion (AIC):`aic(model_fit)`

**B**ayesian**I**nformation**C**riterion (BIC):`bic(model_fit)`

## 8 `tte_single`

dataset

The `tte_single`

dataset from `PharmaDatasets.jl`

has just a dependent variable `:DV`

and covariates `:TIME`

and `:DOSE`

.

Since it is a single Time to Event (TTE) it has an `:EVID`

column filled with `3`

s and `0`

s.

`:EVID`

columns describes the **event ID**. It uses a pattern reminiscent from NONMEM, where `1`

specifies a normal event. `3`

means it’s a reset event, meaning that the value of the dynamical variable is reset to the `amt`

at the dosing event. If `4`

, then the dynamical value and time are reset, and then a final dose is given. Defaults to `1`

. For more about `EVID`

s see the **Dosage Regimens** documentation.

First, let’s load the dataset with `PharmaDatasets.jl`

:

```
using PharmaDatasets
= dataset("tte_single")
tte_single first(tte_single, 5)
```

Row | ID | TIME | DV | EVID | DOSE |
---|---|---|---|---|---|

Int64 | Float64 | Int64 | Int64 | Int64 | |

1 | 1 | 0.0 | 0 | 3 | 1 |

2 | 1 | 484.537 | 1 | 0 | 1 |

3 | 2 | 0.0 | 0 | 3 | 1 |

4 | 2 | 246.129 | 1 | 0 | 1 |

5 | 3 | 0.0 | 0 | 3 | 1 |

### 8.1 Exploratory Data Analysis

Let’s do some exploratory data analysis and start with loading `DataFramesMeta.jl`

:

```
using DataFrames
using DataFramesMeta
```

As you can see every `ID`

has two unique `EVID`

s:

`3`

for the reset event, i.e. the start of the dosing event.`1`

for a normal event, i.e. when a “failure” happens.

We have 300 unique `ID`

s from `1`

to `300`

:

```
= @by tte_single :ID :unique_EVID = length(:EVID)
by_df last(by_df, 5)
```

Row | ID | unique_EVID |
---|---|---|

Int64 | Int64 | |

1 | 296 | 2 |

2 | 297 | 2 |

3 | 298 | 2 |

4 | 299 | 2 |

5 | 300 | 2 |

Let’s see some summary statistics with `describe()`

:

`describe(tte_single)`

Row | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|

Symbol | Float64 | Real | Float64 | Real | Int64 | DataType | |

1 | ID | 150.5 | 1 | 150.5 | 300 | 0 | Int64 |

2 | TIME | 127.492 | 0.0 | 2.95488 | 500.0 | 0 | Float64 |

3 | DV | 0.413333 | 0 | 0.0 | 1 | 0 | Int64 |

4 | EVID | 1.5 | 0 | 1.5 | 3 | 0 | Int64 |

5 | DOSE | 0.5 | 0 | 0.5 | 1 | 0 | Int64 |

We need to have summary statistics by `:EVID`

. This can be accomplished with a `groupby`

or `@by`

:

```
@chain tte_single begin
@by :EVID :mean_failure = mean(:TIME)
@rsubset :EVID != 3
end
```

Row | EVID | mean_failure |
---|---|---|

Int64 | Float64 | |

1 | 0 | 254.984 |

The mean time of failure is the mean of the variable `:TIME`

only for `EVID`

s values different than 3: ≈255.

Also we would like to know if the mean time of failure differs for different `:DOSE`

values. `:DOSE`

can take only values `0`

and `1`

:

`unique(tte_single.DOSE)`

```
2-element Vector{Int64}:
1
0
```

This can also be accomplished with a `groupby`

or `@by`

:

```
@chain tte_single begin
@by [:DOSE, :EVID] :mean_failure = mean(:TIME)
@rsubset :EVID != 3
@select $(Not(:EVID))
end
```

Row | DOSE | mean_failure |
---|---|---|

Int64 | Float64 | |

1 | 0 | 180.261 |

2 | 1 | 329.707 |

So, definitely observations that have a value `1`

of `DOSE`

have higher failures time than observations with `DOSE`

having a value of `0`

.

### 8.2 Visualizations

Now let’s plot some visualizations with `AlgebraOfGraphics.jl`

using `CairoMakie.jl`

as a backend:

First, we need to convert `tte_single`

dataset to a time-series type of dataset. This can be accomplished by keeping only the `EVID`

s different from 3, then grouping by both `:DOSE`

and `:TIME`

, and counting the occurrences of `:TIME`

:

```
= @chain tte_single begin
tte_single_ts @rsubset :EVID != 3
@by [:DOSE, :TIME] :total_failure = length(:TIME)
@orderby :TIME :DOSE
end
```

Please heed that our `tte_single`

dataset is censored. So every observation that has not “failed” until time 500 automatically takes value `500`

for the `:TIME`

column. This is a problem for the time series visualization, so we will keep only the observations with `:TIME`

lower than 500 in the `tte_single_ts`

dataset.

`sum(tte_single_ts.total_failure)`

`300`

```
@rsubset! tte_single_ts :TIME < 500;
sum(tte_single_ts.total_failure)
```

`248`

We can see that we have a total failure of 248 observations from the total of 300.

We need one last thing for our time series formatted data `tte_single_ts`

. In order to get a plot of the survival function through time, we need to normalize the `:total_failure`

to the interval from `0`

to `1`

and also calculate the cumulative sum by different `:DOSE`

values:

```
= groupby(tte_single_ts, :DOSE);
d0, d1
@transform! d0 @astable begin
= sum(:total_failure)
sum_total_failure = cumsum(:total_failure)
cumulative_failure :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end
@transform! d1 @astable begin
= sum(:total_failure)
sum_total_failure = cumsum(:total_failure)
cumulative_failure :survival = (sum_total_failure .- cumulative_failure) ./ sum_total_failure
end
= vcat(d0, d1) tte_single_ts_combined
```

Now we can plot it:

```
data(tte_single_ts_combined) *
mapping(:TIME, :survival; color = :DOSE => nonnumeric) *
visual(Lines) |> draw
```

As we can clearly see, ** :DOSE impacts the survival function**.

### 8.3 Pumas Models

Now let’s create three Pumas AFT models. One for each of the assumed underlying distribution of the survival function:

**Exponential****Weibull****Gompertz**

#### 8.3.1 Exponential AFT Model

Let’s start with the simplest AFT model: **exponential**.

We define two parameters in the `@params`

block:

: basal hazard.`λ₁`

: coefficient for covariate`β`

`:DOSE`

.

Inside the ** @pre** block we define

**as the**

`_λ₀`

**.**

*total*hazardIn the ** @vars** block we include our underlying

**exponential**distribution. For the exponential distribution, the

**.**

*estimated*hazard`λ`

is simply equal to the *total*hazard`_λ₀`

Now, most important, in the ** @dynamics** block, we defined the

**and the**

*cumulative*hazard as`Λ`

**its**equal to the

*derivative*`Λ'`

**.**

*estimated*hazard`λ`

Finally, in the ** @derived** block, we condition our

**as a**

*dependent*variable`:DV`

**.**

`TimeToEvent`

Pumas’ distribution**as the first positional parameter and as the second positional parameter the**

`TimeToEvent`

takes the *estimated*hazard`λ`

**.**

*cumulative*hazard`Λ`

The dependent variable in a model that uses `TimeToEvent`

should be a **censoring variable** that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your ** DV has to take only values of 1 and 0**. If you observe a “failure” event until the end of the data collection, you should make

`DV = 1`

, and `DV = 0`

otherwise.Since `DV`

is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot `.`

operator. We do this with the `@.`

macro.

Here is the **full TTE exponential Pumas model**:

It is always best to initialize your parameters to something other than `0`

(the default of `RealDomain`

). This makes the iterative estimation procedures more numerically robust. Whenever uncertain you can choose a small value close to `0`

. In our case, I’ve chosen `0.001`

.

```
= @model begin
tte_exp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β end
@covariates DOSE
@pre begin
= λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# exponential
= _λ₀
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
```

```
PumasModel
Parameters: λ₁, β
Random effects:
Covariates: DOSE
Dynamical variables: Λ
Derived: DV, λ
Observed: DV, λ
```

#### 8.3.2 Weibull AFT Model

Let’s move on to the second AFT model: **Weibull**.

We now define three parameters, one additional from the exponential model, in the `@params`

block:

: basal hazard.`λ₁`

: coefficient for covariate`β`

`:DOSE`

.: the shape parameter of the underlying AFT model Weibull distribution. This is the`Κ`

`α`

parameter in the Weibull plots above in the beginning of the tutorials. Weibull distribution is used extensively by several modeling communities and, because of historical reasons, there are several naming conventions for the parameters that characterize the distribution. We will use`K`

from now, since it is customary in the biomedical modeling community.

Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard `λ`

inside the `@vars`

Inside the ** @pre** block we define

**as the**

`_λ₀`

**.**

*total*hazardIn the ** @vars** block we include our underlying

**Weibull**distribution. For the Weibull distribution, the

**.**

*estimated*hazard`λ`

is the probability density function (PDF) of the Weibull distribution condition on the *total*hazard`_λ₀`

and also parameter `K`

Now, most important, in the ** @dynamics** block, we defined the

**and the**

*cumulative*hazard as`Λ`

**its**equal to the

*derivative*`Λ'`

**.**

*estimated*hazard`λ`

Finally, in the ** @derived** block, we condition our

**as a**

*dependent*variable`:DV`

**.**

`TimeToEvent`

Pumas’ distribution**as the first positional parameter and as the second positional parameter the**

`TimeToEvent`

takes the *estimated*hazard`λ`

**.**

*cumulative*hazard`Λ`

The dependent variable in a model that uses `TimeToEvent`

should be a **censoring variable** that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your ** DV has to take only values of 1 and 0**. If you observe a “failure” event until the end of the data collection, you should make

`DV = 1`

, and `DV = 0`

otherwise.Since `DV`

is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot `.`

operator. We do this with the `@.`

macro.

Here is the **full TTE Weibull Pumas model**:

```
= @model begin
tte_wei_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β ∈ RealDomain(; lower = 0, init = 0.001)
Κ end
@covariates DOSE
@pre begin
= Κ
_Κ = λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# Weibull
# 1e-10 for model numerical stability
= _λ₀ * _Κ * (_λ₀ * t + 1e-10)^(_Κ - 1)
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
```

```
PumasModel
Parameters: λ₁, β, Κ
Random effects:
Covariates: DOSE
Dynamical variables: Λ
Derived: DV, λ
Observed: DV, λ
```

#### 8.3.3 Gompertz AFT Model

Let’s move on to the third and final AFT model: **Gompertz**.

We also define three parameters, one additional from the exponential model, in the `@params`

block:

: basal hazard.`λ₁`

: coefficient for covariate`β`

`:DOSE`

.: the shape parameter of the underlying AFT model Gompertz distribution.`Κ`

Almost everything is similar from the exponential model. The only thing that changes is how we calculate the estimated hazard `λ`

inside the `@vars`

Inside the ** @pre** block we define

**as the**

`_λ₀`

**.**

*total*hazardIn the ** @vars** block we include our underlying

**Weibull**distribution. For the Gompertz distribution, the

**.**

*estimated*hazard`λ`

is the probability density function (PDF) of the Gompertz distribution condition on the *total*hazard`_λ₀`

and also parameter `K`

Now, most important, in the ** @dynamics** block, we defined the

**and the**

*cumulative*hazard as`Λ`

**its**equal to the

*derivative*`Λ'`

**.**

*estimated*hazard`λ`

Finally, in the ** @derived** block, we condition our

**as a**

*dependent*variable`:DV`

**.**

`TimeToEvent`

Pumas’ distribution**as the first positional parameter and as the second positional parameter the**

`TimeToEvent`

takes the *estimated*hazard`λ`

**.**

*cumulative*hazard`Λ`

The dependent variable in a model that uses `TimeToEvent`

should be a **censoring variable** that is zero if the variable isn’t censored and one if the variable is right censored. Currently, no other censoring types are supported.

So, this means that your ** DV has to take only values of 1 and 0**. If you observe a “failure” event until the end of the data collection, you should make

`DV = 1`

, and `DV = 0`

otherwise.Since `DV`

is a column (and also a vector of values), we need to broadcast, i.e. vectorize, the operation with the dot `.`

operator. We do this with the `@.`

macro.

Here is the **full TTE Gompertz Pumas model**:

```
= @model begin
tte_gomp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ ∈ RealDomain(; init = 0.001)
β ∈ RealDomain(; lower = 0, init = 0.001)
Κ end
@covariates DOSE
@pre begin
= Κ
_Κ = λ₁ * exp(β * DOSE)
_λ₀ end
@vars begin
# Gompertz
= _λ₀ * exp(_Κ * t)
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
```

```
PumasModel
Parameters: λ₁, β, Κ
Random effects:
Covariates: DOSE
Dynamical variables: Λ
Derived: DV, λ
Observed: DV, λ
```

### 8.4 `tte_single`

Population

Before we do any fitting of the models, we need to convert our dataset `tte_single`

into a `Population`

with the `read_pumas`

function.

Since we have different `:EVID`

values, we need to also provide `:AMT`

and `:CMT`

columns.

`:EVID`

column is specified using the `evid`

keyword argument. `:AMT`

column is the dose amount column and is specified using the `amt`

keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an `:AMT`

column filled with `0`

s. `:CMT`

column is the compartment column and is specified using the `cmt`

keyword argument. Since we do not have any pharmacokinetics in our model and data, we also simply create an `:CMT`

column filled with `1`

s to denote a single compartment model:

`@rtransform! tte_single :AMT = 0 :CMT = 1;`

```
= read_pumas(
tte_single_pop
tte_single;= [:DV],
observations = [:DOSE],
covariates = :ID,
id = :TIME,
time = :AMT,
amt = :CMT,
cmt = :EVID,
evid )
```

```
Population
Subjects: 300
Covariates: DOSE
Observations: DV
```

### 8.5 Estimation with `fit`

Now that we have Pumas models and a population we can start the **estimation with fit**. We do not have any mixed-effects, so we will use the

**.**

`NaivePooled()`

maximum likelihood estimation algorithmLet’s fit all three models using the same population:

```
=
tte_single_exp_fit fit(tte_exp_model, tte_single_pop, init_params(tte_exp_model), Pumas.NaivePooled());
=
tte_single_wei_fit fit(tte_wei_model, tte_single_pop, init_params(tte_wei_model), Pumas.NaivePooled());
=
tte_single_gomp_fit fit(tte_gomp_model, tte_single_pop, init_params(tte_gomp_model), Pumas.NaivePooled());
```

### 8.6 Comparing Estimates with `compare_estimates`

We can use the function `compare_estimates`

from `PumasUtilities.jl`

to compare estimates across different models.

Let’s load `PumasUtilities.jl`

:

`using PumasUtilities`

The `compare_estimates`

takes a `NamedTuple`

of models:

```
compare_estimates(;
= tte_single_exp_fit,
Exponential = tte_single_wei_fit,
Weibull = tte_single_gomp_fit,
Gompertz )
```

Row | parameter | Exponential | Weibull | Gompertz |
---|---|---|---|---|

String | Float64? | Float64? | Float64? | |

1 | λ₁ | 0.00532562 | 0.00496489 | 0.00375767 |

2 | β | -0.92922 | -0.811505 | -1.08131 |

3 | Κ | missing | 1.30176 | 0.00223201 |

As we can see all three AFT models have similar estimates for hazard `λ₁`

and `DOSE`

effect on hazard `β`

.

Let’s bring also the confidence intervals and standard error for the estimates of each model.

We can calculate confidence intervals with the `infer`

function, and then collect them into a `DataFrame`

with the `coeftable`

function. We do a little bit of data wrangling to get the comparisons and also exponentiate the estimates to make them interpretable:

```
= coeftable(infer(tte_single_exp_fit))
df_exponential = coeftable(infer(tte_single_wei_fit))
df_weibull = coeftable(infer(tte_single_gomp_fit))
df_gompertz
@rsubset! df_exponential :parameter == "β"
@rtransform! df_exponential :model = "exponential"
@rsubset! df_weibull :parameter == "β"
@rtransform! df_weibull :model = "Weibull"
@rsubset! df_gompertz :parameter == "β"
@rtransform! df_gompertz :model = "Gompertz"
= vcat(df_exponential, df_weibull, df_gompertz)
df
@rtransform! df $(
:estimate, :ci_lower, :ci_upper] .=>
[-> exp.(x)) .=> [:estimate, :ci_lower, :ci_upper]
(x )
```

```
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
```

Row | parameter | estimate | se | ci_lower | ci_upper | model |
---|---|---|---|---|---|---|

String | Float64 | Float64 | Float64 | Float64 | String | |

1 | β | 0.394862 | 0.112017 | 0.317027 | 0.491806 | exponential |

2 | β | 0.444189 | 0.102203 | 0.363557 | 0.542704 | Weibull |

3 | β | 0.339152 | 0.134403 | 0.26061 | 0.441365 | Gompertz |

Weibull AFT model has the lowest standard error, as display in the column `:se`

, and, consequently, the narrowest confidence interval.

#### 8.6.1 Visual Comparisons

Let’s do some visual comparisons between the models. We are going to use the estimated hazard and `DOSE`

covariate effect to plot different estimated survival functions across all three models using color to differentiate between `DOSE`

equal 0 or 1.

This means a line plot with:

`t`

in the x-axis- estimated hazard in the y-axis
- color is
`DOSE`

To get the estimated hazard as a function of time `t`

we are going to use the formula in the `@vars`

block in each model while using the estimated parameters from each model in the calculation.

First, some helper functions:

```
function hazard_exponential(λ₁, β, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
function hazard_weibull(λ₁, β, Κ, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ * Κ * (λ₀ * t + 1e-10)^(Κ - 1) for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
function hazard_gompertz(λ₁, β, Κ, DOSE, T)
= λ₁ * exp(β * DOSE) # estimated hazard
λ₀ = [λ₀ * exp(Κ * t) for t = 1:T] # cumulative hazard
λ return 1 .- cumsum(λ)
end;
```

These functions will return a length-\(T\) vector of the survival distribution. That is, 1 minus the cumulative hazard until time \(T\) depending on the `DOSE`

value.

Now let’s plot it:

```
= 1:200
t
# exponential estimates
= hazard_exponential(
λ_exponential_dose_0 0.00532562, # estimated λ₁
-0.92922, # estimated β
0, # DOSE value
length(t), # T
)= hazard_exponential(
λ_exponential_dose_1 0.00532562, # estimated λ₁
-0.92922, # estimated β
1, # DOSE value
length(t), # T
)
# Weibull estimates
= hazard_weibull(
λ_weibull_dose_0 0.00496489, # estimated λ₁
-0.811505, # estimated β
1.30176, # Κ
0, # DOSE value
length(t), # T
)= hazard_weibull(
λ_weibull_dose_1 0.00496489, # estimated λ₁
-0.811505, # estimated β
1.30176, # Κ
1, # DOSE value
length(t), # T
)
# Gompertz estimates
= hazard_gompertz(
λ_gompertz_dose_0 0.00375767, # estimated λ₁
-1.08131, # estimated β
0.00223201, # Κ
0, # DOSE value
length(t), # T
)= hazard_gompertz(
λ_gompertz_dose_1 0.00375767, # estimated λ₁
-1.08131, # estimated β
0.00223201, # Κ
1, # DOSE value
length(t), # T
)
=
plt data(
# data in a long table format
(;= repeat(t, 6), # 2 DOSES ⋅ 3 AFT models
t = vcat(
λs # exponential estimates
λ_exponential_dose_0,
λ_exponential_dose_1,# Weibull estimates
λ_weibull_dose_0,
λ_weibull_dose_1,# Gompertz estimates
λ_gompertz_dose_0,
λ_gompertz_dose_1,
),= repeat(
DOSE 0, 1];
[= length(t), # repeat 0 500x then 1 500x
inner = 3, # 3 AFT models
outer
),= repeat(["exponential", "Weibull", "Gompertz"]; inner = 2 * length(t)), # repeat each model 1_000x
MODEL *
)) mapping(:t => L"t", :λs => L"S(t)"; color = :DOSE => nonnumeric, row = :MODEL) *
visual(Lines)
draw(plt; axis = (; yticks = 0.0:0.25:1.0))
```

The models differ little in the general estimation of the survival function across different `DOSE`

values.

## 9 `tte_repeated`

dataset

The `tte_repeated`

dataset from `PharmaDatasets.jl`

has just a dependent variable `:DV`

and covariate `:TIME`

.

This type of data is common when the time to event can occurs several times. For example, patient death is generally a single TTE data. Whereas, patient experiencing vomiting or being afflicted with cancer can occur more than once, then it is a repeated TTE data.

```
= dataset("tte_repeated")
tte_repeated first(tte_repeated, 5)
```

Row | ID | DV | TIME |
---|---|---|---|

Int64 | Int64 | Float64 | |

1 | 1 | 0 | 288.0 |

2 | 2 | 1 | 22.6626 |

3 | 2 | 1 | 29.3927 |

4 | 2 | 0 | 288.0 |

5 | 3 | 0 | 288.0 |

We can definitely see that we have 316 repeated observations from 120 IDs, from 1 to 120.

`tte_repeated`

does *not* have an `:EVID`

column like the `tte_single`

dataset. Hence we need to add an `evid = 3`

row for each measurement row `evid = 0`

.

```
:TAD] =
tte_repeated[!, combine(i -> [first(i.TIME); diff(i.TIME)], groupby(tte_repeated, [:ID])).x1
# Add EVID=3 events with TAD=1e-10 to reset the integration right after each observation
= DataFrame(;
evd = tte_repeated.ID,
ID = missing,
AMT = missing,
DV = tte_repeated.TIME,
TIME = 1e-10,
TAD = 3,
EVID
):EVID] .= 0
tte_repeated[!, :AMT] .= 0
tte_repeated[!, = vcat(tte_repeated, evd)
tte_repeated_evid sort!(tte_repeated_evid, [:ID, :TIME, :EVID])
```

Also, we have, on average, 2.6 events per ID:

```
@chain tte_repeated_evid begin
@rsubset :EVID == 0
@by :ID :freq_DV = length(:DV)
@combine :mean_freq_DV = mean(:freq_DV)
end
```

Row | mean_freq_DV |
---|---|

Float64 | |

1 | 2.63333 |

### 9.1 Pumas Models

Since we do not have any covariates besides `:TIME`

our previous defined Pumas models would need to be rewritten without `DOSE`

.

I will show an example with the exponential AFT model. The code for Weibull and Gompertz should be the same as above but without the `β`

and `DOSE`

.

```
= @model begin
tte_repeated_exp_model @param begin
∈ RealDomain(; lower = 0, init = 0.001)
λ₁ end
@pre begin
= λ₁
_λ₀ end
@vars begin
# exponential
= _λ₀
λ end
@dynamics begin
# the derivative of cumulative hazard is equal to hazard
' = λ
Λend
@derived begin
# Pumas derived TTE function
~ @. TimeToEvent(λ, Λ)
DV end
end
```

```
PumasModel
Parameters: λ₁
Random effects:
Covariates:
Dynamical variables: Λ
Derived: DV, λ
Observed: DV, λ
```

### 9.2 `tte_repeated`

Population

Like we did for `tte_single`

, we need to convert our dataset `tte_repeated`

into a `Population`

with the `read_pumas`

function.

Since we have a dependent variable `:DV`

, we need to also provide an `:AMT`

column.

`:AMT`

column is the dose amount column and is specified using the `amt`

keyword argument. Since we do not have any pharmacokinetics in our model and data, we simply create an `:AMT`

column filled with `0`

s:

`@rtransform! tte_repeated_evid :AMT = 0;`

```
= read_pumas(
tte_repeated_pop
tte_repeated_evid;= [:DV],
observations = :ID,
id = :TAD,
time = :EVID,
evid = :AMT,
amt )
```

### 9.3 Estimation with `fit`

Now that we have a Pumas model and a population we can start the **estimation with fit**. We do not have any mixed-effects, so we will use the

**.**

`NaivePooled()`

maximum likelihood estimation algorithmLet’s fit our `tte_repeated_exp_model`

using the population we just created:

```
= fit(
tte_repeated_exp_fit
tte_repeated_exp_model,
tte_repeated_pop,init_params(tte_repeated_exp_model),
NaivePooled(),
Pumas. )
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 1.388480e+03 1.614400e+02
* time: 6.699562072753906e-5
1 1.251864e+03 1.020562e+02
* time: 0.0007879734039306641
2 1.212583e+03 3.131723e+01
* time: 0.0016911029815673828
3 1.210335e+03 1.514137e+01
* time: 0.0021631717681884766
4 1.209782e+03 1.285035e+00
* time: 0.002593994140625
5 1.209778e+03 4.741926e-02
* time: 0.0030760765075683594
6 1.209778e+03 1.561555e-04
* time: 0.0035400390625
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: NaivePooled
Log-likelihood value: -1209.7782
Number of subjects: 120
Number of parameters: Fixed Optimized
0 1
Observation records: Active Missing
DV: 316 0
Total: 316 0
------------------
Estimate
------------------
λ₁ 0.0056713
------------------
```

## 10 Conclusion

I hope you enjoyed this tutorial on time to event modeling. Whenever you are dealing with survival data, or any other of time to event data, you can use any of the techniques described here. Also, don’t forget to check the recommended textbooks and resources in the Tip green box at the start of this tutorial.