```
using Pumas
using DataFramesMeta
using PumasUtilities
using Random
```

# NONMEM-style BLQ models (M1-M4) in Pumas

Before we begin, we need to load the necessary packages and set a seed for reproducibility.

This tutorial is about handling the missing data problem that arises when observations fall outside the limits of quantification. In 2022, we held a webinar about the theory of such statistical modeling; it was recorded and is available on YouTube.

## 1 Handling Lower Limit of Quantification

In the real world, assays often come with limitations such as lower or upper limits of quantification. When pharmacometricians receive data, it has often been processed by the lab or by statisticians in other departments. The data can carry various levels of information about whether a measured concentration was at or below the Lower Limit of Quantification (LLOQ). Unfortunately, such measurements are often simply labeled `"BQL"` in the dependent variable column to indicate that the measurement was taken but was Below Quantification Limit (BQL). If the value were provided, it could be modeled in a more explicit manner, but with just a `BQL` indication we need to treat it as a missing data problem.

In pharmacometrics, the main references on how to handle this kind of data are Beal (2001) for a conceptual description and Bergstrand and Karlsson (2009) for a simulation exercise. These papers use the M1-M7 nomenclature of the former. In Pumas, we try to always refer to things according to their actual distributional assumptions or data transformation steps. In this tutorial, we will use both to ease the transition from a NONMEM workflow to a Pumas workflow. The discussion of the M* methods assumes familiarity with the nomenclature described in the above papers and is included mostly for legacy reasons. If you are not transitioning from NONMEM, see our other tutorials for BQL introductions that do not refer to the NONMEM methods.

## 2 Definitions

Before we start, we need to define some terms:

Truncation is a statistical term that refers to the removal of data points that fall outside a certain range. In the context of pharmacometrics, truncation is used to remove data points that fall below the lower limit of quantification (LLOQ). Mathematically, say that \(X\) is our dependent variable, i.e. the quantity we want to measure:

\[X \sim N(\mu, \sigma^2)\]

\(X\) is generally measured with some instrument. To account for the fact that the instrument might have some limitations, we generally observe in our dataset either the truncated or censored values of \(X\). We denote \(Y^t\) as the truncated value of \(X\) and \(Y^c\) as the censored value of \(X\).

In truncation, we observe the value of \(X\) only if it lies between the lower and upper thresholds, \(a\) and \(b\), respectively. For some reason, the data processing step has removed the data points that fall outside this range. We can represent a truncated variable \(Y^t\) as:

\[Y^t = X \mid a \leq X \leq b\]

The censoring case is different. In censoring, we observe the value of \(X\) if it lies between the thresholds \(a\) and \(b\); if \(X\) falls below \(a\) we record \(a\), and if \(X\) exceeds \(b\) we record \(b\). The censored variable \(Y^c\) is then:

\[Y^c = \min(\max(X, a), b)\]

Here are examples of truncation and censoring:

- **Truncation** (missing information): A device measures a plasma concentration. We only see data points if they are within the truncation limits.
- **Censoring** (missing information): A device measures a plasma concentration as before. However, below some measurement value (LLOQ) the number cannot be trusted, so we only know that our measurement is below that value.

There’s a difference between data truncation and a truncated distribution. **Data truncation** refers to the removal of data points that fall outside a certain range, whereas a **truncated distribution** is a probability distribution that is restricted to a certain range. The latter assumes that the Data Generating Process (DGP) itself is truncated, in the sense that the data is generated from a distribution restricted to that range.

For example, an assay that cannot return negative values, or a measurement device that cannot measure values above a certain threshold. In these cases the normal distribution, which has support on the entire real line, is truncated to a certain range.
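To make the distinction concrete, here is a small, self-contained sketch in plain Julia. It uses only the standard library, and the latent variable, threshold, and sample size are hypothetical illustrations rather than part of this tutorial's dataset:

```julia
using Random

Random.seed!(1234)
# Hypothetical latent measurements: X ~ Normal(1.0, 0.5), no assay limits yet
x = 1.0 .+ 0.5 .* randn(10_000)
a = 0.9  # an LLOQ-style lower threshold

# Data truncation: records below the threshold are removed from the dataset
y_trunc = filter(>=(a), x)

# Censoring: every record is kept, but values below the threshold are
# reported as the threshold itself (Y^c = max(X, a), with b = Inf)
y_cens = max.(x, a)

# Censored records pile up in a point mass exactly at a
n_bql = count(==(a), y_cens)
```

Truncation loses records entirely (`length(y_trunc) < length(x)`), while censoring keeps every record but collapses the sub-threshold values into a point mass at the threshold.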

## 3 Generating data

To study a specific statistical phenomenon and properties of any estimator it is often important to be able to separately specify:

- the Data Generating Process (DGP); and
- the estimation procedure or estimator.

The data in this tutorial will come from a simple, simulated data set with 90 subjects that receive an injection of 200 units at time 0. The assay used to measure the concentration of the drug in plasma has a Lower Limit of Quantification (LLOQ) of 0.9 concentration units. By the simple properties of a two-compartment model with linear elimination, all subjects will eventually have concentrations sufficiently low that samples have a chance of being Below the Quantification Limit (BQL). The error model is a combined error model. With such a distribution we can have observations that are *not* BQL even at high `t` (model time), but the proportion of BQL values should increase over time.

Notice that the above type of BQL observation is only one of many. Here, the proportion of BQL values is simply a function of time. In some cases, the probability of being BQL may be a function of covariates as well; some patients may have a higher volume of distribution, for example. This does not change how we handle the issue statistically, but it may affect how we decide whether to ignore or explicitly handle the BQL issue. You could argue that you should always explicitly handle the BQL values, but the choice may not be so simple. The methods that explicitly model the BQL values in a statistical manner require the `LaplaceI` method if we are using first-order approximation methods. This method is known to be slower and sometimes less numerically robust than `FOCE`. For this reason, an analyst may choose to use BQL methods only if the bias from simply ignoring the problematic observations is expected to be high. An alternative is to use Expectation-Maximization (EM) methods such as `SAEM`; that approach is treated in a separate tutorial.

## 4 Data Generating Process

The Data Generating Process (DGP) consists of the specification of:

- the population
- the doses
- the sampling times
- the model
- the parameters

### 4.1 The population, doses, and sampling times

As mentioned previously, each subject is dosed 200 units of the drug at time zero. There are 90 subjects in total, and we sample their plasma at 1, 3, 5, 7, 9, and 11 hours after dosing.

```
pop = [
    Subject(
        id = string(i),
        observations = (CONC = nothing,),
        events = DosageRegimen(200),
        time = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
    ) for i = 1:90
]
```

This population is empty in the sense that it has no simulated observations yet. To generate them, we need to define a model and the parameters. The following model can be seen as a model for the distribution and elimination of the drug (the dynamics) as well as the values from the assay (the `@derived` block), without any masking of BQL values.

```
iv2cmt_dgp = @model begin
    @param begin
        θcl ∈ RealDomain(lower = 0.1)
        θvc ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.1)
        Ω ∈ PDiagDomain(2)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Q = θq
        Vp = θvp
    end
    @dynamics Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (abs(conc_model) * σ_prop)^2))
    end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates:
Dynamical variables: Central, Peripheral
Derived: CONC
Observed: CONC
```

We can also choose some parameters to simulate from:

```
param0 = (;
    θcl = 2.9,
    θvc = 7.0,
    θq = 1.8,
    θvp = 9.0,
    Ω = Diagonal([0.09, 0.09]),
    σ_add = sqrt(0.03),
    σ_prop = sqrt(0.03),
)
```

```
(θcl = 2.9,
θvc = 7.0,
θq = 1.8,
θvp = 9.0,
Ω = [0.09 0.0; 0.0 0.09],
σ_add = 0.17320508075688773,
σ_prop = 0.17320508075688773,)
```

### 4.2 A simulated population

To simulate the population, we simply use `simobs` with the objects defined above.

```
# Set a seed for reproducibility
rng = Random.seed!(2142)
simpop = simobs(iv2cmt_dgp, pop, param0; rng)
DataFrame(simpop)
```

`simpop` carries the simulated information about the assay, but does not yet apply the BQL masking. This is done in the next step.

### 4.3 Lower Limit Of Quantification (LLOQ)

In this example, we will assume that our assay has a lower limit of quantification of `0.9`. We can convert our simulated population to a tabular format (a `DataFrame`) and calculate which rows are BQL. This information can be used to produce plots showing the censoring, and to create our masked dependent variable.

```
LLOQ = 0.9
pkdata = DataFrame(simpop)
pkdata.BQL = pkdata.CONC .<= LLOQ
pkdata_obs = filter(x -> x.evid == 0, pkdata)
```

#### 4.3.1 Calculating proportions BQL by time

To investigate whether censoring of the dependent variable is an issue or not, we can calculate the proportion of BQL at each observation time.

```
pkdata.BQL = pkdata.CONC .<= LLOQ
pkdata_obs_gdf = groupby(pkdata_obs, :time)
pkdata_obs_gdf_comb = combine(pkdata_obs_gdf, :BQL => mean => "BQL_proportion")
```

| Row | time | BQL_proportion |
|---|---|---|
| | Float64 | Float64 |
| 1 | 1.0 | 0.0 |
| 2 | 3.0 | 0.0 |
| 3 | 5.0 | 0.0444444 |
| 4 | 7.0 | 0.0666667 |
| 5 | 9.0 | 0.177778 |
| 6 | 11.0 | 0.244444 |

We see that the value of the `BQL_proportion` column is increasing, and towards the end we have a fair amount of BQL values. Another way to investigate this is with a plot.

#### 4.3.2 Visualizing BQL proportions

Let us visualize the effect of the censoring in the following plot.


```
using AlgebraOfGraphics
f = AlgebraOfGraphics.Figure()
f1 = draw!(
    f[1:2, 1],
    data(pkdata_obs) *
    mapping(:time, :CONC => "CONC", color = :BQL) *
    visual(AlgebraOfGraphics.Scatter) +
    mapping([LLOQ], [0.0]) * visual(AlgebraOfGraphics.ABLines; color = :orange);
    axis = (; xlabel = ""),
    palettes = (; color = fcolor(i) = (:black, :orange)[i+1]),
)
f2 = draw!(
    f[3, 1],
    data(pkdata_obs_gdf_comb) *
    mapping(:time, :BQL_proportion => "BLQ proportion") *
    visual(AlgebraOfGraphics.Scatter; color = :orange);
    axis = (; limits = (nothing, nothing, 0, 1)),
)
legend!(f[1:2, 2], f1)
f
```

In the bottom panel, we see that the proportion of BQL values increases over time. The censoring itself is a little hard to see here; it becomes clearer in the next plot, where the y-axis shows the log-transformed variable.


```
using AlgebraOfGraphics
f = AlgebraOfGraphics.Figure()
f1 = draw!(
    f[1:2, 1],
    data(pkdata_obs) *
    mapping(:time, :CONC => log => "log(CONC)", color = :BQL) *
    visual(AlgebraOfGraphics.Scatter) +
    mapping([log(LLOQ)], [0.0]) * visual(AlgebraOfGraphics.ABLines; color = :orange);
    axis = (; xlabel = ""),
    palettes = (; color = fcolor(i) = (:black, :orange)[i+1]),
)
f2 = draw!(
    f[3, 1],
    data(pkdata_obs_gdf_comb) *
    mapping(:time, :BQL_proportion => "BLQ proportion") *
    visual(AlgebraOfGraphics.Scatter; color = :orange);
    axis = (; limits = (nothing, nothing, 0, 1)),
)
legend!(f[1:2, 2], f1)
f
```

### 4.4 Masking the data

As mentioned in the introduction, real world data often comes without the assay values if they are BQL.

```
pkdata_censored = copy(pkdata)
pkdata_censored.CONC .= max.(LLOQ, pkdata_censored.CONC)
first(pkdata_censored, 20)
```

| Row | id | time | CONC | evid | amt | cmt | rate | duration | ss | ii | route | η_1 | η_2 | Central | Peripheral | CL | Vc | Q | Vp | BQL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | String? | Float64 | Float64? | Int64? | Float64? | Symbol? | Float64? | Float64? | Int8? | Float64? | NCA.Route? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Bool? |
| 1 | 1 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.205373 | -0.122347 | missing | missing | 2.3616 | 6.19389 | 1.8 | 9.0 | missing |
| 2 | 1 | 1.0 | 9.71942 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 105.662 | 38.3016 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 3 | 1 | 3.0 | 5.33975 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 39.0625 | 55.6195 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 4 | 1 | 5.0 | 3.72554 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 21.914 | 50.8124 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 5 | 1 | 7.0 | 1.82011 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 15.8283 | 42.8137 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 6 | 1 | 9.0 | 2.26987 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 12.5344 | 35.3837 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 7 | 1 | 11.0 | 1.31686 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 10.1926 | 29.0999 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 8 | 2 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.461936 | 0.160303 | missing | missing | 1.82718 | 8.21707 | 1.8 | 9.0 | missing |
| 9 | 2 | 1.0 | 16.1205 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 131.695 | 32.1012 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 10 | 2 | 3.0 | 8.51043 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 67.2258 | 54.7327 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 11 | 2 | 5.0 | 5.3066 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 42.711 | 55.6004 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 12 | 2 | 7.0 | 3.00945 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 31.6951 | 50.3485 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 13 | 2 | 9.0 | 2.95567 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 25.5099 | 43.9187 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 14 | 2 | 11.0 | 3.32116 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 21.2751 | 37.7975 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 15 | 3 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | 0.15667 | -0.226875 | missing | missing | 3.39187 | 5.57914 | 1.8 | 9.0 | missing |
| 16 | 3 | 1.0 | 19.9397 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 82.1791 | 37.885 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 17 | 3 | 3.0 | 4.29876 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 21.2668 | 47.0343 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 18 | 3 | 5.0 | 1.96556 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 10.9715 | 39.2597 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 19 | 3 | 7.0 | 1.26342 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 7.86242 | 31.1639 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 20 | 3 | 9.0 | 0.9 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 6.07933 | 24.5369 | 3.39187 | 5.57914 | 1.8 | 9.0 | true |

Here `LLOQ` is the variable defined above and has the value `0.9`. This type of dataset has the dependent variable column set to the actual LLOQ for the BQL rows. Sometimes the data instead has the string `"BQL"` or a missing value in the dependent variable column, together with a `BQL` flag in a separate column.

```
pkdata_string_bql = copy(pkdata)
@rtransform! pkdata_string_bql :CONC = !ismissing(:CONC) && LLOQ >= :CONC ? "BQL" : :CONC
first(pkdata_string_bql, 20)
```

| Row | id | time | CONC | evid | amt | cmt | rate | duration | ss | ii | route | η_1 | η_2 | Central | Peripheral | CL | Vc | Q | Vp | BQL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | String? | Float64 | Any | Int64? | Float64? | Symbol? | Float64? | Float64? | Int8? | Float64? | NCA.Route? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Bool? |
| 1 | 1 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.205373 | -0.122347 | missing | missing | 2.3616 | 6.19389 | 1.8 | 9.0 | missing |
| 2 | 1 | 1.0 | 9.71942 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 105.662 | 38.3016 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 3 | 1 | 3.0 | 5.33975 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 39.0625 | 55.6195 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 4 | 1 | 5.0 | 3.72554 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 21.914 | 50.8124 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 5 | 1 | 7.0 | 1.82011 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 15.8283 | 42.8137 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 6 | 1 | 9.0 | 2.26987 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 12.5344 | 35.3837 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 7 | 1 | 11.0 | 1.31686 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.205373 | -0.122347 | 10.1926 | 29.0999 | 2.3616 | 6.19389 | 1.8 | 9.0 | false |
| 8 | 2 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.461936 | 0.160303 | missing | missing | 1.82718 | 8.21707 | 1.8 | 9.0 | missing |
| 9 | 2 | 1.0 | 16.1205 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 131.695 | 32.1012 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 10 | 2 | 3.0 | 8.51043 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 67.2258 | 54.7327 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 11 | 2 | 5.0 | 5.3066 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 42.711 | 55.6004 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 12 | 2 | 7.0 | 3.00945 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 31.6951 | 50.3485 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 13 | 2 | 9.0 | 2.95567 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 25.5099 | 43.9187 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 14 | 2 | 11.0 | 3.32116 | 0 | missing | missing | missing | missing | missing | missing | missing | -0.461936 | 0.160303 | 21.2751 | 37.7975 | 1.82718 | 8.21707 | 1.8 | 9.0 | false |
| 15 | 3 | 0.0 | missing | 1 | 200.0 | Central | 0.0 | 0.0 | 0 | 0.0 | NullRoute | 0.15667 | -0.226875 | missing | missing | 3.39187 | 5.57914 | 1.8 | 9.0 | missing |
| 16 | 3 | 1.0 | 19.9397 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 82.1791 | 37.885 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 17 | 3 | 3.0 | 4.29876 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 21.2668 | 47.0343 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 18 | 3 | 5.0 | 1.96556 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 10.9715 | 39.2597 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 19 | 3 | 7.0 | 1.26342 | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 7.86242 | 31.1639 | 3.39187 | 5.57914 | 1.8 | 9.0 | false |
| 20 | 3 | 9.0 | BQL | 0 | missing | missing | missing | missing | missing | missing | missing | 0.15667 | -0.226875 | 6.07933 | 24.5369 | 3.39187 | 5.57914 | 1.8 | 9.0 | true |

The string version can be handled with the inverse `@rtransform!` that changes the `"BQL"` value back to `LLOQ`. If the dependent variable is `missing` and a `BQL` flag is present, something like the following can be used instead.

```
@rtransform! pkdata_bql_missing :CONC = ismissing(:CONC) && :BQL == true ? LLOQ : :CONC
```

In any case, it is important to note that the format Pumas expects is that the value of the dependent variable is set to the LLOQ or a value below it. This means that setting the BQL rows to any number at or below `LLOQ` (0.9 in this case) is fine, including the value 0.
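As a minimal sketch of that preprocessing step in plain Julia (`conc_raw` is a hypothetical column vector, not one of the tables above), mapping `"BQL"` strings to the LLOQ could look like:

```julia
LLOQ = 0.9
# Hypothetical raw dependent-variable column mixing numbers and "BQL" strings
conc_raw = Any[9.72, 5.34, "BQL", 3.73, "BQL"]

# Replace "BQL" with the LLOQ; any value at or below the LLOQ would also work
conc = [c == "BQL" ? LLOQ : Float64(c) for c in conc_raw]
```

The resulting column is purely numeric and can be used directly as the dependent variable.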

## 5 Estimating a model with BQL values

We will now estimate some models:

- M1: remove the data
- M2: truncated observations
- M3: censored observations
- M4: censored and positive

### 5.1 M1: remove the data

Now, we are finally ready to estimate a model based on our censored data. The simplest approach is to discard the BQL data entirely; Beal (2001) calls this approach “M1”. To perform such an analysis, we can filter away the BQL values and fit the model using `iv2cmt_dgp`.

```
pkdata_m1 = copy(pkdata_censored)
@rtransform! pkdata_m1 :CONC = :evid == 0 && :BQL == 0 ? :CONC : missing
first(pkdata_m1, 20)
```

We can now fit the model.

```
pop_m1 = read_pumas(pkdata_m1; observations = [:CONC])
res_m1 = fit(iv2cmt_dgp, pop_m1, param0, LaplaceI(); optim_options = (; show_trace = false))
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: LaplaceI
Log-likelihood value: -633.05437
Number of subjects: 90
Number of parameters: Fixed Optimized
0 8
Observation records: Active Missing
CONC: 492 48
Total: 492 48
---------------------
Estimate
---------------------
θcl 2.6745
θvc 6.5711
θq 1.9778
θvp 12.206
Ω₁,₁ 0.077309
Ω₂,₂ 0.11843
σ_add 0.053029
σ_prop 0.17819
---------------------
```

### 5.2 M2: Truncated Observations

Another method is the so-called M2 method. Both M1 and M2 discard the BQL observations; sometimes they are the only option because the BQL observations were already dropped from the dataset by someone else. For an application of M2, we approach the data as if we know the LLOQ value, but any observation that was indeed BQL has been dropped from the data. The appropriate distributional assumption for the observed values remaining in the dataset is then a truncated combined normal, if we believe the original data was indeed a combined normal.
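To spell out the distributional assumption (a standard result; here \(\mu\) and \(\sigma\) denote the conditional mean and standard deviation from the combined error model at a given time point, and \(\phi\) and \(\Phi\) are the standard normal density and distribution functions), the truncated normal density renormalizes the mass above the LLOQ:

\[f_{Y^t}(y) = \frac{\frac{1}{\sigma}\,\phi\!\left(\frac{y - \mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{\mathrm{LLOQ} - \mu}{\sigma}\right)}, \qquad y > \mathrm{LLOQ}\]

This is exactly what `truncated(...; lower = LLOQ)` encodes in the model below.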

```
iv2cmt_m2 = @model begin
    @param begin
        θcl ∈ RealDomain(lower = 0.1)
        θvc ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.1)
        Ω ∈ PDiagDomain(2)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Q = θq
        Vp = θvp
    end
    @dynamics Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC_normal := @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
        CONC ~ @. truncated(CONC_normal, lower = LLOQ)
    end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates:
Dynamical variables: Central, Peripheral
Derived: CONC
Observed: CONC
```

```
= read_pumas(pkdata_m1; observations = [:CONC])
pop_m2 = fit(iv2cmt_dgp, pop_m2, param0, LaplaceI(); optim_options = (; show_trace = false)) res_m2
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: LaplaceI
Log-likelihood value: -633.05437
Number of subjects: 90
Number of parameters: Fixed Optimized
0 8
Observation records: Active Missing
CONC: 492 48
Total: 492 48
---------------------
Estimate
---------------------
θcl 2.6745
θvc 6.5711
θq 1.9778
θvp 12.206
Ω₁,₁ 0.077309
Ω₂,₂ 0.11843
σ_add 0.053029
σ_prop 0.17819
---------------------
```

### 5.3 M3: Censored Observations

If we do have the observations flagged BQL, but not the actual values, we can also use a censored distribution. That is, we either observe the data directly or we observe that it was at LLOQ or below. This is exactly what a censored distribution allows us to model, because it places all the observations below LLOQ in a point mass *at* the LLOQ. This is still useful information, and allows us to estimate parameters in the underlying distribution (the latent variable or non-censored distribution) from our *assumed* data generating process (DGP).
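In likelihood terms (a standard result; \(\mu\) and \(\sigma\) denote the conditional mean and standard deviation of the combined error model at a given time point, and \(\phi\) and \(\Phi\) are the standard normal density and distribution functions), each record contributes either a density value or a left-tail probability:

\[\log L(y) = \begin{cases} \log \phi\!\left(\frac{y - \mu}{\sigma}\right) - \log \sigma, & y > \mathrm{LLOQ} \\ \log \Phi\!\left(\frac{\mathrm{LLOQ} - \mu}{\sigma}\right), & y \leq \mathrm{LLOQ} \end{cases}\]

The second case is the point mass at the LLOQ mentioned above, and it is what `Censored` contributes for the BQL records.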

```
iv2cmt_m3 = @model begin
    @param begin
        θcl ∈ RealDomain(lower = 0.1)
        θvc ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.1)
        Ω ∈ PDiagDomain(2)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Q = θq
        Vp = θvp
    end
    @dynamics Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC_normal := @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
        CONC ~ @. Censored(CONC_normal, LLOQ, Inf)
    end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates:
Dynamical variables: Central, Peripheral
Derived: CONC
Observed: CONC
```

```
pop_m3 = read_pumas(pkdata; observations = [:CONC])
res_m3 = fit(iv2cmt_m3, pop_m3, param0, LaplaceI(); optim_options = (show_trace = false,))
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: LaplaceI
Log-likelihood value: -667.50211
Number of subjects: 90
Number of parameters: Fixed Optimized
0 8
Observation records: Active Missing
CONC: 540 0
Total: 540 0
---------------------
Estimate
---------------------
θcl 2.821
θvc 6.6741
θq 1.8474
θvp 10.165
Ω₁,₁ 0.087478
Ω₂,₂ 0.14104
σ_add 0.15322
σ_prop 0.16691
---------------------
```

### 5.4 M4: Censored and positive

The last possibility we will consider is the M4 method. This method takes the approach that BQL values are represented with the probability of being in the left tail of the combined normal distribution given all parameters, but adds the fact that the observations must have been positive as well for concentration data. In our humble opinion, this approach is slightly strange given the assumed data generating process. Instead, we find it more useful to just think in terms of the DGP + Censoring. Then, M4 is simply M3 with the assumption that the original data was *also* naturally truncated. In other words, if M3 is “add censoring”, M4 is not really a separate statistical handling of BQL. It is simply a different assumption on the DGP.
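Under this reading (a standard result; \(\mu\) and \(\sigma\) denote the conditional mean and standard deviation of the combined error model at a given time point, and \(\Phi\) is the standard normal distribution function), the M4 contribution of a BQL record is the probability of being below the LLOQ conditional on being positive:

\[P(Y \leq \mathrm{LLOQ} \mid Y > 0) = \frac{\Phi\!\left(\frac{\mathrm{LLOQ} - \mu}{\sigma}\right) - \Phi\!\left(\frac{-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{-\mu}{\sigma}\right)}\]

This is precisely what the composition of `truncated(...; lower = 0)` and `Censored` in the model below produces.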

```
iv2cmt_m4 = @model begin
    @param begin
        θcl ∈ RealDomain(lower = 0.1)
        θvc ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.1)
        Ω ∈ PDiagDomain(2)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Q = θq
        Vp = θvp
    end
    @dynamics Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC_normal := @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
        CONC_trunc ~ @. truncated(CONC_normal, lower = 0)
        CONC ~ @. Censored(CONC_trunc, LLOQ, Inf)
    end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates:
Dynamical variables: Central, Peripheral
Derived: CONC_trunc, CONC
Observed: CONC_trunc, CONC
```

```
res_m4 = fit(iv2cmt_m4, pop_m3, param0, LaplaceI(); optim_options = (show_trace = false,))
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: LaplaceI
Log-likelihood value: -667.50251
Number of subjects: 90
Number of parameters: Fixed Optimized
0 8
Observation records: Active Missing
CONC: 540 0
Total: 540 0
---------------------
Estimate
---------------------
θcl 2.821
θvc 6.6741
θq 1.8474
θvp 10.165
Ω₁,₁ 0.087478
Ω₂,₂ 0.14104
σ_add 0.15318
σ_prop 0.16691
---------------------
```

The DGP for such data could have been represented with the following model.

```
iv2cmt_dgp_trunc = @model begin
    @param begin
        θcl ∈ RealDomain(lower = 0.1)
        θvc ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.1)
        Ω ∈ PDiagDomain(2)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Q = θq
        Vp = θvp
    end
    @dynamics Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC_normal := @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
        CONC ~ @. truncated(CONC_normal, lower = 0)
    end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates:
Dynamical variables: Central, Peripheral
Derived: CONC
Observed: CONC
```

## 6 Subject-dependent LLOQ

It is possible that some subjects have different LLOQ values because they come from different studies, their plasma samples were sent to different labs, and so on. In this case, the LLOQ value should still be set as the value of the `CONC` column at the time of censoring, and the subject-specific limit can then be retrieved as a covariate. For example, the following population has two different LLOQs depending on the study.

```
pop1 = [
    Subject(
        id = string(i),
        observations = (CONC = nothing,),
        events = DosageRegimen(200),
        time = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
        covariates = (STUDY = 1, LLOQ_cov = 0.9),
    ) for i = 1:90
]
pop2 = [
    Subject(
        id = string(i),
        observations = (CONC = nothing,),
        events = DosageRegimen(200),
        time = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
        covariates = (STUDY = 2, LLOQ_cov = 0.09),
    ) for i = 1:10
]
pop = vcat(pop1, pop2)
```

```
Population
Subjects: 100
Covariates: STUDY, LLOQ_cov
Observations: CONC
```

```
= @model begin
iv2cmt_m4 @param begin
∈ RealDomain(lower = 0.1)
θcl ∈ RealDomain(lower = 1.0)
θvc ∈ RealDomain(lower = 0.1)
θq ∈ RealDomain(lower = 1.1)
θvp ∈ PDiagDomain(2)
Ω ∈ RealDomain(lower = 0.0)
σ_add ∈ RealDomain(lower = 0.0)
σ_prop end
@covariates STUDY LLOQ_cov
@random begin
~ MvNormal(Ω)
η end
@pre begin
= θcl * exp(η[1])
CL = θvc * exp(η[2])
Vc = θq
Q = θvp
Vp end
@dynamics Central1Periph1
@derived begin
:= @. Central / Vc
conc_model := @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
CONC_normal ~ @. truncated(CONC_normal; lower = 0)
CONC_trunc ~ @. Censored(CONC_trunc, LLOQ_cov, Inf)
CONC end
end
```

```
PumasModel
Parameters: θcl, θvc, θq, θvp, Ω, σ_add, σ_prop
Random effects: η
Covariates: STUDY, LLOQ_cov
Dynamical variables: Central, Peripheral
Derived: CONC_trunc, CONC
Observed: CONC_trunc, CONC
```

## 7 Conclusion

In this tutorial, we showed how to use filtering, `truncated`, and `Censored` to handle missing data problems for models fitted with linearized approximations to the likelihood function of a non-linear mixed effects model. Whether or not to include the BQL data rows in the dataset depends on the expected bias from ignoring the missing data problem. If the bias is expected to be significant, this tutorial showed how to write a model that takes the missing data into consideration and estimates parameters more accurately.

## 8 References

Beal, S. L. 2001. “Ways to Fit a PK Model with Some Data Below the Quantification Limit.” *Journal of Pharmacokinetics and Pharmacodynamics* 28: 481–504.

Bergstrand, M., and M. O. Karlsson. 2009. “Handling Data Below the Limit of Quantification in Mixed Effect Models.” *The AAPS Journal* 11: 371–80.