`using Pumas`

# Model Selection Methods - Backward Elimination

As stated in Introduction to Covariate Selection Methods, broadly, there are two main types of **S**tepwise **C**ovariate **M**odel (SCM):

**Forward Selection**(FS)**Backward Elimination**(BE)

In this tutorial we’ll explore **Backward Elimination** (BE) in detail.

## 1 Backward Elimination (BE)

Backward elimination stepwise covariate modeling procedures are very similar to FS procedures. However, instead of beginning with a base model, BE starts with a full model, denoted as \(M_0\), and is the reference model for the models of the first iteration, The first iteration tests all defined covariate-parameter effects in the full model (\(c \in C\)) in univariable runs by removing all available \(c\) from \(M_0\) in separate runs. The model with the smallest increase in criterion value is subject to being removed from the model, and is designated \(M_k\), the reference for the next iteration.

BE proceeds with the subsequent iterations by individually removing all remaining \(c_k\) from \(M_k\), where \(k\) is the iteration number, and computes the selected metric for each model. The \(c_k\) model selected for removal (the worst \(c\) in iteration \(k\)) is removed from the set \(C_{k+1}\) of covariate effects for iteration \(k+1\). BE proceeds with additional iterations until it cannot find \(c_k\) to remove without degrading the fit of the model fit metric or if there aren’t \(c_k\) left to be removed from the model.

Here’s a breakdown of the BE algorithm:

- Full model with all covariate effects: \(M_{\text{full}}\)
- Set of covariate effects: \(C\)
- Data: \(D\)
- Criterion, a function that takes a model along with data, and outputs a real number (with the assumption that lower is better): \(f: (M, D) \to \mathbb{R}\)

- Best model with a subset of covariate effects \(C^*\) according to criterion \(f\) evaluated on data \(D\)

`define`

\(M = M_{\text{full}}\)`define`

\(f_m = f(M, D)\)`for`

every \(c \in C\):`define`

\(M_c = M - c\)`define`

\(f_c = f(M_c, D)\)`if`

\(f_c < f_m\)`define`

\(M_{\text{best}} = M_c\)`define`

\(c_{\text{best}} = c\)`else`

`unset`

\(c_{\text{best}}\)

`if`

\(c_{\text{best}}\) is`defined`

`define`

\(M = M_{\text{best}}\)`define`

\(C = C \setminus \{ c_{\text{best}} \}\)`goto`

step 2

define \(C^* = C^\complement\)

`return`

\(M\) and \(C^*\)

In the procedures above, we start by defining the initial model \(M\) as the full model with all covariate effects \(M_{\text{full}}\). Then we proceed by defining the initial criterion \(f_m\) as the criterion \(f\) of the initial model \(M\) and data \(D\). Next, we do a for loop on every covariate effect \(c\) in the set of all covariate effects \(C\), where we:

define the current model \(M_c\) as the model \(M\) without the covariate effect \(c\), that is \(M - c\).

define the criterion of the current model \(f_c\) as the criterion \(f\) applied to the current model \(M_c\) and data \(D\).

if the criterion improved, i.e \(f_c < f_m\), then:

- define the new best model \(M_{\text{best}}\) as the current model \(M_c\).
- define the new best covariate effect \(c_{\text{best}}\) as the covariate effect \(c\).

if we have a best covariate effect \(c_{\text{best}}\), then:

- define the new “initial model” \(M\) as the best model \(M_{\text{best}}\).
- define the new set of covariate effects as the set \(C\) minus best covariate effect \(c_{\text{best}}\)
- go to the step 2 and resume from there.

define the new set of covariate effects \(C^*\) as the set complement \(C^\complement\) of the set \(C\)

return the final “initial model” \(M\) and the new set of covariate effects \(C^*\)

\(C^\complement\) is the set complement of the set \(C\).

\[C = C^{\complement^\complement}\]

## 2 The `covariate_select`

function

All stepwise covariate selection technique in Pumas uses the `covariate_select`

function. It has the following function signature:

```
covariate_select(
model,
population,
param,
approx;
control_param,= aic,
criterion = CovariateSelection.Forward,
method
fit_options, )
```

There are positional arguments and keyword arguments.

`model`

: a Pumas model`population`

: a`Population`

that is commonly obtained from`read_pumas`

`param`

: a set of initial parameter estimates`approx`

: a likelihood approximation method, e.g.`FOCE()`

,`LaplaceI()`

, etc.

These are exactly the same positional arguments that we pass to the `fit`

function to perform a model fit procedure.

The keyword arguments we have some new concepts:

`control_param`

: this is a tuple of parameters that encapsulates unique covariate effects such that, if we restrain their values to \(0\), we effectively eliminate any influence of the covariate effects in our model.`control_param`

can be thought as a list of covariate effects that we want to screen in our covariate selection routine.`criterion`

: Criterion for covariate selection. This needs to be a function that accepts a Pumas model and returns a real value. If you don’t provide a function Pumas will, by default, use the`aic`

function. Note that`criterion`

has the assumption that**lower is better**. Hence, you’ll need to take that into account when supplying a different function or defining your own user-defined function.`method`

: the desired method to be used for the stepwise covariate selection procedure. The default is`CovariateSelection.Forward`

which stands for Forward Selection (FS). You can also opt for`CovariateSelection.Backward`

if you want to use Backward Elimination (BE)`fit_options`

: a`NamedTuple`

of options that will be used during the model fit procedures that`covariate_select`

will perform. These will be the keyword arguments available for the`fit`

function, e.g.`(; constantcoef = (; myparam = 1.0))`

.

## 3 Example in Pumas

Let’s showcase an example that builds from the model we used in Covariate Selection Methods - Introduction.

First, let’s import the following packages:

`using PharmaDatasets`

We are going to use the `po_sad_1`

dataset from `PharmaDatasets`

:

```
= dataset("po_sad_1")
df first(df, 5)
```

Row | id | time | dv | amt | evid | cmt | rate | age | wt | doselevel | isPM | isfed | sex | route |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Int64 | Float64 | Float64? | Float64? | Int64 | Int64? | Float64 | Int64 | Int64 | Int64 | String3 | String3 | String7 | String3 | |

1 | 1 | 0.0 | missing | 30.0 | 1 | 1 | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

2 | 1 | 0.25 | 35.7636 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

3 | 1 | 0.5 | 71.9551 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

4 | 1 | 0.75 | 97.3356 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

5 | 1 | 1.0 | 128.919 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

This is an oral dosing (`route = "ev"`

) NMTRAN-formatted dataset. It has 18 subjects, each with 1 dosing event (`evid = 1`

) and 18 measurement events (`evid = 0`

); and the following covariates:

`age`

: age in years (continuous)`wt`

: weight in kg (continuous)`doselevel`

: dosing amount, either`30`

,`60`

or`90`

milligrams (categorical)`isPM`

: subject is a**p**oor**m**etabolizer (binary)`isfed`

: subject is**fed**(binary)`sex`

: subject sex (binary)

Let’s parse `df`

into a `Population`

with `read_pumas`

:

```
=
population read_pumas(df; observations = [:dv], covariates = [:wt, :isPM, :isfed], route = :route)
```

```
Population
Subjects: 18
Covariates: wt, isPM, isfed
Observations: dv
```

We’ll use the same 2-compartment oral absorption model as in Covariate Selection Methods - Introduction, but now adding additional covariates and covariate effects:

```
= @model begin
model @metadata begin
= "full covariate model"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvc """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvq """
Fed - Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0)
tvkafed """
Fasted - Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0)
tvkafasted """
Power exponent on weight for Clearance
"""
∈ RealDomain()
dwtcl """
Proportional change in CL due to PM
"""
∈ RealDomain(; lower = -1, upper = 1)
ispmoncl """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(5)
Ω """
Proportional RUV (SD scale)
"""
∈ RealDomain(; lower = 0)
σₚ end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Poor Metabolizer
"""
isPM"""
Fed status
"""
isfed"""
Weight (kg)
"""
wtend
@pre begin
= tvcl * (wt / 70)^dwtcl * (1 + ispmoncl * (isPM == "yes")) * exp(η[1])
CL = tvvc * (wt / 70) * exp(η[2])
Vc = (tvkafed + (isfed == "no") * (tvkafasted)) * exp(η[3])
Ka = tvq * (wt / 70)^0.75 * exp(η[4])
Q = tvvp * (wt / 70) * exp(η[5])
Vp end
@dynamics Depots1Central1Periph1
@derived begin
:= @. 1_000 * (Central / Vc)
cp """
DrugY Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σₚ)
dv end
end
```

```
PumasModel
Parameters: tvcl, tvvc, tvvp, tvq, tvkafed, tvkafasted, dwtcl, ispmoncl, Ω, σₚ
Random effects: η
Covariates: isPM, isfed, wt
Dynamical variables: Depot, Central, Peripheral
Derived: dv
Observed: dv
```

The underlying Pumas model that `covariate_select`

will use needs to include **all** of the covariate effects to be tested. This is different than just specifying particular covariates in individual models for a more manual process. Note that one covariate could have more than one effect.

During each iteration of BE, `covariate_select`

will set the covariate effect being evaluated for elimination from the model to zero. Each of the covariate effects should be totally controlled by a respective parameter defined in the `@param`

block. The model should be coded in such way that, if the corresponding parameter value is set to zero, then the covariate effect is completely removed from the model.

In the model above, we are defining the following parameters in the `@param`

block with respect to covariate effects:

`tvkafasted`

: typical value of the magnitude of the`isfed`

covariate effect on`tvka`

`dwtcl`

: typical value of the magnitude of the`wt`

covariate efffect on`tvcl`

`ispmoncl`

: typical value of the magnitude of the`isPM`

covariate efffect on`tvcl`

These will be the parameters that we will be testing for the covariate selection.

Further down in the `@pre`

block we see that if we set any of these parameters value to zero we will completely remove the covariate effect from the model. For the case of `tvkafasted`

, this expression in the `@pre`

block:

`= (tvkafed + (isfed == "no") * (tvkafasted)) * exp(η[3]) Ka `

shows us that if `tvkafasted`

is zero, then the whole expression `(isfed == "no")*(tvkafasted)`

evaluates to zero, and we are only left with:

`= tvkafed * exp(η[3]) Ka `

which removes the effect of `tvkafasted`

from the value of Ka.

For the case of `dwtcl`

and `ispmoncl`

, we have the expression:

`= tvcl * (wt / 70)^dwtcl * (1 + ispmoncl * (isPM == "yes")) * exp(η[1]) CL `

in the `@pre`

block.

- if
`dwtcl`

is set to zero, then`(wt / 70)^dwtcl`

evaluates to one. This would remove the`wt`

effect on`tvcl`

. - if
`ispmoncl`

is set to zero, then`ispmoncl * (isPM == "yes")`

evaluates to zero. This would remove the`isPM`

effect on`tvcl`

.

Note that it is the effect size parameter (i.e., dwtcl or ispmoncl), that is set to zero and effectively removes the covariate effect, whether continuous or categorical.

With the model done, now we define our initial parameters:

```
= (;
iparams = 0.4,
tvkafed = 1.5,
tvkafasted = 4.0,
tvcl = 70.0,
tvvc = 4.0,
tvq = 50.0,
tvvp = 0.75,
dwtcl = -0.7,
ispmoncl = Diagonal(fill(0.04, 5)),
Ω = 0.1,
σₚ )
```

```
(tvkafed = 0.4,
tvkafasted = 1.5,
tvcl = 4.0,
tvvc = 70.0,
tvq = 4.0,
tvvp = 50.0,
dwtcl = 0.75,
ispmoncl = -0.7,
Ω = [0.04 0.0 … 0.0 0.0; 0.0 0.04 … 0.0 0.0; … ; 0.0 0.0 … 0.04 0.0; 0.0 0.0 … 0.0 0.04],
σₚ = 0.1,)
```

### 3.1 BE with Different Metrics

Finally, we can run the covariate selection procedure.

We’ll use `FOCE()`

as our estimation method. For the `control_param`

tuple, we include all of the covariate effects we mentioned above: `dwtcl`

, `tvkafasted`

and `ispmoncl`

. These will be the parameters that we’ll test for covariate selection.

We will now choose the criterion as BIC (`bic`

) and the Backward Elimination (BE) method (`CovariateSelection.Backward`

):

```
= covariate_select(
covar_result
model,
population,
iparams,FOCE();
= (:dwtcl, :tvkafasted, :ispmoncl),
control_param = bic,
criterion = CovariateSelection.Backward,
method )
```

```
[ Info: fitting model with zero restrictions on:
[ Info: criterion: 2691.963853389571
[ Info: fitting model with zero restrictions on: dwtcl
[ Info: criterion: 2714.4999368673703
[ Info: fitting model with zero restrictions on: tvkafasted
[ Info: criterion: 2724.047178565395
[ Info: fitting model with zero restrictions on: ispmoncl
[ Info: criterion: 2725.0775832771033
[ Info: current best model has zero restrictions on dwtcl
[ Info: criterion did not improve, stopping!
┌ Info: final best model summary
│ best_criterion = 2691.963853389571
└ best_zero_restrictions = ()
```

Pumas.CovariateSelectionResults Criterion: bic Method: Backward Best model: Unrestricted parameters: dwtcl, tvkafasted, ispmoncl Zero restricted parameters: Criterion value: 2691.96 Fitted model summary: 4×4 DataFrame Row │ free_parameters zero_parameters fitted_model ⋯ │ Tuple… Tuple… FittedPumas… ⋯ ─────┼────────────────────────────────────────────────────────────────────────── 1 │ (:dwtcl, :tvkafasted, :ispmoncl) () FittedPumasModel(... ⋯ 2 │ (:tvkafasted, :ispmoncl) (:dwtcl,) FittedPumasModel(... 3 │ (:dwtcl, :ispmoncl) (:tvkafasted,) FittedPumasModel(... 4 │ (:dwtcl, :tvkafasted) (:ispmoncl,) FittedPumasModel(... 2 columns omitted

As you can see, `covariate_select`

, upon completion, prints its result.

Now, let’s showcase how you can use a custom `criterion`

.

Here’s an example of a function that takes a fitted Pumas model, `m`

, and outputs the -2LL (minus 2 times log-likelihood) without the constant. This is a model fit measure commonly used by NONMEM users and is is known as OFV: **O**bjective **F**unction **V**alue.

Hence, we will name the function `ofv`

:

`ofv(m) = (-2 * loglikelihood(m)) - (nobs(m) * log(2π))`

`ofv (generic function with 1 method)`

Now let’s run `covariate_select`

again but with `ofv`

as a criterion:

```
= covariate_select(
covar_result_ofv
model,
population,
iparams,FOCE();
= (:dwtcl, :tvkafasted, :ispmoncl),
control_param = ofv,
criterion = CovariateSelection.Backward,
method )
```

```
[ Info: fitting model with zero restrictions on:
[ Info: criterion: 2117.35913803307
[ Info: fitting model with zero restrictions on: dwtcl
[ Info: criterion: 2145.4936434698684
[ Info: fitting model with zero restrictions on: tvkafasted
[ Info: criterion: 2155.040885167893
[ Info: fitting model with zero restrictions on: ispmoncl
[ Info: criterion: 2156.071289879602
[ Info: current best model has zero restrictions on dwtcl
[ Info: criterion did not improve, stopping!
┌ Info: final best model summary
│ best_criterion = 2117.35913803307
└ best_zero_restrictions = ()
```

Pumas.CovariateSelectionResults Criterion: ofv Method: Backward Best model: Unrestricted parameters: dwtcl, tvkafasted, ispmoncl Zero restricted parameters: Criterion value: 2117.36 Fitted model summary: 4×4 DataFrame Row │ free_parameters zero_parameters fitted_model ⋯ │ Tuple… Tuple… FittedPumas… ⋯ ─────┼────────────────────────────────────────────────────────────────────────── 1 │ (:dwtcl, :tvkafasted, :ispmoncl) () FittedPumasModel(... ⋯ 2 │ (:tvkafasted, :ispmoncl) (:dwtcl,) FittedPumasModel(... 3 │ (:dwtcl, :ispmoncl) (:tvkafasted,) FittedPumasModel(... 4 │ (:dwtcl, :tvkafasted) (:ispmoncl,) FittedPumasModel(... 2 columns omitted

As you can see, this works as expected and returns the same best model as the BIC approach.

First, we have the best model that was found with its parameters and criterion value (showcasing the `bic`

case).

Next, we have a `DataFrame`

-rendered table with the iterations that BE tested as the rows, and the following columns:

`free_parameters`

: covariate effects that were allowed to influence the model, i.e. were*not*forced to zero.`zero_parameters`

: covariate effects that were*not*allowed to influence the model, i.e. were forced to zero.`fitted_model`

: the underlying Pumas’ model fit`criterion`

: criterion value (`bic`

in our case)

We can access the results of the covariate selection procedure by accessing the `fits`

field of the `covar_result`

object, and convert it to a `DataFrame`

in order to perform any data wrangling or plotting.

Here’s an example of how to sort the results by the `criterion`

value:

`sort(DataFrame(covar_result.fits), :criterion)`

Row | free_parameters | zero_parameters | fitted_model | criterion |
---|---|---|---|---|

Tuple… | Tuple… | FittedPu… | Float64 | |

1 | (:dwtcl, :tvkafasted, :ispmoncl) | () | FittedPumasModel(...) | 2691.96 |

2 | (:tvkafasted, :ispmoncl) | (:dwtcl,) | FittedPumasModel(...) | 2714.5 |

3 | (:dwtcl, :ispmoncl) | (:tvkafasted,) | FittedPumasModel(...) | 2724.05 |

4 | (:dwtcl, :tvkafasted) | (:ispmoncl,) | FittedPumasModel(...) | 2725.08 |

As you can see this puts the “best” model (according to the `method`

and `criterion`

) in the first row.

As you can see, after running the covariate selection procedure, we get a summary of the best model found and the fitted model for each step of the procedure.

You can recover the best model fit by accessing the `best_model`

field of the `covar_result`

object:

` covar_result.best_model`

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: -1306.793
Number of subjects: 18
Number of parameters: Fixed Optimized
0 14
Observation records: Active Missing
dv: 270 0
Total: 270 0
-------------------------
Estimate
-------------------------
tvcl 3.6888
tvvc 70.043
tvvp 48.707
tvq 4.3511
tvkafed 0.41113
tvkafasted 1.0894
dwtcl 0.81858
ispmoncl -0.62382
Ω₁,₁ 0.022271
Ω₂,₂ 0.050184
Ω₃,₃ 0.031199
Ω₄,₄ 0.10595
Ω₅,₅ 0.035064
σₚ 0.098043
-------------------------
```