```
using Pumas
using PharmaDatasets
```

# Model Selection Methods - Mixed Routines

As stated in Introduction to Covariate Selection Methods, broadly, there are two main types of **S**tepwise **C**ovariate **M**odel (SCM):

**Forward Selection**(FS)**Backward Elimination**(BE)

In this tutorial we’ll explore a mixed routine based on both **Forward Selection** (FS) and **Backward Elimination** (BE).

## 1 Mixed Routines

One common mixed routine is to first perform a FS, followed by a BE. In all of these SCM routines, we’ll use the following criterion:

\[\operatorname{criterion}(f_p, f_b) = \operatorname{LRT}(f_p, f_b, \alpha)\]

where:

- \(f_p\) is the “proposed” model fit at the current iteration
- \(f_b\) is the model fit at the previous iteration (i.e. the current “best model”)
- \(\operatorname{LRT}\) is the likelihood-ratio test between models \(f_p\) and \(f_m\) using the \(\alpha\) value as the significance threshold

We’ll use different \(\alpha\) values depending on stage of the analysis. Common values are:

- FS: \(\alpha = 0.01\)
- BE: \(\alpha = 0.001\)

The rationale is that a less strict criterion value is used when adding variables during FS, and a more stringent criterion is used for allowing covariates to remain in the model during BE.

## 2 Example in Pumas

Let’s showcase an example that builds from the model we used in Covariate Selection Methods - Introduction.

First, let’s import the following packages:

We are going to use the `po_sad_1`

dataset from `PharmaDatasets`

:

```
= dataset("po_sad_1")
df first(df, 5)
```

Row | id | time | dv | amt | evid | cmt | rate | age | wt | doselevel | isPM | isfed | sex | route |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Int64 | Float64 | Float64? | Float64? | Int64 | Int64? | Float64 | Int64 | Int64 | Int64 | String3 | String3 | String7 | String3 | |

1 | 1 | 0.0 | missing | 30.0 | 1 | 1 | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

2 | 1 | 0.25 | 35.7636 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

3 | 1 | 0.5 | 71.9551 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

4 | 1 | 0.75 | 97.3356 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

5 | 1 | 1.0 | 128.919 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |

This is an oral dosing (`route = "ev"`

) NMTRAN-formatted dataset. It has 18 subjects, each with 1 dosing event (`evid = 1`

) and 18 measurement events (`evid = 0`

); and the following covariates:

`age`

: age in years (continuous)`wt`

: weight in kg (continuous)`doselevel`

: dosing amount, either`30`

,`60`

or`90`

milligrams (categorical)`isPM`

: subject is a**p**oor**m**etabolizer (binary)`isfed`

: subject is**fed**(binary)`sex`

: subject sex (binary)

```
=
population read_pumas(df; observations = [:dv], covariates = [:wt, :isPM, :isfed], route = :route)
```

```
Population
Subjects: 18
Covariates: wt, isPM, isfed
Observations: dv
```

We’ll use the same 2-compartment oral absorption model as in Covariate Selection Methods - Introduction, but now adding additional covariates and covariate effects:

```
= @model begin
model @metadata begin
= "full covariate model"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvc """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvq """
Fed - Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0)
tvkafed """
Fasted - Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0)
tvkafasted """
Power exponent on weight for Clearance
"""
∈ RealDomain()
dwtcl """
Proportional change in CL due to PM
"""
∈ RealDomain(; lower = -1, upper = 1)
ispmoncl """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(5)
Ω """
Proportional RUV (SD scale)
"""
∈ RealDomain(; lower = 0)
σₚ end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Poor Metabolizer
"""
isPM"""
Fed status
"""
isfed"""
Weight (kg)
"""
wtend
@pre begin
= tvcl * (wt / 70)^dwtcl * (1 + ispmoncl * (isPM == "yes")) * exp(η[1])
CL = tvvc * (wt / 70) * exp(η[2])
Vc = (tvkafed + (isfed == "no") * (tvkafasted)) * exp(η[3])
Ka = tvq * (wt / 70)^0.75 * exp(η[4])
Q = tvvp * (wt / 70) * exp(η[5])
Vp end
@dynamics Depots1Central1Periph1
@derived begin
:= @. 1_000 * (Central / Vc)
cp """
DrugY Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σₚ)
dv end
end
```

```
PumasModel
Parameters: tvcl, tvvc, tvvp, tvq, tvkafed, tvkafasted, dwtcl, ispmoncl, Ω, σₚ
Random effects: η
Covariates: isPM, isfed, wt
Dynamical variables: Depot, Central, Peripheral
Derived: dv
Observed: dv
```

With the model done, now we define our initial parameters:

```
= (;
iparams = 0.4,
tvkafed = 1.5,
tvkafasted = 4.0,
tvcl = 70.0,
tvvc = 4.0,
tvq = 50.0,
tvvp = 0.75,
dwtcl = -0.7,
ispmoncl = Diagonal(fill(0.04, 5)),
Ω = 0.1,
σₚ )
```

```
(tvkafed = 0.4,
tvkafasted = 1.5,
tvcl = 4.0,
tvvc = 70.0,
tvq = 4.0,
tvvp = 50.0,
dwtcl = 0.75,
ispmoncl = -0.7,
Ω = [0.04 0.0 … 0.0 0.0; 0.0 0.04 … 0.0 0.0; … ; 0.0 0.0 … 0.04 0.0; 0.0 0.0 … 0.0 0.04],
σₚ = 0.1,)
```

Now we create our criterion:

`lrt_criterion(f; α = 0.01) = dof(f) * cquantile(Chisq(1), α) - (2 * loglikelihood(f))`

`lrt_criterion (generic function with 1 method)`

`lrt_criterion`

takes a model fit `f`

and a desired \(\alpha\) value `α`

. It calculates the LRT by first multiplying the model’s degree-of-freedom (`dof`

) by the complementary quantile (`cquantile`

) of the \(\alpha\) value from a \(\chi^2\) distribution with one degree of freedom. Then, it subtracts from this value the model fit -2LL (minus 2 times log-likelihood) with constant. We use one degree-of-freedom because our SCM methods as currently implemented for both FS and BE, always add or remove a single covariate effect in each model evaluation.

First, we’ll run `covariate_select`

with `lrt_criterion`

as a criterion and the FS method. Finally, we run `covariate_select`

with `lrt_criterion`

as a criterion and the BE method.

Our `α = 0.01`

, so we use the default value defined in `lrt_criterion`

function.

```
= covariate_select(
covar_result_fs
model,
population,
iparams,FOCE();
= (:dwtcl, :tvkafasted, :ispmoncl),
control_param = lrt_criterion,
criterion = CovariateSelection.Forward,
method )
```

```
[ Info: fitting model with zero restrictions on: dwtcl, tvkafasted, ispmoncl
[ Info: criterion: 2765.9438260321062
[ Info: fitting model with zero restrictions on: tvkafasted, ispmoncl
[ Info: criterion: 2768.8716816089895
[ Info: fitting model with zero restrictions on: dwtcl, ispmoncl
[ Info: criterion: 2735.6913640080456
[ Info: fitting model with zero restrictions on: dwtcl, tvkafasted
[ Info: criterion: 2758.2100172199457
[ Info: current best model has zero restrictions on dwtcl, ispmoncl
[ Info: criterion improved, continuing!
[ Info: fitting model with zero restrictions on: ispmoncl
[ Info: criterion: 2738.5517536234006
[ Info: fitting model with zero restrictions on: dwtcl
[ Info: criterion: 2727.974107213667
[ Info: current best model has zero restrictions on dwtcl
[ Info: criterion improved, continuing!
[ Info: fitting model with zero restrictions on:
[ Info: criterion: 2706.47449837789
[ Info: current best model has zero restrictions on
[ Info: criterion improved, continuing!
┌ Info: final best model summary
│ best_criterion = 2706.47449837789
└ best_zero_restrictions = ()
```

Pumas.CovariateSelectionResults Criterion: lrt_criterion Method: Forward Best model: Unrestricted parameters: dwtcl, tvkafasted, ispmoncl Zero restricted parameters: Criterion value: 2706.47 Fitted model summary: 7×4 DataFrame Row │ free_parameters zero_parameters fit ⋯ │ Tuple… Tuple… Fit ⋯ ─────┼────────────────────────────────────────────────────────────────────────── 1 │ () (:dwtcl, :tvkafasted, :ispmoncl) Fit ⋯ 2 │ (:dwtcl,) (:tvkafasted, :ispmoncl) Fit 3 │ (:tvkafasted,) (:dwtcl, :ispmoncl) Fit 4 │ (:ispmoncl,) (:dwtcl, :tvkafasted) Fit 5 │ (:dwtcl, :tvkafasted) (:ispmoncl,) Fit ⋯ 6 │ (:tvkafasted, :ispmoncl) (:dwtcl,) Fit 7 │ (:dwtcl, :tvkafasted, :ispmoncl) () Fit 2 columns omitted

Now we extract the best model from the `covar_result`

by accessing the `best_model`

field. That gives us a model fit, which we can use to extract the best model:

`= covar_result_fs.best_model.model best_model_fs `

```
PumasModel
Parameters: tvcl, tvvc, tvvp, tvq, tvkafed, tvkafasted, dwtcl, ispmoncl, Ω, σₚ
Random effects: η
Covariates: isPM, isfed, wt
Dynamical variables: Depot, Central, Peripheral
Derived: dv
Observed: dv
```

Our `α = 0.001`

, so need to override the default value defined in `lrt_criterion`

function.

```
= covariate_select(
covar_result_bs
best_model_fs,
population,
iparams,FOCE();
= (:dwtcl, :tvkafasted, :ispmoncl),
control_param = x -> lrt_criterion(x; α = 0.001),
criterion = CovariateSelection.Backward,
method )
```

```
[ Info: fitting model with zero restrictions on:
[ Info: criterion: 2765.171872352871
[ Info: fitting model with zero restrictions on: dwtcl
[ Info: criterion: 2782.478811619007
[ Info: fitting model with zero restrictions on: tvkafasted
[ Info: criterion: 2792.0260533170317
[ Info: fitting model with zero restrictions on: ispmoncl
[ Info: criterion: 2793.0564580287405
[ Info: current best model has zero restrictions on dwtcl
[ Info: criterion did not improve, stopping!
┌ Info: final best model summary
│ best_criterion = 2765.171872352871
└ best_zero_restrictions = ()
```

Pumas.CovariateSelectionResults Criterion: #16 Method: Backward Best model: Unrestricted parameters: dwtcl, tvkafasted, ispmoncl Zero restricted parameters: Criterion value: 2765.17 Fitted model summary: 4×4 DataFrame Row │ free_parameters zero_parameters fitted_model ⋯ │ Tuple… Tuple… FittedPumas… ⋯ ─────┼────────────────────────────────────────────────────────────────────────── 1 │ (:dwtcl, :tvkafasted, :ispmoncl) () FittedPumasModel(... ⋯ 2 │ (:tvkafasted, :ispmoncl) (:dwtcl,) FittedPumasModel(... 3 │ (:dwtcl, :ispmoncl) (:tvkafasted,) FittedPumasModel(... 4 │ (:dwtcl, :tvkafasted) (:ispmoncl,) FittedPumasModel(... 2 columns omitted

`x -> fun(x; ...)`

is an anonymous function. Since we need to override the default keyword argument `α`

in `lrt_criterion`

for the backward elimination pass, we cannot use:

`= lrt_criterion criterion `

We need to use:

`= x -> lrt_criterion(x; α = 0.001) criterion `

As you can see, the results of this mixed routine: FS followed by BE process; results in the same final model as the FS and BE processes used in the Covariate Selection Methods - Forward Selection and Covariate Selection Methods - Backward Elimination.

` covar_result_bs.best_model`

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: -1306.793
Number of subjects: 18
Number of parameters: Fixed Optimized
0 14
Observation records: Active Missing
dv: 270 0
Total: 270 0
-------------------------
Estimate
-------------------------
tvcl 3.6888
tvvc 70.043
tvvp 48.707
tvq 4.3511
tvkafed 0.41113
tvkafasted 1.0894
dwtcl 0.81858
ispmoncl -0.62382
Ω₁,₁ 0.022271
Ω₂,₂ 0.050184
Ω₃,₃ 0.031199
Ω₄,₄ 0.10595
Ω₅,₅ 0.035064
σₚ 0.098043
-------------------------
```