using Pumas
using PharmaDatasets

Covariate Selection Methods - Introduction
In pharmacometric workflows, we often have competing models to select from. In this tutorial we will review selection criteria and automated procedures to select the best model out of a set of competing candidate models.
First, we’ll review how to measure model fit, then we’ll cover model selection algorithms.
1 Model Fit Measures
Traditionally in Statistics, model comparison has been done based on a theoretical divergence metric that originates from information theory’s entropy:
\[H = - \operatorname{E}\log(p) = -\sum_i p_i \log(p_i)\]
where \(p_i\) is the probability of occurrence of the \(i\)-th possible value.
We use the \(\log\) scale because it transforms a product of probabilities into a sum, which is both numerically faster and numerically more stable due to the robustness against floating point underflow.
Entropy was the inspiration behind Akaike’s Information Criterion (AIC) (Akaike, 1973):
\[\operatorname{AIC} = -2\log{\hat{\mathcal{L}}} + 2k\]
where \(\hat{\mathcal{L}}\) is the estimated value of the likelihood for a given model and data, and \(k\) is the number of parameters in the model. Generally the likelihood is estimated by maximizing the likelihood function, thus the name maximum likelihood estimation (MLE). The likelihood describes how well the model fits the data, and in certain conditions, can be treated similarly to a probability: higher values means higher plausibility. Hence, models with higher likelihood values demonstrate better fits to the data. Since we are multiplying by a negative value, this means that lower values are preferred.
The \(-2\) was proposed in Akaike’s 1973 original paper to simplify some calculations involving \(\chi^2\) distributions and was kept around since then.
AIC was devised to “punish” model complexity, i.e models that have more parameters to fit to the data. This is why we add \(2\) to the loglikelihood value for every parameter that the model has. Due to the preference of lower AIC values this penalizes models by the number of parameters, while also making it possible to compare models with different complexities.
Building from the AIC, the Bayesian Information Criterion (BIC) (Schwarz, 1978) uses the same idea, but the penalty term is different:
\[\operatorname{BIC} = -2\log{\hat{\mathcal{L}}} + k\log(n)\]
where \(\hat{\mathcal{L}}\) is the estimated value of the likelihood for a given model and data, \(k\) is the model’s number of parameters, and \(n\) is the number of observations. It is called Bayesian because it uses a “Bayesian” argument to derive the punishment term \(k\log(n)\) in the original 1975 paper.
1.1 Example in Pumas
Let’s go over an example of model fit measures in Pumas.
First, let’s import the following packages:
We are going to use the po_sad_1
dataset from PharmaDatasets
:
= dataset("po_sad_1")
df first(df, 5)
Row | id | time | dv | amt | evid | cmt | rate | age | wt | doselevel | isPM | isfed | sex | route |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64? | Float64? | Int64 | Int64? | Float64 | Int64 | Int64 | Int64 | String3 | String3 | String7 | String3 | |
1 | 1 | 0.0 | missing | 30.0 | 1 | 1 | 0.0 | 51 | 74 | 30 | no | yes | male | ev |
2 | 1 | 0.25 | 35.7636 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |
3 | 1 | 0.5 | 71.9551 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |
4 | 1 | 0.75 | 97.3356 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |
5 | 1 | 1.0 | 128.919 | missing | 0 | missing | 0.0 | 51 | 74 | 30 | no | yes | male | ev |
This is an oral dosing (route = "ev"
) NMTRAN-formatted dataset. It has 18 subjects, each with 1 dosing event (evid = 1
) and 18 measurement events (evid = 0
); and the following covariates:
age
: age in years (continuous)wt
: weight in kg (continuous)doselevel
: dosing amount, either30
,60
or90
milligrams (categorical)isPM
: subject is a poor metabolizer (binary)isfed
: subject is fed (binary)sex
: subject sex (binary)
Let’s parse df
into a Population
with read_pumas
:
=
population read_pumas(df; observations = [:dv], covariates = [:wt, :isPM, :isfed], route = :route)
Population
Subjects: 18
Covariates: wt, isPM, isfed
Observations: dv
Let’s create a 2-compartment oral absorption base model with no covariate effects:
= @model begin
base_model @metadata begin
= "base model"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvc """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvq """
Absorption rate constant (1/h)
"""
∈ RealDomain(; lower = 0)
tvka """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(5)
Ω """
Proportional RUV (SD scale)
"""
∈ RealDomain(; lower = 0)
σₚ end
@random begin
~ MvNormal(Ω)
η end
@pre begin
= tvcl * exp(η[1])
CL = tvvc * exp(η[2])
Vc = tvka * exp(η[3])
Ka = tvq * exp(η[4])
Q = tvvp * exp(η[5])
Vp end
@dynamics Depots1Central1Periph1
@derived begin
:= @. 1_000 * (Central / Vc)
cp """
Drug Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σₚ)
dv end
end
PumasModel
Parameters: tvcl, tvvc, tvvp, tvq, tvka, Ω, σₚ
Random effects: η
Covariates:
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: dv
Observed: dv
Let’s go over the model.
In the @metadata
block we are adding a model description and adding information regarding the time units (hours).
Next, we define the model’s parameters in @param
while also prepending them with a string that serves as an annotation for the parameter description. This is helpful for post-processing, since Pumas can use the description instead of the parameter name in tables and figures.
Our η
s are defined in the @random
block and are sampled from a multivariate normal distribution with mean 0
and a positive-diagonal covariance matrix Ω
. We have 5 η
s, one for each PK typical value (also known as θ
s).
We proceed by defining the individual PK parameters in the @pre
block. Each typical value is incremented by the subject’s η
s in a non-linear exponential transformation. This is done to enforce that all individual PK parameters are constrained to being positive. This also has a side effect that the individual PK parameters will be log-normally distributed.
We use the aliased short notation Depots1Central1Periph1
for the ODE system in the @dynamics
. This is equivalent to having the following equations:
' = -Ka * Depot
Depot' = Ka * Depot - (CL + Q) / Vc * Central + Q / Vp * Peripheral
Central' = Q / Vc * Central - Q / Vp * Peripheral Peripheral
Note that, in order for Depots1Central1Periph1
work correctly, we need to define Ka
, CL
, Q
, Vc
, and Vp
in the @pre
block.
Finally, in the @derived
block we define our error model (or likelihood for the statistically-minded). Here we are using a proportional error model with the Gaussian/normal likelihood. Note that Normal
is parameterized with mean and standard deviation, not with variance. That’s why we name our proportional error parameter as σₚ
and not σ²ₚ
.
Let’s now define a initial set of parameter estimates to fit our model:
= (;
iparams = 0.4,
tvka = 4.0,
tvcl = 70.0,
tvvc = 4.0,
tvq = 50.0,
tvvp = Diagonal(fill(0.04, 5)),
Ω = 0.1,
σₚ )
(tvka = 0.4,
tvcl = 4.0,
tvvc = 70.0,
tvq = 4.0,
tvvp = 50.0,
Ω = [0.04 0.0 … 0.0 0.0; 0.0 0.04 … 0.0 0.0; … ; 0.0 0.0 … 0.04 0.0; 0.0 0.0 … 0.0 0.04],
σₚ = 0.1,)
We call the fit
function to estimate the parameters of the model:
= fit(base_model, population, iparams, FOCE()) base_fit
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 1.630402e+03 2.604358e+02
* time: 0.030391931533813477
1 1.499510e+03 9.365700e+01
* time: 1.2553670406341553
2 1.447619e+03 4.714464e+01
* time: 1.3209569454193115
3 1.427906e+03 4.439232e+01
* time: 1.382883071899414
4 1.414326e+03 2.726109e+01
* time: 1.5063579082489014
5 1.387798e+03 1.159019e+01
* time: 1.570739984512329
6 1.382364e+03 7.060796e+00
* time: 1.6344430446624756
7 1.380839e+03 4.839103e+00
* time: 1.7083699703216553
8 1.380281e+03 4.075615e+00
* time: 1.7768120765686035
9 1.379767e+03 3.303901e+00
* time: 1.86576509475708
10 1.379390e+03 2.856359e+00
* time: 1.9193689823150635
11 1.379193e+03 2.650736e+00
* time: 1.976161003112793
12 1.379036e+03 2.523349e+00
* time: 2.041282892227173
13 1.378830e+03 2.638648e+00
* time: 2.122659921646118
14 1.378593e+03 3.463990e+00
* time: 2.17655611038208
15 1.378335e+03 3.471127e+00
* time: 2.232707977294922
16 1.378143e+03 2.756670e+00
* time: 2.2996599674224854
17 1.378019e+03 2.541343e+00
* time: 2.3669750690460205
18 1.377888e+03 2.163251e+00
* time: 2.44222092628479
19 1.377754e+03 2.571076e+00
* time: 2.4995651245117188
20 1.377620e+03 3.370764e+00
* time: 2.5591111183166504
21 1.377413e+03 3.938291e+00
* time: 2.629741907119751
22 1.377094e+03 4.458016e+00
* time: 2.699260950088501
23 1.376674e+03 5.713348e+00
* time: 2.7779159545898438
24 1.375946e+03 5.417530e+00
* time: 2.835129976272583
25 1.375343e+03 5.862876e+00
* time: 2.8954949378967285
26 1.374689e+03 5.717165e+00
* time: 2.9641740322113037
27 1.374056e+03 4.400490e+00
* time: 3.059544086456299
28 1.373510e+03 2.191437e+00
* time: 3.1310389041900635
29 1.373277e+03 1.203587e+00
* time: 3.204134941101074
30 1.373233e+03 1.157761e+00
* time: 3.2833340167999268
31 1.373218e+03 8.770728e-01
* time: 3.361867904663086
32 1.373204e+03 8.021952e-01
* time: 3.4149880409240723
33 1.373190e+03 6.613857e-01
* time: 3.475200891494751
34 1.373183e+03 7.602394e-01
* time: 3.549186944961548
35 1.373173e+03 8.552154e-01
* time: 3.6222569942474365
36 1.373162e+03 6.961928e-01
* time: 3.710840940475464
37 1.373152e+03 3.162546e-01
* time: 3.7762489318847656
38 1.373148e+03 1.747381e-01
* time: 3.836759090423584
39 1.373147e+03 1.258699e-01
* time: 3.8995919227600098
40 1.373147e+03 1.074908e-01
* time: 3.981126070022583
41 1.373147e+03 6.799619e-02
* time: 4.038373947143555
42 1.373147e+03 1.819329e-02
* time: 4.094132900238037
43 1.373147e+03 1.338880e-02
* time: 4.155318021774292
44 1.373147e+03 1.370144e-02
* time: 4.220257997512817
45 1.373147e+03 1.315666e-02
* time: 4.299412965774536
46 1.373147e+03 1.065953e-02
* time: 4.348546981811523
47 1.373147e+03 1.069775e-02
* time: 4.396286964416504
48 1.373147e+03 6.234846e-03
* time: 4.453671932220459
49 1.373147e+03 6.234846e-03
* time: 4.577395915985107
50 1.373147e+03 6.234846e-03
* time: 4.682121992111206
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Closed form
Log-likelihood value: -1373.1468
Number of subjects: 18
Number of parameters: Fixed Optimized
0 11
Observation records: Active Missing
dv: 270 0
Total: 270 0
-------------------
Estimate
-------------------
tvcl 2.8344
tvvc 77.801
tvvp 48.754
tvq 3.9789
tvka 1.028
Ω₁,₁ 0.2638
Ω₂,₂ 0.2288
Ω₃,₃ 0.40047
Ω₄,₄ 0.37968
Ω₅,₅ 0.21495
σₚ 0.097805
-------------------
Now we are ready to showcase model fit measures. All of these functions should take a result from fit
and output a real number.
Let’s start with aic
and bic
which are included in Pumas:
aic(base_fit)
2768.2935804173985
bic(base_fit)
2807.876221966381
We are also free to create our own functions if we want to use something different than aic
or bic
.
Here’s an example of a function that takes a fitted Pumas model, m
, and outputs the -2LL (minus 2 times log-likelihood) without the constant. This is a model fit measure commonly used by NONMEM users and is is known as OFV: Objective Function Value. Hence, we will name the function ofv
:
ofv(m) = (-2 * loglikelihood(m)) - (nobs(m) * log(2π))
ofv (generic function with 1 method)
We can use it on our base_fit
model fit
result:
ofv(base_fit)
2250.0667724868754
2 Likelihood Ratio Tests
A likelihood-ratio test (LRT) is a statistical hypothesis test used in the field of statistics and probability theory to compare two statistical models and determine which one provides a better fit to a given set of observed data. It is particularly useful in the context of maximum likelihood estimation (MLE) and is commonly used for hypothesis testing in parametric statistical modeling.
The basic idea behind the likelihood ratio test is to compare the likelihoods of two competing models:
Null Hypothesis (\(H_0\)): This is the model that you want to test against. It represents a specific set of parameter values or restrictions on the model.
Alternative Hypothesis (\(H_a\)): This is the alternative model, often a more complex one or the one you want to support.
The test statistic is calculated as the ratio of the likelihood under the alternative model (\(H_a\)) to the likelihood under the null model (\(H_0\)). Mathematically, it can be expressed as:
\[\operatorname{LRT} = - 2 \log \left( \frac{\mathcal{L}(H_0)}{\mathcal{L}(H_a)} \right)\]
where:
- \(\operatorname{LRT}\): likelihood ratio test statistic
- \(\mathcal{L}(H_0)\): likelihood under \(H_0\), the likelihood of the data under the null hypothesis
- \(\mathcal{L}(H_a)\): likelihood under \(H_a\), the likelihood of the data under the alternative hypothesis
The LRT statistic follows a \(\chi^2\) (chi-squared) distribution with degrees of freedom equal to the difference in the number of parameters between the two models (i.e., the degrees of freedom is the number of additional parameters in the alternative model). In practice, you compare the LRT statistic to \(\chi^2\) distribution to determine whether the alternative model is a significantly better fit to the data than the null model.
The key idea is that if the p-value derived from the LRT statistic is lower than your desired \(\alpha\) (the type-1 error rate, commonly set to \(0.05\)), you would reject the null hypothesis in favor of the alternative hypothesis, indicating that the alternative model provides a better fit to the data.
The likelihood-ratio test requires that the models be nested, i.e. the more complex model can be transformed into the simpler model by imposing constraints on the former’s parameters.
This is generally the case when performing LRT in a covariate selection context. However, be mindful of not violating this assumption when performing LRT.
2.1 Example in Pumas
Pumas provides us with the lrtest
function to perform LRT. It takes 2 positional arguments as competing models:
- Model under \(H_0\) (i.e. the model with less parameters)
- Model under \(H_a\) (i.e. the model with more parameters)
Let’s define a covariate model that takes wt
into consideration for all the clearance and volume PK parameters:
= @model begin
covariate_model @metadata begin
= "covariate model that uses weight covariate information"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvc """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0)
tvq """
Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0)
tvka """
Power exponent on weight for Clearance # new
"""
∈ RealDomain() # new
dwtcl """
Power exponent on weight for Distributional Clearance # new
"""
∈ RealDomain() # new
dwtq """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(5)
Ω """
Proportional RUV (SD scale)
"""
∈ RealDomain(; lower = 0)
σₚ end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Weight (kg) # new
"""
# new
wt end
@pre begin
= tvcl * exp(η[1]) * (wt / 70)^dwtcl # new
CL = tvvc * exp(η[2]) * (wt / 70) # new
Vc = tvka * exp(η[3])
Ka = tvq * exp(η[4]) * (wt / 70)^dwtq # new
Q = tvvp * exp(η[5]) * (wt / 70) # new
Vp end
@dynamics Depots1Central1Periph1
@derived begin
:= @. 1000 * (Central / Vc)
cp """
Drug Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σₚ)
dv end
end
PumasModel
Parameters: tvcl, tvvc, tvvp, tvq, tvka, dwtcl, dwtq, Ω, σₚ
Random effects: η
Covariates: wt
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: dv
Observed: dv
This is almost the same model as before. However, we are adding a few tweaks (commented with # new
):
wt
in the new@covariates
block- allometric scaling based on
wt
for the individual PK parametersCL
,Q
,Vc
andVp
- new parameters in
@param
for the exponent of the power function ofwt
on both individual clearance PK parametersCL
andQ
Since covariate_model
has two new parameters in the @param
block, we need to add them to the initial set of parameter estimates. We can do this by creating a new NamedTuple
that builts upon the last one iparams
, while also adding initial values for dwtcl
and dwtq
:
= (; iparams..., dwtcl = 0.75, dwtq = 0.75) iparams_covariate
(tvka = 0.4,
tvcl = 4.0,
tvvc = 70.0,
tvq = 4.0,
tvvp = 50.0,
Ω = [0.04 0.0 … 0.0 0.0; 0.0 0.04 … 0.0 0.0; … ; 0.0 0.0 … 0.04 0.0; 0.0 0.0 … 0.0 0.04],
σₚ = 0.1,
dwtcl = 0.75,
dwtq = 0.75,)
We are using Julia’s splatting ...
operator to expand inline the iparams
NamedTuple
.
Now we fit
our covariate_model
:
= fit(covariate_model, population, iparams_covariate, FOCE()) covariate_fit
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 1.555051e+03 2.584685e+02
* time: 7.295608520507812e-5
1 1.436886e+03 9.959639e+01
* time: 0.0535581111907959
2 1.383250e+03 3.318037e+01
* time: 0.10277700424194336
3 1.372961e+03 2.525341e+01
* time: 0.1646270751953125
4 1.365242e+03 2.081002e+01
* time: 0.23254799842834473
5 1.350200e+03 1.667386e+01
* time: 0.37499213218688965
6 1.346374e+03 9.195785e+00
* time: 0.4345109462738037
7 1.344738e+03 8.614309e+00
* time: 0.4897799491882324
8 1.343902e+03 4.950745e+00
* time: 0.546334981918335
9 1.343662e+03 1.478699e+00
* time: 0.6077799797058105
10 1.343626e+03 9.575005e-01
* time: 0.6866040229797363
11 1.343609e+03 8.509968e-01
* time: 0.738084077835083
12 1.343589e+03 7.964671e-01
* time: 0.7896270751953125
13 1.343567e+03 8.202459e-01
* time: 0.853302001953125
14 1.343550e+03 8.133359e-01
* time: 0.9178280830383301
15 1.343542e+03 6.865506e-01
* time: 0.9925949573516846
16 1.343538e+03 3.869567e-01
* time: 1.0407390594482422
17 1.343534e+03 2.805019e-01
* time: 1.0918819904327393
18 1.343531e+03 3.271442e-01
* time: 1.1520581245422363
19 1.343529e+03 4.584302e-01
* time: 1.2129099369049072
20 1.343527e+03 3.951940e-01
* time: 1.289031982421875
21 1.343525e+03 1.928385e-01
* time: 1.3397479057312012
22 1.343524e+03 1.958575e-01
* time: 1.3903720378875732
23 1.343523e+03 2.008844e-01
* time: 1.4434869289398193
24 1.343522e+03 1.636364e-01
* time: 1.496819019317627
25 1.343522e+03 1.041929e-01
* time: 1.5712981224060059
26 1.343521e+03 7.417497e-02
* time: 1.6245670318603516
27 1.343521e+03 7.297961e-02
* time: 1.6764869689941406
28 1.343521e+03 8.109591e-02
* time: 1.7339410781860352
29 1.343520e+03 7.067080e-02
* time: 1.7946460247039795
30 1.343520e+03 5.088025e-02
* time: 1.8732099533081055
31 1.343520e+03 4.980085e-02
* time: 1.9182209968566895
32 1.343520e+03 4.778940e-02
* time: 1.9602279663085938
33 1.343520e+03 5.667067e-02
* time: 2.0063071250915527
34 1.343520e+03 5.825591e-02
* time: 2.06377911567688
35 1.343519e+03 5.354660e-02
* time: 2.1450459957122803
36 1.343519e+03 5.300792e-02
* time: 2.197843074798584
37 1.343519e+03 4.011720e-02
* time: 2.2499709129333496
38 1.343519e+03 3.606197e-02
* time: 2.3055200576782227
39 1.343519e+03 3.546034e-02
* time: 2.365812063217163
40 1.343519e+03 3.525307e-02
* time: 2.41853404045105
41 1.343519e+03 3.468091e-02
* time: 2.4812629222869873
42 1.343519e+03 3.313732e-02
* time: 2.525546073913574
43 1.343518e+03 4.524162e-02
* time: 2.5762641429901123
44 1.343518e+03 5.769309e-02
* time: 2.6345889568328857
45 1.343518e+03 5.716613e-02
* time: 2.693413019180298
46 1.343517e+03 4.600797e-02
* time: 2.7682931423187256
47 1.343517e+03 3.221948e-02
* time: 2.8200759887695312
48 1.343517e+03 2.610758e-02
* time: 2.870687961578369
49 1.343517e+03 2.120270e-02
* time: 2.9235310554504395
50 1.343517e+03 1.887916e-02
* time: 2.9759349822998047
51 1.343517e+03 1.229271e-02
* time: 3.0464630126953125
52 1.343517e+03 4.778802e-03
* time: 3.099393129348755
53 1.343517e+03 2.158460e-03
* time: 3.1487200260162354
54 1.343517e+03 2.158460e-03
* time: 3.2362220287323
55 1.343517e+03 2.158460e-03
* time: 3.356226921081543
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Closed form
Log-likelihood value: -1343.5173
Number of subjects: 18
Number of parameters: Fixed Optimized
0 13
Observation records: Active Missing
dv: 270 0
Total: 270 0
--------------------
Estimate
--------------------
tvcl 2.7287
tvvc 70.681
tvvp 47.396
tvq 4.0573
tvka 0.98725
dwtcl 0.58351
dwtq 1.176
Ω₁,₁ 0.21435
Ω₂,₂ 0.050415
Ω₃,₃ 0.42468
Ω₄,₄ 0.040356
Ω₅,₅ 0.045987
σₚ 0.097904
--------------------
Now we are ready to perform LRT with lrtest
:
= lrtest(base_fit, covariate_fit) mytest
Statistic: 59.3
Degrees of freedom: 2
P-value: 0.0
The degrees of freedom of the underlying \(\chi^2\) distribution is \(2\), i.e. we have two additional parameters in the model under \(H_a\); and the test statistic is \(59.3\).
The \(p\)-value corresponding for the test statistic and degree of freedom is very close to \(0\). It prints as 0.0
, but we can access the value with the pvalue
function:
pvalue(mytest)
1.3554737256701043e-13
This indicates strong evidence against the base_model
(i.e. model under \(H_0\)) and in favor of the covariate_model
(i.e. model under \(H_a\)).
3 Model Selection Algorithms
There are several model selection techniques that take into account covariate selection. In the statistical literature, the reader can check Thayer (1990), and for the pharmacometric context, the reader can check Hutmacher & Kowalski (2015) and Jonsson & Karlsson (1998).
Pumas currently only implements the Stepwise Covariate Model (SCM). SCM, also known as stepwise procedures, is a model building strategy that is used to identify the best covariate model for a given dataset by a series of iterations (Hutmacher & Kowalski, 2015). Broadly, there are two main types of SCM:
- Forward Selection (FS)
- Backward Elimination (BE)
We will be covering these in detail in a new set of tutorials, please check tutorials.pumas.ai.
4 References
Akaike, H. (1973). Information theory and the extension of the maximum likelihood principle. Proceedings of the Second International Symposium on Information Theory.
Hutmacher, M. M., & Kowalski, K. G. (2015). Covariate selection in pharmacometric analyses: a review of methods. British journal of clinical pharmacology, 79(1), 132–147. https://doi.org/10.1111/bcp.12451
Jonsson, E. N., & Karlsson, M. O. (1998). Automated covariate model building within NONMEM. Pharmaceutical research, 15(9), 1463–1468. https://doi.org/10.1023/a:1011970125687
Schwarz, Gideon E. (1978). Estimating the dimension of a model. Annals of Statistics, 6 (2): 461–464, doi:10.1214/aos/1176344136.
Thayer, J. D. (1990). Implementing Variable Selection Techniques in Regression. ERIC.