Why are non-Gaussian random effects relevant?

Authors

Jose Storopoli

Andreas Noack

using Dates
using Pumas
using PumasUtilities
using DataFramesMeta
using PharmaDatasets
using CairoMakie
using AlgebraOfGraphics
using Random

1 Motivation - PK model

Why use a non-Gaussian distribution as the underlying distribution for the random effects? There are a couple of arguments.

First, the Gaussian distribution has unbounded support, i.e. it can take any value in \((-\infty, \infty)\), while pharmacokinetic parameters are typically (semi-)bounded, e.g.:

  • clearance and volumes, \((0, \infty)\)
  • bioavailability, \([0, 1]\)

Additionally, for a Gaussian distribution to work as the underlying distribution, we often need to transform the random effects (e.g. exponentiation or a logistic transformation). These transformations shift the distribution of the individual parameters, so that the expectation \(\operatorname{E}\) of the subject-specific parameter is no longer equal to the typical value (\(\theta\)) unless the random-effect variability is small. For example, the following code block is a traditional 1-compartment PK model with a Gaussian random effect that needs to be constrained to positive values, \((0, \infty)\):

@random begin
    # Gaussian random effects on the log scale
    ηCL ~ Normal(0.0, ωCL)
    ηVc ~ Normal(0.0, ωVc)
end
@pre begin
    # exponentiation constrains CL and Vc to (0, ∞), i.e. they are log-normal
    CL = θCL * exp(ηCL)
    Vc = θVc * exp(ηVc)
end
@dynamics begin
    Central' = -CL / Vc * Central
end

If we recall the formula for the expectation of the log-normal distribution, we have that:

\[\operatorname{E}[CL] = \exp \left\{ \log(\theta_{CL}) + \frac{\omega^2_{CL}}{2} \right\} \approx \theta_{CL}\]

This approximation only holds for small \(\omega_{CL}\).

Hence, \(\theta_{CL}\) is only the typical value when \(\omega_{CL}\) is small.

Here is a small tabulation for \(\operatorname{E}[CL]\) when \(\theta_{CL} = 0.5\):

ωs = [0.1, 0.2, 0.4, 0.8, 1.6]
DataFrame(; ω_CL = ωs, E_CL = (ω -> exp(log(0.5) + ω^2 / 2)).(ωs))
5×2 DataFrame
 Row │ ω_CL     E_CL
     │ Float64  Float64
─────┼───────────────────
   1 │     0.1  0.502506
   2 │     0.2  0.510101
   3 │     0.4  0.541644
   4 │     0.8  0.688564
   5 │     1.6  1.79832

As you can see, the larger the \(\omega_{CL}\), the more \(\operatorname{E}[CL]\) deviates from \(\theta_{CL}\).
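
This bias is easy to confirm by simulation. The following is a minimal sketch (the seed, sample size, and parameter values are arbitrary) that draws a large sample of \(CL = \theta_{CL} \exp(\eta_{CL})\) values and compares the sample mean with the analytic expectation and with \(\theta_{CL}\):

Random.seed!(123)
θCL, ωCL = 0.5, 0.8
# individual clearances under the exponentiated-Gaussian (log-normal) model
CL_samples = θCL .* exp.(rand(Normal(0.0, ωCL), 10^6))
# the sample mean tracks exp(log(θCL) + ωCL^2 / 2) ≈ 0.689, not θCL = 0.5
(sample_mean = mean(CL_samples), analytic = exp(log(θCL) + ωCL^2 / 2))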

1.1 Gamma distribution to the rescue

We can use the gamma distribution, which has the following parametrization:

\[\text{Gamma}(k, \theta)\]

where \(k\) is a shape parameter and \(\theta\) is a scale parameter.

Note

Shape parameters generally control the shape of the distribution, rather than shifting it (as a location parameter) or stretching/shrinking it (as a scale parameter).

We can use an alternative parametrization where the mean value appears directly as a parameter:

\[\text{Gamma}(\mu, \sigma)\]

where:

  • \(\mu = \theta k\)
  • \(\sigma = k^{-\frac{1}{2}}\)
Tip

The \(\sigma\) parameter is the coefficient of variation, i.e.

\[\sigma = \frac{\sqrt{\operatorname{Var} X}}{\operatorname{E} X},\]

because that mimics the role of \(\sigma\) in LogNormal(log(μ), σ), for which, when \(\sigma\) is small,

\[\sigma \approx \frac{\sqrt{\operatorname{Var} X}}{\operatorname{E} X}.\]
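
As a quick numerical check of this \((\mu, \sigma)\)-parametrization (a sketch with illustrative values, relying on the Distributions API that Pumas re-exports), constructing Gamma(1 / σ^2, μ * σ^2) gives back mean \(\mu\) and coefficient of variation \(\sigma\):

μ, σ = 0.5, 0.4
d = Gamma(1 / σ^2, μ * σ^2)   # shape k = 1/σ², scale θ = μσ²
mean(d), std(d) / mean(d)     # returns (0.5, 0.4), i.e. (μ, σ)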

So, our previous PK model now becomes:

@random begin
    # gamma parametrized so that E[_CL] = θCL and E[_Vc] = θVc
    _CL ~ Gamma(1 / ωCL^2, θCL * ωCL^2)
    _Vc ~ Gamma(1 / ωVc^2, θVc * ωVc^2)
end
@pre begin
    # the subject-specific parameters are used directly, no transformation needed
    CL = _CL
    Vc = _Vc
end
@dynamics begin
    Central' = -CL / Vc * Central
end

As you can see, the mean of the gamma distribution becomes:

\[\operatorname{E}[CL] = \theta k = \frac{1}{\omega^2_{CL}} \theta_{CL} \omega^2_{CL} = \theta_{CL}\]

It does not depend on the between-subject variability \(\omega\)!

Note

We are avoiding η notation here since we are modeling the subject-specific parameter directly.
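
If you prefer to keep a multiplicative structure in @pre, an equivalent formulation (a sketch, not fitted in this tutorial) uses unit-mean gamma random effects; since rescaling a gamma random variable rescales its scale parameter, θCL * Gamma(1 / ωCL^2, ωCL^2) has the same distribution as Gamma(1 / ωCL^2, θCL * ωCL^2):

@random begin
    # unit-mean gamma random effects: E[ηCL] = E[ηVc] = 1
    ηCL ~ Gamma(1 / ωCL^2, ωCL^2)
    ηVc ~ Gamma(1 / ωVc^2, ωVc^2)
end
@pre begin
    # still gamma-distributed with means θCL and θVc
    CL = θCL * ηCL
    Vc = θVc * ηVc
end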

1.2 Gamma versus Log-Normal Numerical Simulations

Before we dive into our PK examples, let us compare the gamma and log-normal distributions with some numerical simulations.

First, let’s define a mean value μ_PK for the typical value, along with an array of possible standard deviation values σ:

μ_PK = 1.0
σ = [0.1, 0.2, 0.5, 1.0, 1.5, 2.0]

These will serve as the mean and standard deviations for our gamma and log-normal distributions.

Now let’s compare the coefficient of variation (CV) as a function of σ for LogNormal and Gamma:

num_df_gamma = DataFrame(;
    μ = μ_PK,
    σ = σ,
    meanLogNormal = mean.(LogNormal.(log.(μ_PK), σ)),
    cvLogNormal = std.(LogNormal.(log.(μ_PK), σ)) ./ mean.(LogNormal.(log.(μ_PK), σ)),
    meanGamma = mean.(Gamma.(1 ./ σ .^ 2, μ_PK .* σ .^ 2)),
    cvGamma = std.(Gamma.(1 ./ σ .^ 2, μ_PK .* σ .^ 2)) ./
              mean.(Gamma.(1 ./ σ .^ 2, μ_PK .* σ .^ 2)),
)
6×6 DataFrame
 Row │ μ        σ        meanLogNormal  cvLogNormal  meanGamma  cvGamma
     │ Float64  Float64  Float64        Float64      Float64    Float64
─────┼───────────────────────────────────────────────────────────────────
   1 │     1.0      0.1        1.00501     0.100251        1.0      0.1
   2 │     1.0      0.2        1.0202      0.202017        1.0      0.2
   3 │     1.0      0.5        1.13315     0.53294         1.0      0.5
   4 │     1.0      1.0        1.64872     1.31083         1.0      1.0
   5 │     1.0      1.5        3.08022     2.91337         1.0      1.5
   6 │     1.0      2.0        7.38906     7.32108         1.0      2.0
f, ax, plotobj = lines(
    num_df_gamma.σ,
    num_df_gamma.meanLogNormal;
    label = "μ - LogNormal",
    linewidth = 3,
    axis = (; xlabel = L"\sigma", ylabel = "value"),
)
lines!(
    num_df_gamma.σ,
    num_df_gamma.meanGamma;
    label = "μ - Gamma",
    linewidth = 3,
    linestyle = :dashdot,
)
lines!(num_df_gamma.σ, num_df_gamma.cvLogNormal; label = "CV - LogNormal", linewidth = 3)
lines!(
    num_df_gamma.σ,
    num_df_gamma.cvGamma;
    label = "CV - Gamma",
    linewidth = 3,
    linestyle = :dashdot,
)
axislegend(ax; position = :lt)
f

In the graph above, the dash-dotted lines correspond to the mean and CV of the gamma distribution, whereas the solid lines correspond to the log-normal distribution.

There is clearly a bias in both the log-normal’s mean and CV that we don’t see in the gamma distribution.
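
For reference, the exact expressions behind these curves are standard results for the two distributions in the \((\mu, \sigma)\)-parametrizations used above:

\[\operatorname{E}[X_{\text{LogNormal}}] = \mu \, e^{\sigma^2 / 2}, \quad \operatorname{CV}[X_{\text{LogNormal}}] = \sqrt{e^{\sigma^2} - 1}, \quad \operatorname{E}[X_{\text{Gamma}}] = \mu, \quad \operatorname{CV}[X_{\text{Gamma}}] = \sigma\]

For small \(\sigma\), the log-normal mean and CV reduce to approximately \(\mu\) and \(\sigma\), which is why the curves agree near the origin and diverge as \(\sigma\) grows.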

2 Motivation - Bioavailability

Here is a very common model that can benefit from a non-Gaussian random effects distribution.

The model has one-compartment elimination and oral absorption with modeled bioavailability based on a crossover design.

The following code is a traditional PK model with a Gaussian random effect that needs to be constrained to the unit interval, \([0, 1]\):

@param begin
    θF ∈ RealDomain(lower = 0.0, upper = 1.0)
    ωF ∈ RealDomain(lower = 0.0)
end
@random begin
    ηF ~ Normal(0.0, ωF)
end
@dosecontrol begin
    bioav = (Depot = logistic(logit(θF) + ηF),)
end

The expectation \(\operatorname{E}[F]\) doesn’t have a closed form and is generally different from \(\theta_F\). However, we have that:

\[\operatorname{E}[F] \approx \theta_F\]

when \(\omega_F\) is small. That is, \(\theta_F\) is only the typical value when \(\omega_F\) is small.
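
To see how quickly this approximation degrades, here is a small Monte Carlo sketch (the sample size and grid of \(\omega_F\) values are arbitrary) that estimates \(\operatorname{E}[F]\) for \(\theta_F = 0.7\) under the logit-normal model:

Random.seed!(123)
θF = 0.7
ωFs = [0.1, 0.5, 1.0, 2.0]
# Monte Carlo estimate of E[logistic(logit(θF) + ηF)] with ηF ~ Normal(0, ωF)
DataFrame(;
    ωF = ωFs,
    E_F = [mean(logistic.(logit(θF) .+ rand(Normal(0.0, ωF), 10^6))) for ωF in ωFs],
)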

2.1 Beta versus Logit-Normal Numerical Simulations

Let’s perform the same type of simulations as before, but now using the numerical integrator quadgk from the QuadGK.jl package, because we don’t have a closed-form solution for \(\operatorname{E}[F]\) under the logit-normal parameterization.

using QuadGK: quadgk
μ_bioav = 0.7

We’ll also reuse the same σ values as before (for the beta distribution they enter through \(n = 1/\sigma\); only the means are compared here).

num_df_beta = DataFrame(;
    μ = μ_bioav,
    σ = σ,
    meanLogitNormal = map(
        σ -> quadgk(
            t -> logistic(t) * pdf(Normal(logit(μ_bioav), σ), t),
            -100 * σ,
            100 * σ,
        )[1],
        σ,
    ),
    meanBeta = mean.(Beta.(μ_bioav ./ σ, (1 - μ_bioav) ./ σ)),
)
6×4 DataFrame
 Row │ μ        σ        meanLogitNormal  meanBeta
     │ Float64  Float64  Float64          Float64
─────┼─────────────────────────────────────────────
   1 │     0.7      0.1         0.699582       0.7
   2 │     0.7      0.2         0.698345       0.7
   3 │     0.7      0.5         0.690393       0.7
   4 │     0.7      1.0         0.668971       0.7
   5 │     0.7      1.5         0.646064       0.7
   6 │     0.7      2.0         0.626038       0.7
f, ax, plotobj = lines(
    num_df_beta.σ,
    num_df_beta.meanLogitNormal;
    label = "μ - LogitNormal",
    linewidth = 3,
    axis = (; xlabel = L"\sigma", ylabel = "value"),
)
lines!(
    num_df_beta.σ,
    num_df_beta.meanBeta;
    label = "μ - Beta",
    linewidth = 3,
    linestyle = :dashdot,
)
axislegend(ax; position = :lb)
f

In the graph above, the dash-dotted line corresponds to the mean of the beta distribution, whereas the solid line corresponds to the mean of the logit-normal distribution.

As before, there is clearly a bias in the logit-normal’s mean that we don’t see in the beta distribution.

3 Warfarin data

We’ll demonstrate those intuitions using the Warfarin dataset.

pop = read_pumas(dataset("pumas/warfarin"))
Population
  Subjects: 32
  Observations: dv
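
If you want to peek at the raw data behind this Population, dataset from PharmaDatasets (already loaded above) returns a plain DataFrame, so a quick look is just (a sketch; output not shown):

first(dataset("pumas/warfarin"), 6)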

4 Models and Simulations

Here we will provide a Gaussian and a non-Gaussian approach for:

  • PK IV 1-compartment model fit for the Warfarin dataset
  • Bioavailability parallel absorption model simulation

4.1 Warfarin Gaussian and non-Gaussian PK model

The first model is a simple 1-compartment PK IV model with proportional error. This is for the Gaussian versus gamma random effects:

model_lognormal = @model begin
    @param begin
        θCL ∈ RealDomain(; lower = 0)
        θVc ∈ RealDomain(; lower = 0)

        ωCL ∈ RealDomain(; lower = 0)
        ωVc ∈ RealDomain(; lower = 0)

        σ ∈ RealDomain(; lower = 0)
    end
    @random begin
        _CL ~ LogNormal(log(θCL), ωCL)
        _Vc ~ LogNormal(log(θVc), ωVc)
    end
    # This is equivalent to defining
    # CL = θCL*exp(ηCL)
    # with
    # ηCL ~ Normal(0, ωCL)
    @pre begin
        CL = _CL
        Vc = _Vc
    end
    @dynamics Central1
    @derived begin
        μ := @. Central / Vc
        dv ~ @. Normal(μ, μ * σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: _CL, _Vc
  Covariates:
  Dynamical system variables: Central
  Dynamical system type: Closed form
  Derived: dv
  Observed: dv
model_gamma = @model begin
    @param begin
        θCL ∈ RealDomain(; lower = 0)
        θVc ∈ RealDomain(; lower = 0)

        ωCL ∈ RealDomain(; lower = 0)
        ωVc ∈ RealDomain(; lower = 0)

        σ ∈ RealDomain(; lower = 0)
    end
    @random begin
        _CL ~ Gamma(1 / ωCL^2, θCL * ωCL^2)
        _Vc ~ Gamma(1 / ωVc^2, θVc * ωVc^2)
    end
    @pre begin
        CL = _CL
        Vc = _Vc
    end
    @dynamics begin
        Central' = -CL / Vc * Central
    end
    @derived begin
        μ := @. Central / Vc
        dv ~ @. Normal(μ, μ * σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: _CL, _Vc
  Covariates:
  Dynamical system variables: Central
  Dynamical system type: Matrix exponential
  Derived: dv
  Observed: dv

We also need some initial values for the fitting:

iparams_pk = (; θCL = 1.0, θVc = 5.0, ωCL = 0.1, ωVc = 0.1, σ = 0.2)
(θCL = 1.0,
 θVc = 5.0,
 ωCL = 0.1,
 ωVc = 0.1,
 σ = 0.2,)

We proceed by fitting both models:

fit_lognormal = fit(model_lognormal, pop, iparams_pk, FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     5.770212e+03     7.912060e+03
 * time: 0.04255986213684082
     1     9.433464e+02     6.079483e+02
 * time: 1.6726188659667969
     2     8.189627e+02     4.423725e+02
 * time: 1.6900107860565186
     3     5.917683e+02     1.819248e+02
 * time: 1.704070806503296
     4     5.421783e+02     1.121313e+02
 * time: 1.7198967933654785
     5     5.255651e+02     7.407230e+01
 * time: 1.930284023284912
     6     5.208427e+02     8.699271e+01
 * time: 1.9387547969818115
     7     5.174883e+02     8.974584e+01
 * time: 1.9469799995422363
     8     5.138523e+02     7.328235e+01
 * time: 1.9551928043365479
     9     5.109883e+02     4.155805e+01
 * time: 1.9631929397583008
    10     5.094359e+02     3.170517e+01
 * time: 1.9707908630371094
    11     5.086172e+02     3.327331e+01
 * time: 1.978173017501831
    12     5.080941e+02     2.942077e+01
 * time: 1.9855289459228516
    13     5.074009e+02     2.839941e+01
 * time: 1.9928410053253174
    14     5.059302e+02     3.330093e+01
 * time: 1.9998469352722168
    15     5.036399e+02     3.172884e+01
 * time: 2.007337808609009
    16     5.017004e+02     3.160020e+01
 * time: 2.015075922012329
    17     5.008553e+02     2.599524e+01
 * time: 2.0228159427642822
    18     5.005913e+02     2.139314e+01
 * time: 2.0301828384399414
    19     5.003573e+02     2.134778e+01
 * time: 2.037317991256714
    20     4.997249e+02     2.069868e+01
 * time: 2.0446507930755615
    21     4.984453e+02     1.859010e+01
 * time: 2.052060842514038
    22     4.959584e+02     2.156209e+01
 * time: 2.0598130226135254
    23     4.923347e+02     3.030833e+01
 * time: 2.0679688453674316
    24     4.906916e+02     1.652278e+01
 * time: 2.076528787612915
    25     4.902955e+02     6.360800e+00
 * time: 2.0845508575439453
    26     4.902870e+02     7.028603e+00
 * time: 2.0929059982299805
    27     4.902193e+02     1.176895e+00
 * time: 2.100909948348999
    28     4.902189e+02     1.170642e+00
 * time: 2.1075098514556885
    29     4.902186e+02     1.167624e+00
 * time: 2.113312005996704
    30     4.902145e+02     1.110377e+00
 * time: 2.1202118396759033
    31     4.902079e+02     1.010507e+00
 * time: 2.1284329891204834
    32     4.901917e+02     9.619218e-01
 * time: 2.1368207931518555
    33     4.901683e+02     1.001006e+00
 * time: 2.1444239616394043
    34     4.901473e+02     6.138233e-01
 * time: 2.1528759002685547
    35     4.901412e+02     1.754342e-01
 * time: 2.161327838897705
    36     4.901406e+02     2.617009e-02
 * time: 2.1691079139709473
    37     4.901405e+02     4.585882e-03
 * time: 2.1760528087615967
    38     4.901405e+02     7.668184e-04
 * time: 2.1821129322052
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                             32

Observation records:         Active        Missing
    dv:                         256              0
    Total:                      256              0

Number of parameters:      Constant      Optimized
                                  0              5

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                   -490.14052

---------------
      Estimate
---------------
θCL    0.16025
θVc   10.262
ωCL    0.23505
ωVc    0.10449
σ      0.3582
---------------
fit_gamma = fit(model_gamma, pop, iparams_pk, FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     4.160707e+03     5.960578e+03
 * time: 1.4066696166992188e-5
     1     9.056373e+02     5.541920e+02
 * time: 0.46532106399536133
     2     7.787963e+02     3.988904e+02
 * time: 0.48525500297546387
     3     5.915054e+02     1.495777e+02
 * time: 0.50142502784729
     4     5.533120e+02     8.826583e+01
 * time: 0.5167689323425293
     5     5.389239e+02     9.144086e+01
 * time: 0.5327789783477783
     6     5.323499e+02     1.000933e+02
 * time: 0.548583984375
     7     5.270252e+02     8.423844e+01
 * time: 0.563971996307373
     8     5.233813e+02     5.194402e+01
 * time: 0.7138919830322266
     9     5.213366e+02     3.461331e+01
 * time: 0.7270519733428955
    10     5.200972e+02     3.888113e+01
 * time: 0.7404310703277588
    11     5.191933e+02     3.556605e+01
 * time: 0.7536969184875488
    12     5.181335e+02     3.624436e+01
 * time: 0.7673430442810059
    13     5.161626e+02     4.322775e+01
 * time: 0.7811720371246338
    14     5.133202e+02     3.722515e+01
 * time: 0.7944331169128418
    15     5.107758e+02     3.401586e+01
 * time: 0.8077120780944824
    16     5.095157e+02     2.854997e+01
 * time: 0.8210570812225342
    17     5.090165e+02     2.644560e+01
 * time: 0.8337869644165039
    18     5.085184e+02     2.744429e+01
 * time: 0.8461220264434814
    19     5.074309e+02     2.793918e+01
 * time: 0.8582761287689209
    20     5.053757e+02     2.616169e+01
 * time: 0.8707659244537354
    21     5.018507e+02     2.257667e+01
 * time: 0.8837020397186279
    22     4.942495e+02     3.832878e+01
 * time: 0.898137092590332
    23     4.940229e+02     5.518159e+01
 * time: 0.9170401096343994
    24     4.909110e+02     3.042064e+01
 * time: 0.9338760375976562
    25     4.900234e+02     6.929306e+00
 * time: 0.9491939544677734
    26     4.897974e+02     1.087865e+00
 * time: 0.9645140171051025
    27     4.897942e+02     6.456402e-01
 * time: 0.9783971309661865
    28     4.897940e+02     6.467689e-01
 * time: 0.9914019107818604
    29     4.897939e+02     6.463480e-01
 * time: 1.004384994506836
    30     4.897935e+02     6.408914e-01
 * time: 1.0175809860229492
    31     4.897924e+02     6.208208e-01
 * time: 1.0310909748077393
    32     4.897900e+02     1.035462e+00
 * time: 1.044814109802246
    33     4.897850e+02     1.452099e+00
 * time: 1.058650016784668
    34     4.897776e+02     1.482593e+00
 * time: 1.0727510452270508
    35     4.897718e+02     8.420646e-01
 * time: 1.0885009765625
    36     4.897702e+02     2.023876e-01
 * time: 1.1043369770050049
    37     4.897700e+02     1.885486e-02
 * time: 1.119023084640503
    38     4.897700e+02     2.343932e-03
 * time: 1.1324570178985596
    39     4.897700e+02     4.417566e-04
 * time: 1.1446459293365479
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    dv:                         256              0
    Total:                      256              0

Number of parameters:      Constant      Optimized
                                  0              5

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                   -489.77002

---------------
      Estimate
---------------
θCL    0.16466
θVc   10.329
ωCL    0.23348
ωVc    0.10661
σ      0.35767
---------------

Finally, let’s compare the estimates:

compare_estimates(; lognormal = fit_lognormal, gamma = fit_gamma)
5×3 DataFrame
 Row │ parameter  lognormal  gamma
     │ String     Float64?   Float64?
─────┼─────────────────────────────────
   1 │ θCL         0.160253   0.164658
   2 │ θVc        10.2617    10.3288
   3 │ ωCL         0.235046   0.233484
   4 │ ωVc         0.10449    0.106611
   5 │ σ           0.358205   0.357667

As mentioned above, the mean of a log-normal is \(\exp \left\{ \mu + \frac{\sigma^2}{2} \right\}\).

So let’s compare that with the gamma typical values:

DataFrame(;
    parameter = ["θCL", "θVc"],
    θLogNormal = [coef(fit_lognormal).θCL, coef(fit_lognormal).θVc],
    ELogNormal = [
        exp(log(coef(fit_lognormal).θCL) + coef(fit_lognormal).ωCL^2 / 2),
        exp(log(coef(fit_lognormal).θVc) + coef(fit_lognormal).ωVc^2 / 2),
    ],
    θGamma = [coef(fit_gamma).θCL, coef(fit_gamma).θVc],
)
2×4 DataFrame
 Row │ parameter  θLogNormal  ELogNormal  θGamma
     │ String     Float64     Float64     Float64
─────┼─────────────────────────────────────────────
   1 │ θCL          0.160253    0.164741   0.164658
   2 │ θVc         10.2617     10.3178    10.3288

As you can see, the Gaussian (log-normal) model shows a slight bias in the estimation of both θCL and θVc.
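
The size of that bias is driven by the factor \(\exp(\omega^2 / 2)\) from the log-normal mean formula. A quick computation from the fitted coefficients (a small sketch reusing coef as above) shows it stays below 3% here because the estimated \(\omega\) values are small:

# relative bias E[CL] / θCL - 1 = exp(ωCL^2 / 2) - 1 implied by the log-normal parametrization
(
    CL_bias = exp(coef(fit_lognormal).ωCL^2 / 2) - 1,
    Vc_bias = exp(coef(fit_lognormal).ωVc^2 / 2) - 1,
)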

Let’s also plot the two fitted probability density functions (PDFs) for CL:

plotdataPK = @chain DataFrame(; x = range(0, 0.5; length = 1_000)) begin
    @rtransform begin
        :LogNormal =
            pdf(LogNormal(log(coef(fit_lognormal).θCL), coef(fit_lognormal).ωCL), :x)
        :Gamma = pdf(
            Gamma(1 / coef(fit_gamma).ωCL^2, coef(fit_gamma).θCL * coef(fit_gamma).ωCL^2),
            :x,
        )
    end
end
first(plotdataPK, 5)
5×3 DataFrame
 Row │ x            LogNormal     Gamma
     │ Float64      Float64       Float64
─────┼──────────────────────────────────────────
   1 │ 0.0           0.0           0.0
   2 │ 0.000500501   5.27258e-128  5.2502e-131
   3 │ 0.001001      9.25648e-99   3.24235e-101
   4 │ 0.0015015     2.10132e-83   1.45529e-85
   5 │ 0.002002      2.71651e-73   2.9785e-75
data(stack(plotdataPK, [:LogNormal, :Gamma])) *
mapping(:x, :value; color = :variable) *
visual(Lines) |> draw

4.2 Bioavailability Parallel Absorption Simulation

This is a parallel absorption model with bioavailability in both the “fast” and the “slow” depots.

First, the traditional approach with a logistic transformation of a Gaussian random variable. This makes the individual relative bioavailability logit-normally distributed.

model_logitnormal = @model begin
    @param begin
        θkaFast ∈ RealDomain(; lower = 0)
        θkaSlow ∈ RealDomain(; lower = 0)
        θCL ∈ RealDomain(; lower = 0)
        θVc ∈ RealDomain(; lower = 0)
        θbioav ∈ RealDomain(; lower = 0.0, upper = 1.0)

        ωCL ∈ RealDomain(; lower = 0)
        ωVc ∈ RealDomain(; lower = 0)
        # I call this one ξ to distinguish it from ω since the interpretation is NOT a relative error (coefficient of variation)
        ξbioav ∈ RealDomain(; lower = 0, init = 0.1)

        σ ∈ RealDomain(; lower = 0)
    end
    @random begin
        _CL ~ Gamma(1 / ωCL^2, θCL * ωCL^2)
        _Vc ~ Gamma(1 / ωVc^2, θVc * ωVc^2)
        # define the latent Gaussian random effect. Notice the logit transform
        ηbioavLogit ~ Normal(logit(θbioav), ξbioav)
    end
    @pre begin
        kaFast = θkaFast
        kaSlow = θkaSlow
        CL = _CL
        Vc = _Vc
    end
    @dosecontrol begin
        # _bioav is LogitNormal distributed
        _bioav = logistic(ηbioavLogit)
        bioav = (; DepotFast = _bioav, DepotSlow = 1 - _bioav)
    end
    @dynamics begin
        DepotFast' = -kaFast * DepotFast
        DepotSlow' = -kaSlow * DepotSlow
        Central' = kaFast * DepotFast + kaSlow * DepotSlow - CL / Vc * Central
    end
    @derived begin
        μ := @. Central / Vc
        dv ~ @. Normal(μ, abs(μ) * σ)
    end
end
PumasModel
  Parameters: θkaFast, θkaSlow, θCL, θVc, θbioav, ωCL, ωVc, ξbioav, σ
  Random effects: _CL, _Vc, ηbioavLogit
  Covariates:
  Dynamical system variables: DepotFast, DepotSlow, Central
  Dynamical system type: Matrix exponential
  Derived: dv
  Observed: dv

Now the same model, but with non-Gaussian random effects, using a beta distribution instead of the logit parameterization of the Gaussian distribution:

model_beta = @model begin
    @param begin
        θkaFast ∈ RealDomain(; lower = 0)
        θkaSlow ∈ RealDomain(; lower = 0)
        θCL ∈ RealDomain(; lower = 0)
        θVc ∈ RealDomain(; lower = 0)
        θbioav ∈ RealDomain(; lower = 0.0, upper = 1.0)

        ωCL ∈ RealDomain(; lower = 0)
        ωVc ∈ RealDomain(; lower = 0)
        # We call this one n since the interpretation is like the length of a Binomial distribution
        nbioav ∈ RealDomain(; lower = 0, init = 10)

        σ ∈ RealDomain(; lower = 0)
    end
    @random begin
        ηCL ~ Gamma(1 / ωCL^2, θCL * ωCL^2)
        ηVc ~ Gamma(1 / ωVc^2, θVc * ωVc^2)
        # This makes E(ηbioav) = θbioav
        # See https://en.wikipedia.org/wiki/Beta_distribution
        ηbioav ~ Beta(θbioav * nbioav, (1 - θbioav) * nbioav)
    end
    @pre begin
        kaFast = θkaFast
        kaSlow = θkaSlow
        CL = ηCL
        Vc = ηVc
    end
    @dosecontrol begin
        bioav = (; DepotFast = ηbioav, DepotSlow = 1 - ηbioav)
    end
    @dynamics begin
        DepotFast' = -kaFast * DepotFast
        DepotSlow' = -kaSlow * DepotSlow
        Central' = kaFast * DepotFast + kaSlow * DepotSlow - CL / Vc * Central
    end
    @derived begin
        μ := @. Central / Vc
        dv ~ @. Normal(μ, abs(μ) * σ)
    end
end
PumasModel
  Parameters: θkaFast, θkaSlow, θCL, θVc, θbioav, ωCL, ωVc, nbioav, σ
  Random effects: ηCL, ηVc, ηbioav
  Covariates:
  Dynamical system variables: DepotFast, DepotSlow, Central
  Dynamical system type: Matrix exponential
  Derived: dv
  Observed: dv

We have two types of random effects here.

First, as you are already familiar with from the previous example, the clearance (CL) and central volume (Vc) have typical values (i.e. fixed effects) and between-subject variability (i.e. random effects) modelled with gamma distributions; the absorption rates (kaFast and kaSlow) only have typical values.

Second, the bioavailability is modelled with a beta distribution. Generally, the beta distribution is parametrized as:

\[\text{Beta}(\alpha, \beta)\]

where both parameters \(\alpha\) and \(\beta\) are shape parameters.

One nice thing about the beta distribution is that it only takes values between and including 0 and 1, i.e. \([0, 1]\). This makes it the perfect candidate for modelling bioavailability parameters, which are generally bounded in that interval. So, we don’t need a logistic transformation.

Another nice thing about the beta distribution is that we can use the alternative \((\mu, n)\)-parametrization, with \(\mu\) serving as a mean-value parameter:

\[\text{Beta}(\mu, n)\]

where in the original beta parametrization:

  • \(\alpha = \mu n\)
  • \(\beta = (1 - \mu) n\)

Hence, our mean is:

\[\operatorname{E}[F] = \mu = \theta_F\]

which, again, does not depend on any other parameters. The variance is

\[\operatorname{Var}(F) = \frac{\mu(1 - \mu)}{n + 1},\]

which is similar to the variance of the sample mean of \(n\) Bernoulli trials, \(\mu(1 - \mu)/n\).
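
As a quick sanity check of this parametrization (a sketch with illustrative values):

μ, n = 0.7, 40
d = Beta(μ * n, (1 - μ) * n)
# mean(d) recovers μ; the variance equals μ(1 - μ) / (n + 1)
mean(d), std(d)^2, μ * (1 - μ) / (n + 1)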

Now let’s generate some data for the simulation:

dr = DosageRegimen(
    DosageRegimen(100; cmt = :DepotFast),
    DosageRegimen(100; cmt = :DepotSlow),
)
2×10 DataFrame
 Row │ time     cmt        amt      evid  ii       addl   rate     duration  ss    route
     │ Float64  Symbol     Float64  Int8  Float64  Int64  Float64  Float64   Int8  NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────────
   1 │     0.0  DepotFast    100.0     1      0.0      0      0.0       0.0     0  NullRoute
   2 │     0.0  DepotSlow    100.0     1      0.0      0      0.0       0.0     0  NullRoute
simtimes = [0.5, 1.0, 2.0, 4.0, 8.0, 24.0]
6-element Vector{Float64}:
  0.5
  1.0
  2.0
  4.0
  8.0
 24.0
trueparam = (;
    θkaFast = 0.9,
    θkaSlow = 0.2,
    θCL = 1.1,
    θVc = 10.0,
    θbioav = 0.7,
    ωCL = 0.1,
    ωVc = 0.1,
    nbioav = 40,
    σ = 0.1,
)
(θkaFast = 0.9,
 θkaSlow = 0.2,
 θCL = 1.1,
 θVc = 10.0,
 θbioav = 0.7,
 ωCL = 0.1,
 ωVc = 0.1,
 nbioav = 40,
 σ = 0.1,)

For simplicity, we just add 20% to the true values for initial values:

initparamBeta = map(t -> 1.2 * t, trueparam)
(θkaFast = 1.08,
 θkaSlow = 0.24,
 θCL = 1.32,
 θVc = 12.0,
 θbioav = 0.84,
 ωCL = 0.12,
 ωVc = 0.12,
 nbioav = 48.0,
 σ = 0.12,)

The initial values for the LogitNormal need to have ξbioav defined instead of nbioav:

initparamLogitNormal =
    (Base.structdiff(initparamBeta, NamedTuple{(:nbioav,)})..., ξbioav = 0.1)
(θkaFast = 1.08,
 θkaSlow = 0.24,
 θCL = 1.32,
 θVc = 12.0,
 θbioav = 0.84,
 ωCL = 0.12,
 ωVc = 0.12,
 σ = 0.12,
 ξbioav = 0.1,)

Set up empty Subjects with the dose information:

skeletonpop = map(i -> Subject(; id = i, events = dr), 1:40)
Population
  Subjects: 40
  Observations: 

Next, we simulate the data (while setting the seed for reproducibility):

Random.seed!(128)
simpop = Subject.(simobs(model_beta, skeletonpop, trueparam; obstimes = simtimes))

Finally, let’s fit both models:

fit_logitnormal = fit(model_logitnormal, simpop, initparamLogitNormal, FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     2.512964e+02     3.650794e+02
 * time: 3.790855407714844e-5
     1     2.005689e+02     1.252637e+02
 * time: 1.2691960334777832
     2     1.875728e+02     4.225518e+01
 * time: 1.4516410827636719
     3     1.863803e+02     3.429785e+01
 * time: 1.6247129440307617
     4     1.845581e+02     3.694230e+01
 * time: 1.7739169597625732
     5     1.828416e+02     1.170915e+01
 * time: 1.921889066696167
     6     1.823277e+02     1.004017e+01
 * time: 2.0685770511627197
     7     1.810629e+02     5.326780e+00
 * time: 2.221904993057251
     8     1.810479e+02     1.767987e+00
 * time: 2.37589693069458
     9     1.810272e+02     3.852037e+00
 * time: 2.526179075241089
    10     1.809414e+02     1.196625e+01
 * time: 2.775463104248047
    11     1.806489e+02     1.861441e+01
 * time: 2.923140048980713
    12     1.801984e+02     1.361805e+01
 * time: 3.2822489738464355
    13     1.800427e+02     2.176038e+01
 * time: 3.43080997467041
    14     1.796556e+02     7.078592e+00
 * time: 3.579172134399414
    15     1.795833e+02     1.583768e+01
 * time: 3.7298200130462646
    16     1.795221e+02     6.548930e+00
 * time: 3.8833200931549072
    17     1.794623e+02     6.608680e+00
 * time: 4.034837007522583
    18     1.793643e+02     6.618623e+00
 * time: 4.237936973571777
    19     1.793502e+02     1.024896e+01
 * time: 4.387614965438843
    20     1.793177e+02     1.200354e+01
 * time: 4.535366058349609
    21     1.792426e+02     1.619402e+01
 * time: 4.684567928314209
    22     1.791563e+02     1.125775e+01
 * time: 4.841470956802368
    23     1.790992e+02     7.616758e+00
 * time: 4.995755910873413
    24     1.790757e+02     8.625107e+00
 * time: 5.149465084075928
    25     1.790635e+02     4.847543e+00
 * time: 5.327893018722534
    26     1.790412e+02     5.835623e+00
 * time: 5.478737115859985
    27     1.789731e+02     1.119532e+01
 * time: 5.6313700675964355
    28     1.789079e+02     1.140511e+01
 * time: 5.783903121948242
    29     1.788320e+02     7.103608e+00
 * time: 5.9372289180755615
    30     1.787894e+02     5.550526e+00
 * time: 6.092466115951538
    31     1.787809e+02     7.408701e+00
 * time: 6.24617600440979
    32     1.787673e+02     8.662710e-01
 * time: 6.41758394241333
    33     1.787660e+02     5.235721e-01
 * time: 6.569940090179443
    34     1.787656e+02     3.711878e-01
 * time: 6.719027042388916
    35     1.787640e+02     8.611814e-01
 * time: 6.869327068328857
    36     1.787608e+02     2.111949e+00
 * time: 7.019320011138916
    37     1.787530e+02     3.936251e+00
 * time: 7.183906078338623
    38     1.787361e+02     5.994893e+00
 * time: 7.333142995834351
    39     1.787043e+02     7.164098e+00
 * time: 7.482250928878784
    40     1.786607e+02     5.650144e+00
 * time: 7.631515026092529
    41     1.786284e+02     1.894970e+00
 * time: 7.792985916137695
    42     1.786185e+02     4.455262e-01
 * time: 7.942742109298706
    43     1.786154e+02     1.194909e+00
 * time: 8.099092960357666
    44     1.786126e+02     1.312730e+00
 * time: 8.253530025482178
    45     1.786100e+02     8.172273e-01
 * time: 8.437880039215088
    46     1.786086e+02     1.453543e-01
 * time: 8.582926988601685
    47     1.786082e+02     1.853559e-01
 * time: 8.728703022003174
    48     1.786080e+02     2.747125e-01
 * time: 8.873130083084106
    49     1.786078e+02     1.989652e-01
 * time: 9.0138521194458
    50     1.786077e+02     5.864785e-02
 * time: 9.170355081558228
    51     1.786076e+02     3.327371e-02
 * time: 9.310466051101685
    52     1.786076e+02     5.696473e-02
 * time: 9.455011129379272
    53     1.786076e+02     4.644636e-02
 * time: 9.59791898727417
    54     1.786076e+02     1.344483e-02
 * time: 9.753396034240723
    55     1.786076e+02     5.947516e-03
 * time: 9.896965980529785
    56     1.786076e+02     1.353345e-02
 * time: 10.037405967712402
    57     1.786076e+02     1.008868e-02
 * time: 10.176701068878174
    58     1.786076e+02     3.361428e-03
 * time: 10.331032037734985
    59     1.786076e+02     1.523993e-03
 * time: 10.470469951629639
    60     1.786076e+02     2.928621e-03
 * time: 10.608613967895508
    61     1.786076e+02     2.928621e-03
 * time: 10.790230989456177
    62     1.786076e+02     4.474249e-03
 * time: 11.009462118148804
    63     1.786076e+02     5.070930e-03
 * time: 11.150094032287598
    64     1.786076e+02     5.122940e-03
 * time: 11.291050910949707
    65     1.786076e+02     5.122940e-03
 * time: 11.463757038116455
    66     1.786076e+02     5.121982e-03
 * time: 11.63953709602356
    67     1.786076e+02     5.121982e-03
 * time: 11.81610894203186
    68     1.786076e+02     5.121982e-03
 * time: 11.994873046875
    69     1.786076e+02     5.121982e-03
 * time: 12.19129490852356
    70     1.786076e+02     5.121982e-03
 * time: 12.383546113967896
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             40

Observation records:         Active        Missing
    dv:                         240              0
    Total:                      240              0

Number of parameters:      Constant      Optimized
                                  0              9

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:              NoObjectiveChange
Log-likelihood value:                   -178.60759

--------------------
          Estimate
--------------------
θkaFast   0.91098
θkaSlow   0.13112
θCL       1.0854
θVc       7.1008
θbioav    0.4802
ωCL       0.088113
ωVc       0.12133
ξbioav    3.8363e-6
σ         0.10545
--------------------
fit_beta = fit(model_beta, simpop, initparamBeta, FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     2.523111e+02     3.644346e+02
 * time: 2.193450927734375e-5
     1     2.014577e+02     1.265001e+02
 * time: 0.7854571342468262
     2     1.880885e+02     4.190708e+01
 * time: 1.0873100757598877
     3     1.870317e+02     8.825666e+01
 * time: 1.2381060123443604
     4     1.846027e+02     4.400156e+01
 * time: 1.4058871269226074
     5     1.834445e+02     1.906624e+01
 * time: 1.5553081035614014
     6     1.828599e+02     1.113882e+01
 * time: 1.7026090621948242
     7     1.815719e+02     7.449353e+00
 * time: 1.8533010482788086
     8     1.815131e+02     2.164677e+00
 * time: 2.001681089401245
     9     1.814896e+02     2.167318e+00
 * time: 2.201383113861084
    10     1.814458e+02     4.615740e+00
 * time: 2.351012945175171
    11     1.813173e+02     9.576966e+00
 * time: 2.5000901222229004
    12     1.809756e+02     2.052080e+01
 * time: 2.6516220569610596
    13     1.807625e+02     4.553350e+01
 * time: 2.8059980869293213
    14     1.802224e+02     6.550827e+00
 * time: 2.954967975616455
    15     1.800862e+02     2.865669e+00
 * time: 3.119112014770508
    16     1.800780e+02     1.164666e+00
 * time: 3.265622138977051
    17     1.800737e+02     7.952193e-01
 * time: 3.414073944091797
    18     1.800352e+02     4.860366e+00
 * time: 3.5655300617218018
    19     1.800089e+02     5.177171e+00
 * time: 3.7145321369171143
    20     1.799679e+02     4.304716e+00
 * time: 3.8807871341705322
    21     1.799153e+02     4.611393e+00
 * time: 4.032889127731323
    22     1.798423e+02     1.209482e+01
 * time: 4.180021047592163
    23     1.796822e+02     1.712120e+01
 * time: 4.331670045852661
    24     1.794274e+02     1.433687e+01
 * time: 4.492656946182251
    25     1.793768e+02     4.105433e+00
 * time: 4.6355249881744385
    26     1.793486e+02     1.806678e+00
 * time: 4.799288034439087
    27     1.793329e+02     5.459074e+00
 * time: 4.937288999557495
    28     1.793233e+02     2.889752e+00
 * time: 5.08840012550354
    29     1.793117e+02     1.420025e+00
 * time: 5.222069025039673
    30     1.792878e+02     4.116831e+00
 * time: 5.361823081970215
    31     1.792597e+02     4.587919e+00
 * time: 5.504231929779053
    32     1.792384e+02     4.248794e+00
 * time: 5.6646950244903564
    33     1.792250e+02     4.410458e+00
 * time: 5.810955047607422
    34     1.792025e+02     3.186755e+00
 * time: 5.959135055541992
    35     1.791840e+02     3.455052e+00
 * time: 6.106796979904175
    36     1.791706e+02     1.601035e+00
 * time: 6.257178068161011
    37     1.791555e+02     3.485560e+00
 * time: 6.415271043777466
    38     1.791298e+02     5.624719e+00
 * time: 6.560909032821655
    39     1.790705e+02     1.007188e+01
 * time: 6.706924915313721
    40     1.789886e+02     1.153111e+01
 * time: 6.853657960891724
    41     1.788564e+02     1.256040e+01
 * time: 7.014631986618042
    42     1.787385e+02     5.450112e+00
 * time: 7.1617591381073
    43     1.786624e+02     5.174549e+00
 * time: 7.3101959228515625
    44     1.786404e+02     2.969108e+00
 * time: 7.455703973770142
    45     1.786341e+02     2.776513e+00
 * time: 7.61186408996582
    46     1.786242e+02     1.847094e+00
 * time: 7.757029056549072
    47     1.786177e+02     1.331812e+00
 * time: 7.898993968963623
    48     1.786124e+02     5.074904e-01
 * time: 8.036939144134521
    49     1.786099e+02     2.928028e-01
 * time: 8.174222946166992
    50     1.786089e+02     2.292811e-01
 * time: 8.331521987915039
    51     1.786083e+02     1.602144e-01
 * time: 8.472182035446167
    52     1.786079e+02     8.960392e-02
 * time: 8.612591028213501
    53     1.786078e+02     2.250539e-02
 * time: 8.755043983459473
    54     1.786077e+02     3.750070e-02
 * time: 8.906196117401123
    55     1.786076e+02     2.547671e-02
 * time: 9.044104099273682
    56     1.786076e+02     7.939675e-03
 * time: 9.18436312675476
    57     1.786076e+02     5.044277e-03
 * time: 9.323588132858276
    58     1.786076e+02     6.897135e-03
 * time: 9.461491107940674
    59     1.786076e+02     4.595869e-03
 * time: 9.612677097320557
    60     1.786076e+02     4.342682e-03
 * time: 9.758156061172485
    61     1.786076e+02     4.342682e-03
 * time: 9.83341097831726
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             40

Observation records:         Active        Missing
    dv:                         240              0
    Total:                      240              0

Number of parameters:      Constant      Optimized
                                  0              9

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                      NoXChange
Log-likelihood value:                   -178.60759

-------------------
          Estimate
-------------------
θkaFast   0.91099
θkaSlow   0.13112
θCL       1.0854
θVc       7.1007
θbioav    0.48019
ωCL       0.088114
ωVc       0.12134
nbioav    4.3311e7
σ         0.10545
-------------------

As before, let’s compare the estimates:

compare_estimates(; logitnormal = fit_logitnormal, beta = fit_beta)
10×3 DataFrame
 Row │ parameter  logitnormal  beta
     │ String     Float64?     Float64?
─────┼─────────────────────────────────────
   1 │ θkaFast     0.910979     0.910988
   2 │ θkaSlow     0.131118     0.131117
   3 │ θCL         1.0854       1.0854
   4 │ θVc         7.10079      7.10072
   5 │ θbioav      0.4802       0.480191
   6 │ ωCL         0.0881133    0.0881138
   7 │ ωVc         0.121334     0.121335
   8 │ σ           0.105448     0.105449
   9 │ ξbioav      3.83627e-6   missing
  10 │ nbioav      missing      4.3311e7

Again, we’ll plot both PDFs from the estimated values:

plotdatabioav = @chain DataFrame(; x = range(0, 1; length = 1_000)) begin
    @rtransform begin
        :logitnormal =
            1 / coef(fit_logitnormal).ξbioav / sqrt(2π) / (:x * (1 - :x)) * exp(
                -(logit(:x) - logit(coef(fit_logitnormal).θbioav))^2 /
                (2 * coef(fit_logitnormal).ξbioav^2),
            )
        :beta = pdf(
            Beta(
                coef(fit_beta).θbioav * coef(fit_beta).nbioav,
                (1 - coef(fit_beta).θbioav) * coef(fit_beta).nbioav,
            ),
            :x,
        )
    end
end
first(plotdatabioav, 5)
5×3 DataFrame
 Row │ x         logitnormal  beta
     │ Float64   Float64      Float64
─────┼─────────────────────────────────
   1 │ 0.0           NaN          0.0
   2 │ 0.001001        0.0        0.0
   3 │ 0.002002        0.0        0.0
   4 │ 0.003003        0.0        0.0
   5 │ 0.004004        0.0        0.0
plt_pdf_bioav =
    data(stack(plotdatabioav, [:logitnormal, :beta])) *
    mapping(:x, :value; color = :variable) *
    visual(Lines);
draw(plt_pdf_bioav; axis = (; xticks = 0.1:0.1:1.0))

For this dataset, the two fitted distributions differ significantly, with the beta model producing a distribution much closer to the truth; for other realizations of the simulated data, the two can be closer to each other.
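
If you want to check that last claim yourself, you can re-run the simulate-and-fit steps with a different seed (a sketch reusing the objects defined above; the seed value is arbitrary) and redraw the PDFs:

Random.seed!(456)   # any other seed
simpop2 = Subject.(simobs(model_beta, skeletonpop, trueparam; obstimes = simtimes))
fit_logitnormal2 = fit(model_logitnormal, simpop2, initparamLogitNormal, FOCE())
fit_beta2 = fit(model_beta, simpop2, initparamBeta, FOCE())
compare_estimates(; logitnormal = fit_logitnormal2, beta = fit_beta2)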