Calculating Parameter Uncertainty

Author

Patrick Kofod Mogensen

1 Introduction

A typical workflow for fitting a Pumas model and deriving parameter precision involves:

  1. Preparing the data and the model.
  2. Checking model-data compatibility.
  3. Obtaining initial parameter estimates.
  4. Fitting the model via a chosen estimation method.
  5. Interpreting the fit results.
  6. Computing parameter precision.
  7. (Optionally) proceeding with more advanced techniques like bootstrapping or SIR for robust uncertainty quantification.

The following sections will walk through these steps using a two-compartment PK model for Warfarin, focusing on the PK aspects only. Exploratory data analysis (EDA), although extremely important, is out of scope here. Readers interested in EDA are encouraged to consult other tutorials.

2 Model and Data

2.1 Model Definition

Below is the PK model, named warfarin_pk_model, defined in Pumas. This model contains:

  • Fixed effects (population parameters): pop_CL, pop_Vc, pop_Q, pop_Vp, pop_tabs, pop_lag
  • Inter-individual variability (IIV) components: pk_Ω, lag_ω
  • Residual error model parameters: σ_prop, σ_add
  • Covariates for scaling: FSZCL and FSZV
  • Differential equations describing the PK behavior in the compartments.

using Pumas
using PharmaDatasets
using DataFramesMeta
using PumasUtilities
using StatsBase # for cov2cor
warfarin_pk_model = @model begin
    @metadata begin
        desc = "Warfarin 2-compartment PK model (PD removed)"
        timeu = u"hr"
    end

    @param begin
        # PK parameters
        """
        Clearance (L/hr)
        """
        pop_CL  RealDomain(lower = 0.0, upper = 13.4, init = 0.134)
        """
        Central volume (L)
        """
        pop_Vc  RealDomain(lower = 0.0, upper = 81.1, init = 8.11)
        """
        Inter-compartmental clearance (L/hr)
        """
        pop_Q  RealDomain(lower = 0.0, upper = 5.0, init = 0.5)
        """
        Peripheral volume (L)
        """
        pop_Vp  RealDomain(lower = 0.0, upper = 200.0, init = 20.0)
        """
        Absorption lag time (hr)
        """
        pop_tabs  RealDomain(lower = 0.0, upper = 5.23, init = 0.523)
        """
        Lag time (hr)
        """
        pop_lag  RealDomain(lower = 0.0, upper = 5.0, init = 0.1)

        # Inter-individual variability
        """
          - ΩCL: Clearance
        """
        pk_Ω  PDiagDomain([0.01])
        """
        lag_ω: Lag time
        """
        lag_ω  RealDomain(lower = 0.0, upper = 2.0, init = 0.1)

        # Residual variability
        """
        σ_prop: Proportional error
        """
        σ_prop  RealDomain(lower = 0.0, init = 0.00752)
        """
        σ_add: Additive error
        """
        σ_add  RealDomain(lower = 0.0, init = 0.0661)
    end

    @random begin
        pk_η ~ MvNormal(pk_Ω)    # mean = 0, covariance = pk_Ω
        lag_η ~ Normal(0.0, lag_ω)
    end

    @covariates begin
        """
        FSZCL: Clearance scaling factor
        """
        FSZCL
        """
        FSZV: Volume scaling factor
        """
        FSZV
    end

    @pre begin
        CL = FSZCL * pop_CL
        Vc = FSZV * pop_Vc * exp(pk_η[1])
        Q = FSZCL * pop_Q
        Vp = FSZV * pop_Vp

        tabs = pop_tabs
        Ka = log(2) / tabs
    end

    @dosecontrol begin
        lags = (Depot = pop_lag * exp(lag_η),)
    end

    @vars begin
        cp := Central / Vc
    end

    @dynamics begin
        Depot' = -Ka * Depot
        Central' =
            Ka * Depot - (CL / Vc) * Central - (Q / Vc) * Central + (Q / Vp) * Peripheral
        Peripheral' = (Q / Vc) * Central - (Q / Vp) * Peripheral
    end

    @derived begin
        """
        Concentration (ng/mL)
        """
        conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
    end
end
PumasModel
  Parameters: pop_CL, pop_Vc, pop_Q, pop_Vp, pop_tabs, pop_lag, pk_Ω, lag_ω, σ_prop, σ_add
  Random effects: pk_η, lag_η
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central, Peripheral
  Dynamical system type: Matrix exponential
  Derived: conc
  Observed: conc

2.2 Data Preparation

The Warfarin data used in this tutorial is pulled from PharmaDatasets for demonstration purposes. Note how the code reshapes and prepares the data in “wide” format for reading into Pumas. Only the conc column is treated as observations for the PK model.

warfarin_data = dataset("paganz2024/warfarin_long")

# Minor numeric adjustment to avoid duplicate time points
@. warfarin_data[[133, 135, 137, 139], :TIME] += 1e-6

# Transform the data in a single chain of operations
warfarin_data_wide = @chain warfarin_data begin
    @rsubset !contains(:ID, "#")
    @rtransform begin
        # Scaling factors
        :FSZV = :WEIGHT / 70            # volume scaling
        :FSZCL = (:WEIGHT / 70)^0.75     # clearance scaling (allometric)
        # Column name for the DV
        :DVNAME = "DV$(:DVID)"
        # Dosing indicator columns
        :CMT = ismissing(:AMOUNT) ? missing : 1
        :EVID = ismissing(:AMOUNT) ? 0 : 1
    end
    unstack(Not([:DVID, :DVNAME, :DV]), :DVNAME, :DV)
    rename!(:DV1 => :conc, :DV2 => :pca)
end
317×13 DataFrame
292 rows omitted
Row ID TIME WEIGHT AGE SEX AMOUNT FSZV FSZCL CMT EVID DV0 pca conc
String3 Float64 Float64 Int64 Int64 Float64? Float64 Float64 Int64? Int64 Float64? Float64? Float64?
1 1 0.0 66.7 50 1 100.0 0.952857 0.96443 1 1 missing missing missing
2 1 0.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 100.0 missing
3 1 24.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 49.0 9.2
4 1 36.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 32.0 8.5
5 1 48.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 26.0 6.4
6 1 72.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 22.0 4.8
7 1 96.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 28.0 3.1
8 1 120.0 66.7 50 1 missing 0.952857 0.96443 missing 0 missing 33.0 2.5
9 2 0.0 66.7 31 1 100.0 0.952857 0.96443 1 1 missing missing missing
10 2 0.0 66.7 31 1 missing 0.952857 0.96443 missing 0 missing 100.0 missing
11 2 0.5 66.7 31 1 missing 0.952857 0.96443 missing 0 missing missing 0.0
12 2 2.0 66.7 31 1 missing 0.952857 0.96443 missing 0 missing missing 8.4
13 2 3.0 66.7 31 1 missing 0.952857 0.96443 missing 0 missing missing 9.7
306 31 48.0 83.3 24 1 missing 1.19 1.13936 missing 0 missing 24.0 6.4
307 31 72.0 83.3 24 1 missing 1.19 1.13936 missing 0 missing 22.0 4.5
308 31 96.0 83.3 24 1 missing 1.19 1.13936 missing 0 missing 28.0 3.4
309 31 120.0 83.3 24 1 missing 1.19 1.13936 missing 0 missing 42.0 2.5
310 32 0.0 62.0 21 1 93.0 0.885714 0.912999 1 1 missing missing missing
311 32 0.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 100.0 missing
312 32 24.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 36.0 8.9
313 32 36.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 27.0 7.7
314 32 48.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 24.0 6.9
315 32 72.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 23.0 4.4
316 32 96.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 20.0 3.5
317 32 120.0 62.0 21 1 missing 0.885714 0.912999 missing 0 missing 22.0 2.5

2.3 Creating a Pumas Population

Below is the creation of a population object in Pumas using read_pumas. Only the conc data are treated as the observation variable:

pop = read_pumas(
    warfarin_data_wide;
    id = :ID,
    time = :TIME,
    amt = :AMOUNT,
    cmt = :CMT,
    evid = :EVID,
    covariates = [:SEX, :WEIGHT, :FSZV, :FSZCL],
    observations = [:conc],
)
Population
  Subjects: 31
  Covariates: SEX, WEIGHT, FSZV, FSZCL
  Observations: conc
Note

The same data can contain multiple endpoints or PD observations. In this tutorial, the focus is solely on PK fitting. PKPD modeling on this warfarin dataset will be introduced later.

Also note that parameter inference can be expensive; for that reason, we simplified the model for this tutorial to decrease overall runtime.

2.4 Checking Model-Data Compatibility

Before performing any fit, it is recommended to verify whether the defined model can handle the provided dataset. Pumas offers functions such as loglikelihood and findinfluential for these checks.
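
For instance, findinfluential ranks subjects by their individual contribution to the objective function, which can reveal subjects whose data dominate the fit. A minimal sketch (assuming the same four arguments as loglikelihood, with init_params supplying the initial values declared in the model):

# Rank subjects by their individual contribution to the objective function
findinfluential(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE())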

2.4.1 The loglikelihood Function

The loglikelihood function computes the log-likelihood of the model given data and parameters. In Pumas, the signature typically looks like:

loglikelihood(model, population, params, approx)

where:

  • model: The Pumas model definition (e.g., warfarin_pk_model).
  • population: A Pumas population object (e.g., pop).
  • params: A named tuple or dictionary containing parameter values.
  • approx: The approximation method to use. Common options include FOCE(), FO(), LaplaceI(), etc.

For example, one might write:

# A named tuple of parameter values
param_vals = (
    pop_CL = 0.12,
    pop_Vc = 7.3,
    pop_Q = 0.2,
    pop_Vp = 2.0,
    pop_tabs = 0.523,
    pop_lag = 0.5,
    pk_Ω = Diagonal([0.01]),
    lag_ω = 0.1,
    σ_prop = 0.00752,
    σ_add = 0.0661,
)
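
Passing these values to loglikelihood evaluates the model at this point in parameter space; a finite result confirms that the model and data are compatible:

# Evaluate the marginal log-likelihood at the chosen parameter values
loglikelihood(warfarin_pk_model, pop, param_vals, FOCE())

With this check passed, the model can be fit from the same starting values:
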
foce_fit = fit(warfarin_pk_model, pop, param_vals, FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     1.835895e+04     2.290908e+04
 * time: 0.03062605857849121
     1     7.292509e+03     1.690878e+04
 * time: 1.1808969974517822
     2     5.965156e+02     7.949923e+02
 * time: 1.2769811153411865
     3     5.688454e+02     6.737595e+02
 * time: 1.366847038269043
     4     4.573190e+02     1.371446e+02
 * time: 1.4520111083984375
     5     4.373891e+02     7.452681e+01
 * time: 1.5820810794830322
     6     4.314284e+02     6.260497e+01
 * time: 1.663438081741333
     7     4.300889e+02     7.624158e+01
 * time: 1.7445361614227295
     8     4.291775e+02     7.965160e+01
 * time: 1.8287160396575928
     9     4.275923e+02     7.228227e+01
 * time: 1.9110829830169678
    10     4.253032e+02     4.773878e+01
 * time: 1.9934639930725098
    11     4.228584e+02     4.532856e+01
 * time: 2.082880973815918
    12     4.215298e+02     3.914030e+01
 * time: 2.1735661029815674
    13     4.210396e+02     4.799680e+01
 * time: 2.291038990020752
    14     4.202779e+02     5.954420e+01
 * time: 2.3759500980377197
    15     4.187278e+02     6.710929e+01
 * time: 2.4654059410095215
    16     4.159568e+02     5.919902e+01
 * time: 2.5511491298675537
    17     4.112522e+02     2.431511e+01
 * time: 2.6377949714660645
    18     4.099062e+02     3.585627e+01
 * time: 2.725059986114502
    19     4.079242e+02     2.140592e+01
 * time: 2.8105101585388184
    20     4.077880e+02     2.086277e+01
 * time: 2.8931641578674316
    21     4.077130e+02     2.092335e+01
 * time: 2.9746530055999756
    22     4.075991e+02     2.099161e+01
 * time: 3.0756869316101074
    23     4.072771e+02     2.083122e+01
 * time: 3.1630699634552
    24     4.065853e+02     1.984999e+01
 * time: 3.251995086669922
    25     4.052522e+02     3.122539e+01
 * time: 3.339354991912842
    26     4.035823e+02     5.304491e+01
 * time: 3.4269299507141113
    27     4.023986e+02     6.508012e+01
 * time: 3.511726140975952
    28     4.017056e+02     6.447037e+01
 * time: 3.59666109085083
    29     4.011417e+02     5.749992e+01
 * time: 3.682171106338501
    30     4.004386e+02     5.186503e+01
 * time: 3.780456066131592
    31     3.987434e+02     3.579828e+01
 * time: 3.8686089515686035
    32     3.966281e+02     1.612106e+01
 * time: 3.9572060108184814
    33     3.951052e+02     8.313099e+00
 * time: 4.042083978652954
    34     3.945559e+02     7.029589e+00
 * time: 4.128461122512817
    35     3.942975e+02     5.889635e+00
 * time: 4.217126131057739
    36     3.941937e+02     6.115154e+00
 * time: 4.30329704284668
    37     3.941777e+02     6.315510e+00
 * time: 4.385453939437866
    38     3.941592e+02     6.540235e+00
 * time: 4.471365928649902
    39     3.941426e+02     6.668052e+00
 * time: 4.576354026794434
    40     3.940799e+02     6.964695e+00
 * time: 4.672075033187866
    41     3.939458e+02     7.408186e+00
 * time: 4.769268035888672
    42     3.936058e+02     1.169435e+01
 * time: 4.867789030075073
    43     3.929128e+02     1.570074e+01
 * time: 4.962630987167358
    44     3.917898e+02     1.648672e+01
 * time: 5.055016040802002
    45     3.906803e+02     1.212164e+01
 * time: 5.146291017532349
    46     3.902337e+02     5.709740e+00
 * time: 5.2394421100616455
    47     3.901807e+02     3.329299e+00
 * time: 5.348740100860596
    48     3.901766e+02     3.392606e+00
 * time: 5.449527978897095
    49     3.901747e+02     3.407238e+00
 * time: 5.548814058303833
    50     3.901711e+02     3.410919e+00
 * time: 5.657371997833252
    51     3.901627e+02     3.399260e+00
 * time: 5.762876033782959
    52     3.901408e+02     3.409710e+00
 * time: 5.867242097854614
    53     3.900855e+02     5.867207e+00
 * time: 5.971941947937012
    54     3.899510e+02     9.320886e+00
 * time: 6.068196058273315
    55     3.896752e+02     1.274900e+01
 * time: 6.163855075836182
    56     3.892579e+02     1.341768e+01
 * time: 6.293164014816284
    57     3.888174e+02     9.998268e+00
 * time: 6.412065029144287
    58     3.886853e+02     4.883105e+00
 * time: 6.536194086074829
    59     3.886619e+02     2.105634e+00
 * time: 6.65932297706604
    60     3.886589e+02     2.104546e+00
 * time: 6.782073974609375
    61     3.886585e+02     2.105201e+00
 * time: 6.893738031387329
    62     3.886576e+02     2.108794e+00
 * time: 7.002038955688477
    63     3.886560e+02     2.116212e+00
 * time: 7.10846209526062
    64     3.886517e+02     2.136333e+00
 * time: 7.214749097824097
    65     3.886411e+02     2.181259e+00
 * time: 7.348499059677124
    66     3.886122e+02     2.291304e+00
 * time: 7.4788219928741455
    67     3.885314e+02     2.731616e+00
 * time: 7.626352071762085
    68     3.882546e+02     5.313368e+00
 * time: 7.780339956283569
    69     3.880721e+02     6.435869e+00
 * time: 7.953803062438965
    70     3.877819e+02     8.003764e+00
 * time: 8.10882306098938
    71     3.875324e+02     8.835166e+00
 * time: 8.211177110671997
    72     3.867128e+02     8.354478e+00
 * time: 8.328505992889404
    73     3.859294e+02     5.009199e+00
 * time: 8.458390951156616
    74     3.856436e+02     1.046433e+01
 * time: 8.559417009353638
    75     3.853968e+02     6.781825e+00
 * time: 8.66032099723816
    76     3.853443e+02     1.126947e+00
 * time: 8.758721113204956
    77     3.853408e+02     1.024868e+00
 * time: 8.855348110198975
    78     3.853370e+02     1.022619e+00
 * time: 8.953561067581177
    79     3.853362e+02     1.007107e+00
 * time: 9.050931930541992
    80     3.853360e+02     9.949462e-01
 * time: 9.16311502456665
    81     3.853358e+02     9.900341e-01
 * time: 9.259371042251587
    82     3.853353e+02     9.834347e-01
 * time: 9.357239007949829
    83     3.853343e+02     9.749404e-01
 * time: 9.453855037689209
    84     3.853316e+02     1.280659e+00
 * time: 9.551794052124023
    85     3.853251e+02     2.177866e+00
 * time: 9.648829936981201
    86     3.853098e+02     3.380715e+00
 * time: 9.745789051055908
    87     3.852786e+02     4.629237e+00
 * time: 9.844831943511963
    88     3.852296e+02     5.201303e+00
 * time: 9.961122035980225
    89     3.851770e+02     4.139482e+00
 * time: 10.057606935501099
    90     3.851499e+02     1.363359e+00
 * time: 10.15296196937561
    91     3.851470e+02     9.175379e-02
 * time: 10.243948936462402
    92     3.851468e+02     5.049011e-02
 * time: 10.333117961883545
    93     3.851468e+02     5.279968e-02
 * time: 10.416093111038208
    94     3.851468e+02     5.294356e-02
 * time: 10.497369050979614
    95     3.851468e+02     5.294356e-02
 * time: 10.620511054992676
    96     3.851468e+02     5.291805e-02
 * time: 10.746169090270996
    97     3.851468e+02     5.288013e-02
 * time: 10.84384799003601
    98     3.851468e+02     5.264154e-02
 * time: 10.931471109390259
    99     3.851468e+02     5.264154e-02
 * time: 11.068500995635986
   100     3.851468e+02     5.174985e-02
 * time: 11.173434972763062
   101     3.851468e+02     5.183662e-02
 * time: 11.257236957550049
   102     3.851464e+02     2.351912e-01
 * time: 11.348469018936157
   103     3.851464e+02     2.719795e-01
 * time: 11.437340021133423
   104     3.851462e+02     3.202132e-01
 * time: 11.540651082992554
   105     3.851459e+02     3.357046e-01
 * time: 11.62959599494934
   106     3.851454e+02     2.922910e-01
 * time: 11.721426010131836
   107     3.851450e+02     2.316805e-01
 * time: 11.812660932540894
   108     3.851449e+02     2.848428e-01
 * time: 11.904603958129883
   109     3.851448e+02     2.793247e-01
 * time: 11.993396997451782
   110     3.851448e+02     2.687629e-01
 * time: 12.081670999526978
   111     3.851448e+02     2.437732e-01
 * time: 12.168030023574829
   112     3.851447e+02     1.976272e-01
 * time: 12.258527040481567
   113     3.851446e+02     1.155016e-01
 * time: 12.364496946334839
   114     3.851444e+02     4.201230e-02
 * time: 12.455963134765625
   115     3.851443e+02     3.192048e-02
 * time: 12.545577049255371
   116     3.851443e+02     1.834850e-02
 * time: 12.632020950317383
   117     3.851443e+02     3.016608e-03
 * time: 12.716598987579346
   118     3.851443e+02     2.730012e-03
 * time: 12.802331924438477
   119     3.851443e+02     2.730012e-03
 * time: 12.9360990524292
   120     3.851443e+02     2.730012e-03
 * time: 13.074465990066528
   121     3.851443e+02     2.730012e-03
 * time: 13.225244998931885
   122     3.851443e+02     2.730012e-03
 * time: 13.39491605758667
   123     3.851443e+02     2.730012e-03
 * time: 13.535709142684937
   124     3.851443e+02     2.730012e-03
 * time: 13.616513967514038
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -385.14428
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  0             10
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

-----------------------
             Estimate
-----------------------
pop_CL        0.12366
pop_Vc        7.3031
pop_Q         0.12275
pop_Vp        1.0977
pop_tabs      0.53254
pop_lag       0.9608
pk_Ω₁,₁       0.028906
lag_ω         0.41871
σ_prop        0.10049
σ_add         0.83524
-----------------------

2.5 Computing Parameter Precision with infer

The infer function in Pumas estimates the uncertainty (precision) of parameter estimates. Depending on the chosen method, infer can provide standard errors, confidence intervals, and correlation matrices.

The signature for infer often looks like:

infer(
    fpm::FittedPumasModel;
    level = 0.95,
    rethrow_error::Bool = false,
    sandwich_estimator::Bool = true,
)

where:

  • fpm::FittedPumasModel: The result of fit (e.g., foce_fit).
  • level: The confidence interval level (e.g., 0.95). The confidence intervals are calculated at the (1-level)/2 and (1+level)/2 quantiles of the estimated parameters.
  • rethrow_error: If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.
  • sandwich_estimator: Whether to use the sandwich estimator. If set to true (the default value), the sandwich estimator will be used. If set to false, the standard error will be calculated using the inverse of the Hessian matrix.

An example usage:

inference_results = infer(foce_fit; level = 0.95)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -385.14428
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  0             10
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

---------------------------------------------------------------------
            Estimate           SE                      95.0% C.I.
---------------------------------------------------------------------
pop_CL       0.12366         0.0048503        [ 0.11415  ; 0.13316 ]
pop_Vc       7.3031          0.54808          [ 6.2289   ; 8.3773  ]
pop_Q        0.12275         0.067804         [-0.010138 ; 0.25565 ]
pop_Vp       1.0977          0.44969          [ 0.21629  ; 1.979   ]
pop_tabs     0.53254         0.10221          [ 0.33222  ; 0.73287 ]
pop_lag      0.9608          0.17529          [ 0.61725  ; 1.3044  ]
pk_Ω₁,₁      0.028906        0.015478         [-0.0014308; 0.059243]
lag_ω        0.41871         0.099704         [ 0.22329  ; 0.61412 ]
σ_prop       0.10049         0.021933         [ 0.057507 ; 0.14348 ]
σ_add        0.83524         0.1227           [ 0.59475  ; 1.0757  ]
---------------------------------------------------------------------

This is the usual asymptotic variance-covariance estimator, which we already saw in previous tutorials.

To get a matrix representation of it, use vcov:

vcov(inference_results)
10×10 Symmetric{Float64, Matrix{Float64}}:
  2.35251e-5   -0.000102312   0.000100713  …   5.45947e-5    0.000209148
 -0.000102312   0.300394     -0.0296822       -0.00550472   -0.00691567
  0.000100713  -0.0296822     0.00459734       0.000836642   0.000296097
  0.000361104  -0.218684      0.024185         0.00365746    0.0173442
  6.23038e-5   -0.0140864     0.00208861       0.000920447  -0.00103078
  0.000195498   0.0200596    -0.000272529  …   0.000427188  -0.00241991
 -3.33791e-5    0.00126169   -0.000153034     -0.000220081  -0.000118974
  1.95124e-6   -0.00706462    0.00086209       0.000508345  -0.00134152
  5.45947e-5   -0.00550472    0.000836642      0.000481042  -1.90123e-5
  0.000209148  -0.00691567    0.000296097     -1.90123e-5    0.0150553
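
The standard errors reported by infer are simply the square roots of the diagonal of this matrix, so they can be recovered manually as a consistency check (a small sketch using diag from the standard library LinearAlgebra):

using LinearAlgebra

# Standard errors are the square roots of the variances on the diagonal
sqrt.(diag(vcov(inference_results)))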

To get the condition number of the correlation matrix implied by vcov, use:

cond(inference_results)
79.08840258329506

Some users request the condition number of the covariance matrix itself; even though this use is often misguided, it can be calculated as well:

cond(inference_results; correlation = false)
51438.04487227657

It is also possible to calculate the correlation matrix from the vcov output using the cov2cor function from StatsBase:

cor_from_cov = cov2cor(Matrix(vcov(inference_results)))
10×10 Matrix{Float64}:
  1.0         -0.038487   0.306244    0.16556    …   0.513208     0.351433
 -0.038487     1.0       -0.798725   -0.88728       -0.457929    -0.102836
  0.306244    -0.798725   1.0         0.7932         0.562593     0.0355906
  0.16556     -0.88728    0.7932      1.0            0.370832     0.31434
  0.125679    -0.251459   0.301382    0.163542       0.410601    -0.0821928
  0.229949     0.2088    -0.0229304  -0.239941   …   0.111117    -0.112514
 -0.444614     0.148724  -0.145817   -0.020102      -0.648285    -0.0626442
  0.00403489  -0.129279   0.127522    0.0252609      0.232462    -0.109657
  0.513208    -0.457929   0.562593    0.370832       1.0         -0.00706475
  0.351433    -0.102836   0.0355906   0.31434       -0.00706475   1.0

And we see that the cond call above matches the condition number of the correlation matrix:

cond(cor_from_cov)
79.08840258329506

2.5.1 Failure of the asymptotic variance-covariance matrix

It is well-known that the asymptotic variance-covariance matrix can sometimes fail to compute. This can happen for a variety of reasons including:

  1. There are parameters very close to a bound (often 0)
  2. The parameter vector does not represent a local minimum (optimization failed)
  3. The parameter vector does represent a local minimum but it’s not the global solution

The first one is often easy to check. The problematic parameters are typically ones that have lower or upper bounds set. Often this will be a variance or standard deviation that has moved very close to its lower boundary. If the problematic parameter is a variance, consider removing the associated random effect; if a combined additive and proportional error model is used and one of the standard deviations is close to zero, consider simplifying the error model.

It is also possible that the parameters do not represent a local minimum. In other words, they come from a failed fit. In this case, it can often be hard to perform the associated calculations in a stable way, but more importantly, the results would not be interpretable even if they could be calculated. The formulas are only valid for parameters that represent the actual (maximum likelihood) estimates. In this case, try restarting the optimization from different starting points.

If you have reason to believe that the model should indeed include the combined error model, or that the random effect should be present, it is also possible that the model converged to a local minimum that is not the global one. If the optimization moved into a region of the parameter space that is hard to get out of, you will often have to restart the fit at different parameter values. It is not possible to verify that a minimum is global in general, but it is always advisable to try more than one set of initial parameters when fitting models.
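
A minimal way to act on this advice is to refit from a different starting point and compare the two optima; the sketch below (reusing the initial values declared in the model via init_params) is one such check:

# Refit from the initial values declared in the model
alt_fit = fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE())

# If the two objective values differ materially, at least one of the fits
# is a local rather than global minimum
(loglikelihood(alt_fit), loglikelihood(foce_fit))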

2.5.2 Bootstrap

Sometimes it is appropriate to use a different method to estimate the uncertainty of the estimated parameters. Bootstrapping is a very popular approach that is simple but often comes at quite a significant computational cost. Researchers often perform a bootstrapping step if their computational budget allows it or if the asymptotic variance-covariance estimator fails. Bootstrapping is advantageous because it does not rely on the invertibility of any matrices, and it cannot produce negative variance confidence intervals because the resampled estimator respects the bounds of the model.

The signature for bootstrapping in infer looks as follows.

infer(fpm::FittedPumasModel, bts::Bootstrap; level = 0.95)

This signature alone does not help much without also looking at the interface for Bootstrap itself.

Bootstrap(;
    rng = Random.default_rng,
    samples = 200,
    stratify_by = nothing,
    ensemblealg = EnsembleThreads(),
)

Bootstrap accepts a random number generator (rng), the number of resampled datasets to produce (samples), an optional specification of covariates to stratify the sampling by (stratify_by), and finally the ensemble algorithm that controls parallelization across fits (ensemblealg). On the JuliaHub platform, this can be combined with distributed computing to perform many resampled estimations in a short time.

bootstrap_results = infer(foce_fit, Bootstrap(samples = 50); level = 0.95)
┌ Warning: Terminated early due to NaN in gradient.
└ @ Optim ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Optim/ZhuZN/src/multivariate/optimize/optimize.jl:98
┌ Info: Bootstrap inference finished.
│   Total resampled fits = 50
│   Success rate = 1.0
└   Unique resampled populations = 50
Bootstrap inference results

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -385.14428
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  0             10
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

------------------------------------------------------------------
            Estimate          SE                     95.0% C.I.
------------------------------------------------------------------
pop_CL       0.12366        0.010125        [0.11106   ; 0.13797]
pop_Vc       7.3031         2.8853          [0.73301   ; 8.5659 ]
pop_Q        0.12275        0.30849         [0.049315  ; 0.89045]
pop_Vp       1.0977         6.8475          [0.4313    ; 7.0928 ]
pop_tabs     0.53254        1.9309          [0.047358  ; 5.23   ]
pop_lag      0.9608         0.37605         [0.3939    ; 1.6459 ]
pk_Ω₁,₁      0.028906       0.31858         [0.00091873; 0.60072]
lag_ω        0.41871        0.47598         [7.4566e-8 ; 1.9434 ]
σ_prop       0.10049        0.045366        [0.046031  ; 0.21184]
σ_add        0.83524        0.15362         [0.44253   ; 1.0066 ]
------------------------------------------------------------------
Successful fits: 50 out of 50
Unique resampled populations: 50
No stratification.

Again, we can calculate a covariance matrix based on the samples with vcov:

vcov(bootstrap_results)
10×10 Matrix{Float64}:
  0.000102507  -0.00586239   0.00077559  …   0.000138881  -0.000342387
 -0.00586239    8.32485     -0.780348       -0.073393      0.205427
  0.00077559   -0.780348     0.0951677       0.00853684   -0.0223975
 -0.0534076    -3.56144      0.369428        0.00458453   -0.0593177
  0.00441233   -5.10107      0.467024        0.0361843    -0.125019
  0.000679858   0.0268326   -0.0033733   …  -0.00293692   -0.00621323
  0.000101106  -0.445253     0.0398041       0.000352758  -0.00750646
  0.00113881   -0.199771     0.0292119       0.00384677   -0.0310601
  0.000138881  -0.073393     0.00853684      0.00205807   -0.00407079
 -0.000342387   0.205427    -0.0223975      -0.00407079    0.0235992

and we can even get a DataFrame that includes all the estimated parameters from the resampled population fits:

DataFrame(bootstrap_results.vcov)
50×10 DataFrame
25 rows omitted
Row pop_CL pop_Vc pop_Q pop_Vp pop_tabs pop_lag pk_Ω₁,₁ lag_ω σ_prop σ_add
Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64
1 0.120557 6.4577 0.199234 2.04539 0.444819 0.865639 0.0641857 0.269349 0.0639825 1.07727
2 0.127326 7.61269 0.0765175 0.828385 0.413194 1.02171 0.0319049 0.327721 0.0780306 0.894551
3 0.125148 5.79 0.380711 2.62787 1.22524 0.381494 0.0282837 0.798105 0.0958695 0.867472
4 0.127667 6.91563 0.397515 1.73248 0.728582 0.80891 5.49156e-9 3.87514e-14 0.194506 0.711823
5 0.139487 1.26349 0.661383 6.27044 5.22998 1.69134 0.298052 2.0 0.119511 0.483218
6 0.121249 7.37552 0.0943698 1.34029 0.692588 0.675815 0.0228526 0.487139 0.0798571 0.897624
7 0.127719 0.780702 1.20801 7.22525 3.58378 1.02986 0.166685 0.308838 0.216874 0.60563
8 0.124109 7.29044 0.220874 1.03724 0.944841 0.792992 0.0447704 0.500921 0.0955452 0.752949
9 0.124383 7.43103 0.16777 1.04022 0.512051 0.747488 0.0111888 0.350478 0.150554 0.660052
10 0.118758 7.52689 0.0988427 0.760195 0.932482 0.567801 0.049224 0.609705 0.0359675 0.659254
11 0.119892 6.64343 0.196923 1.52487 0.649178 0.68864 0.022323 0.260617 0.115618 0.653642
12 0.122103 1.23247 0.892998 6.44023 4.39628 0.728559 0.262875 0.355931 0.111655 0.670742
13 0.138626 7.55712 0.218458 1.3808 0.60626 2.58697 0.0283555 1.74832 0.0917372 0.440418
39 0.119194 8.44752 0.0194031 0.451408 0.515703 1.25297 0.0657994 0.329208 0.0556347 0.795263
40 0.120983 6.98201 0.105433 1.75105 0.0293232 1.45884 0.0522959 0.000117064 0.0925209 1.00712
41 0.116736 6.69018 0.186505 1.63966 0.354256 0.834854 0.048532 0.25561 0.0502336 0.867861
42 0.118616 7.21822 0.0758776 1.36281 0.402106 1.02282 0.0449987 0.246397 0.0803229 0.892164
43 0.121406 0.454008 0.754203 6.2481 5.23 1.48934 2.17527 3.05243e-21 0.0994137 0.66036
44 0.118802 7.79741 0.0517954 0.90519 0.343702 1.13328 0.0362455 0.241086 0.0448612 0.869297
45 0.116858 6.56498 0.213017 1.58344 1.02857 0.537274 0.0343308 0.521422 0.0736202 0.814461
46 0.121946 7.6343 0.0762105 1.19186 0.489409 1.0806 0.0260378 0.299237 0.0765756 1.005
47 0.125514 7.41819 0.0708068 1.00399 0.109477 1.47399 0.0215761 0.0128917 0.0929907 0.885367
48 0.135721 5.50171 0.713154 2.90404 1.51585 0.474578 0.0295195 1.47744 0.160802 0.703552
49 0.122884 1.09562 0.69433 6.30673 5.23 0.847306 0.0927949 0.363307 0.146633 0.521977
50 0.063161 8.78094 0.0647714 48.2674 0.491849 0.97931 0.0462727 0.183935 0.0542663 0.838545

This is very useful for histogram plotting of parameter distributions.
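
As an illustration, below is a minimal plotting sketch; it assumes CairoMakie is available (any plotting package would do, and the bin count is arbitrary):

using CairoMakie

# Collect the resampled estimates and plot the distribution of clearance
bts_df = DataFrame(bootstrap_results.vcov)

fig = Figure()
ax = Axis(fig[1, 1]; xlabel = "pop_CL", ylabel = "count")
hist!(ax, bts_df.pop_CL; bins = 15)
fig

Percentile intervals can be read off the same samples, e.g. quantile(bts_df.pop_CL, [0.025, 0.975]).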

2.5.3 Sampling Importance Re-sampling

Pumas supports inference through Sampling Importance Re-sampling (SIR) via the SIR() input to infer. The signature for SIR in infer looks as follows.

infer(fpm::FittedPumasModel, sir::SIR; level = 0.95, ensemblealg = EnsembleThreads())

This performs sampling importance re-sampling for the population in fpm. The confidence intervals are calculated as the (1-level)/2 and (1+level)/2 quantiles of the sampled parameters. ensemblealg can be EnsembleThreads() (the default value) to use multi-threading or EnsembleSerial() to use a single thread.

The signature for the SIR specification is

SIR(; rng, samples, resamples)

SIR accepts a random number generator rng, the number of samples drawn from the proposal distribution (samples), and the number of re-samples drawn from the weighted samples (resamples). It is suggested that samples be at least 5 times larger than resamples in practice, to have sufficient samples to resample from.

sir_results = infer(foce_fit, SIR(samples = 1000, resamples = 200); level = 0.95)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Running SIR.
[ Info: Resampling.
Simulated inference results

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -385.14428
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  0             10
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

--------------------------------------------------------------------
            Estimate           SE                     95.0% C.I.
--------------------------------------------------------------------
pop_CL       0.12366         0.0035902         [0.11734 ; 0.13051 ]
pop_Vc       7.3031          0.50312           [6.2812  ; 8.1268  ]
pop_Q        0.12275         0.064973          [0.014028; 0.26124 ]
pop_Vp       1.0977          0.3639            [0.4781  ; 1.935   ]
pop_tabs     0.53254         0.091747          [0.42915 ; 0.79164 ]
pop_lag      0.9608          0.13207           [0.49744 ; 0.99674 ]
pk_Ω₁,₁      0.028906        0.013871          [0.010166; 0.060285]
lag_ω        0.41871         0.081167          [0.37941 ; 0.69921 ]
σ_prop       0.10049         0.022577          [0.055662; 0.14483 ]
σ_add        0.83524         0.10222           [0.65035 ; 1.0776  ]
--------------------------------------------------------------------

Notice that SIR draws its initial samples from a truncated multivariate normal distribution whose mean is the maximum likelihood estimate of the population-level parameters and whose covariance is the asymptotic matrix calculated by infer(fpm). This means that to use SIR, the matrix in question has to be successfully calculated by infer(fpm) under the hood.

The methods for vcov and DataFrame(sir_results.vcov) that we saw for Bootstrap also apply here:

vcov(sir_results)
10×10 Matrix{Float64}:
  1.28894e-5   2.70649e-5   5.274e-5     …   3.911e-5     -1.87843e-6
  2.70649e-5   0.253131    -0.0257315       -0.00588294    0.0141594
  5.274e-5    -0.0257315    0.00422149       0.000855764  -0.00233357
 -3.6031e-5   -0.154767     0.0167583        0.00259599   -0.00147255
  4.4129e-5   -0.0190394    0.00254131       0.000995957  -0.00318513
  8.53863e-5  -0.00304593   0.00137696   …   0.000718823  -0.00246933
 -2.19133e-5   0.00162398  -0.000182412     -0.000200852   0.000279956
  2.28063e-5  -0.00493316   0.000828298      0.000514035  -0.0023004
  3.911e-5    -0.00588294   0.000855764      0.000509709  -0.00112062
 -1.87843e-6   0.0141594   -0.00233357      -0.00112062    0.0104489

and

DataFrame(sir_results.vcov)
200×10 DataFrame
175 rows omitted
Row pop_CL pop_Vc pop_Q pop_Vp pop_tabs pop_lag pk_Ω₁,₁ lag_ω σ_prop σ_add
Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64 Float64
1 0.120452 6.33817 0.227315 1.84366 0.738502 0.969038 0.041874 0.565502 0.120138 0.724461
2 0.121319 7.61521 0.13369 1.11958 0.59516 0.819195 0.0383821 0.486078 0.0621475 0.904747
3 0.128086 7.27365 0.164885 1.00112 0.773617 1.01215 0.0320492 0.559006 0.133827 0.652045
4 0.126595 7.703 0.0923832 1.14081 0.801104 0.636096 0.0420994 0.661075 0.0966072 0.938117
5 0.128139 7.72191 0.196377 0.811411 0.519029 0.792641 0.0625774 0.491349 0.0789868 0.793253
6 0.123952 8.08135 0.0430609 0.821958 0.569438 0.755458 0.0428529 0.55705 0.105087 0.878993
7 0.117459 7.3162 0.0499608 1.30251 0.708196 0.656217 0.0437035 0.598285 0.0913149 0.889262
8 0.121096 7.50486 0.137965 1.28249 0.553584 0.769124 0.0789792 0.445278 0.0573354 0.951855
9 0.126205 6.7427 0.198909 1.48821 0.675411 0.497277 0.0347115 0.590175 0.0973765 0.843432
10 0.124098 7.89136 0.0558241 0.700769 0.559757 0.758432 0.0469582 0.531335 0.0971841 0.889568
11 0.121515 7.87714 0.0554836 0.479318 0.578338 0.61916 0.0392379 0.664918 0.0989668 0.808713
12 0.122873 7.27312 0.120638 1.00513 0.671161 0.421269 0.0373674 0.703267 0.1059 0.938679
13 0.126574 7.99423 0.091001 0.78432 0.493283 0.558988 0.0399548 0.60235 0.0850221 0.894139
189 0.125874 7.25386 0.155533 1.11484 0.528108 0.992596 0.0217301 0.397432 0.124037 0.86935
190 0.128931 7.31949 0.0994734 1.06307 0.646142 0.876359 0.0356783 0.490678 0.0928399 0.949515
191 0.11799 7.51179 0.0489818 0.868392 0.634647 0.807152 0.0231789 0.555591 0.094862 0.78834
192 0.123349 7.68835 0.130598 0.706719 0.543319 0.947635 0.0473085 0.416655 0.08845 0.851074
193 0.117229 6.96453 0.109829 1.30848 0.653853 0.726296 0.0460554 0.489996 0.0859431 0.935173
194 0.12112 7.24597 0.115413 1.26242 0.621571 0.853633 0.0399529 0.484128 0.0961005 0.822067
195 0.122174 6.83457 0.159878 1.5827 0.576497 0.969794 0.0295114 0.401337 0.109391 0.901286
196 0.129375 7.75966 0.132249 0.920036 0.643227 0.973947 0.0204307 0.492483 0.119423 0.770768
197 0.129705 6.37076 0.257739 1.84336 0.632963 0.993233 0.00919332 0.440514 0.154944 0.752252
198 0.122227 6.5618 0.170741 1.48927 0.550102 0.961237 0.0225875 0.358677 0.0923567 0.950669
199 0.121774 8.12433 0.0342786 0.879075 0.54661 0.903724 0.0548346 0.384889 0.0680357 1.07067
200 0.130318 7.12203 0.116892 1.08476 0.507377 0.725607 0.0302364 0.560484 0.123345 0.758802

2.5.4 Marginal MCMC

An alternative to Bootstrap and SIR is to use the MarginalMCMC sampler, a Hamiltonian Monte Carlo (HMC) No-U-Turn Sampler (NUTS) that samples from the marginal log-likelihood. That is, the individual effects are marginalized out and the population-level parameters are sampled. Since it does not resample populations like Bootstrap, inference may be more stable when many resampled populations would lead to extreme estimates, and it differs from SIR in that it does not need the asymptotic covariance matrix to be calculated and sampled from.

This method requires slightly more understanding from the user when setting its options, which are documented in the docstring of MarginalMCMC. Some knowledge of Bayesian inference is advised.

inference_results = infer(foce_fit, MarginalMCMC(); level = 0.95)

As sampling-based inference can be computationally intensive, we exclude the actual invocation of this method from this tutorial.

3 Concluding Remarks

This tutorial showcased a typical workflow for parameter inference in Pumas models and demonstrated the different methods Pumas supports for calculating parameter uncertainty.