Fitting and Inferring a Compartmental PK Model

Author

Vijay Ivaturi

1 Introduction

A typical workflow for fitting a Pumas model and deriving parameter precision typically involves:

  1. Preparing the data and the model.
  2. Checking model-data compatibility.
  3. Obtaining initial parameter estimates.
  4. Fitting the model via a chosen estimation method.
  5. Interpreting the fit results.
  6. Computing parameter precision.
  7. (Optionally) proceeding with more advanced techniques like bootstrapping or SIR for robust uncertainty quantification.

The following sections will walk through these steps using a one-compartment PK model for Warfarin, focusing on the PK aspects only. Exploratory data analysis (EDA), although extremely important, is out of scope here. Readers interested in EDA are encouraged to consult other tutorials.

2 Model and Data

2.1 Model Definition

Below is the PK model, named warfarin_pk_model, defined in Pumas. This model contains:

  • Fixed effects (population parameters): pop_CL, pop_Vc, pop_tabs, pop_lag
  • Inter-individual variability (IIV) components: pk_Ω
  • Residual error model parameters: σ_prop, σ_add
  • Covariates for scaling: FSZCL and FSZV
  • Differential equations describing the PK behavior in the compartments.
using Pumas
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics
using PumasUtilities
using Markdown
warfarin_pk_model = @model begin

    @metadata begin
        desc = "Warfarin 1-compartment PK model (PD removed)"
        timeu = u"hr"
    end

    @param begin
        # PK parameters
        """
        Clearance (L/hr)
        """
        pop_CL  RealDomain(lower = 0.0, init = 0.134)
        """
        Central volume (L)
        """
        pop_Vc  RealDomain(lower = 0.0, init = 8.11)
        """
        Absorption lag time (hr)
        """
        pop_tabs  RealDomain(lower = 0.0, init = 0.523)
        """
        Lag time (hr)
        """
        pop_lag  RealDomain(lower = 0.0, init = 0.1)

        # Inter-individual variability
        """
          - ΩCL: Clearance
          - ΩVc: Central volume
          - Ωtabs: Absorption lag time
        """
        pk_Ω  PDiagDomain([0.01, 0.01, 0.01])
        # Residual variability
        """
        σ_prop: Proportional error
        """
        σ_prop  RealDomain(lower = 0.0, init = 0.00752)
        """
        σ_add: Additive error
        """
        σ_add  RealDomain(lower = 0.0, init = 0.0661)
    end

    @random begin
        pk_η ~ MvNormal(pk_Ω)    # mean = 0, covariance = pk_Ω
    end

    @covariates begin
        """
        FSZCL: Clearance scaling factor
        """
        FSZCL
        """
        FSZV: Volume scaling factor
        """
        FSZV
    end

    @pre begin
        CL = FSZCL * pop_CL * exp(pk_η[1])
        Vc = FSZV * pop_Vc * exp(pk_η[2])

        tabs = pop_tabs * exp(pk_η[3])
        Ka = log(2) / tabs
    end

    @dosecontrol begin
        lags = (Depot = pop_lag,)
    end

    @vars begin
        cp := Central / Vc
    end

    @dynamics begin
        Depot' = -Ka * Depot
        Central' = Ka * Depot - (CL / Vc) * Central
    end

    @derived begin
        """
        Concentration (ng/mL)
        """
        conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
    end
end
PumasModel
  Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
  Random effects: pk_η
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Matrix exponential
  Derived: conc
  Observed: conc

2.2 Data Preparation

The Warfarin data used in this tutorial is pulled from PharmaDatasets for demonstration purposes. Note how the code reshapes and prepares the data in “wide” format for reading into Pumas. Only the conc column is treated as observations for the PK model.

warfarin_data = dataset("pumas/warfarin_nonmem")

# Step 2: Fix Duplicate Time Points
# -------------------------------
# Some subjects have duplicate time points for dvid = 1
# For this dataset, the triple (id, time, dvid) should define
# a row uniquely, but
nrow(warfarin_data)
nrow(unique(warfarin_data, ["id", "time", "dvid"]))

# We can identify the problematic rows by grouping on the index variables
@chain warfarin_data begin
    @groupby :id :time :dvid
    @transform :tmp = length(:id)
    @rsubset :tmp > 1
end

# Transform the data in a single chain of operations
warfarin_data_wide = @chain warfarin_data begin
    @rtransform begin
        # Scaling factors
        :FSZV = :wtbl / 70            # volume scaling
        :FSZCL = (:wtbl / 70)^0.75     # clearance scaling (allometric)
        # Column name for the dv
        :dvname = "dv$(:dvid)"
        # Dosing indicator columns
        :cmt = ismissing(:amt) ? missing : 1
        :evid = iszero(:amt) ? 0 : 1
    end
    unstack(Not([:dvid, :dvname, :dv]), :dvname, :dv)
    rename!(:dv1 => :conc, :dv2 => :pca)
    select!(Not([:dv0]))
end
330×12 DataFrame
305 rows omitted
Row id time evid amt wtbl age sex FSZV FSZCL cmt conc pca
Int64 Float64 Int64 Float64 Float64 Int64 String1 Float64 Float64 Int64 Float64? Float64?
1 1 0.0 1 100.0 66.7 50 M 0.952857 0.96443 1 missing missing
2 1 0.5 0 0.0 66.7 50 M 0.952857 0.96443 1 0.0 missing
3 1 1.0 0 0.0 66.7 50 M 0.952857 0.96443 1 1.9 missing
4 1 2.0 0 0.0 66.7 50 M 0.952857 0.96443 1 3.3 missing
5 1 3.0 0 0.0 66.7 50 M 0.952857 0.96443 1 6.6 missing
6 1 6.0 0 0.0 66.7 50 M 0.952857 0.96443 1 9.1 missing
7 1 9.0 0 0.0 66.7 50 M 0.952857 0.96443 1 10.8 missing
8 1 12.0 0 0.0 66.7 50 M 0.952857 0.96443 1 8.6 missing
9 1 24.0 0 0.0 66.7 50 M 0.952857 0.96443 1 5.6 44.0
10 1 36.0 0 0.0 66.7 50 M 0.952857 0.96443 1 4.0 27.0
11 1 48.0 0 0.0 66.7 50 M 0.952857 0.96443 1 2.7 28.0
12 1 72.0 0 0.0 66.7 50 M 0.952857 0.96443 1 0.8 31.0
13 1 96.0 0 0.0 66.7 50 M 0.952857 0.96443 1 missing 60.0
319 32 48.0 0 0.0 62.0 21 M 0.885714 0.912999 1 6.9 24.0
320 32 72.0 0 0.0 62.0 21 M 0.885714 0.912999 1 4.4 23.0
321 32 96.0 0 0.0 62.0 21 M 0.885714 0.912999 1 3.5 20.0
322 32 120.0 0 0.0 62.0 21 M 0.885714 0.912999 1 2.5 22.0
323 33 0.0 1 100.0 66.7 50 M 0.952857 0.96443 1 missing missing
324 33 0.0 0 0.0 66.7 50 M 0.952857 0.96443 1 missing 100.0
325 33 24.0 0 0.0 66.7 50 M 0.952857 0.96443 1 9.2 49.0
326 33 36.0 0 0.0 66.7 50 M 0.952857 0.96443 1 8.5 32.0
327 33 48.0 0 0.0 66.7 50 M 0.952857 0.96443 1 6.4 26.0
328 33 72.0 0 0.0 66.7 50 M 0.952857 0.96443 1 4.8 22.0
329 33 96.0 0 0.0 66.7 50 M 0.952857 0.96443 1 3.1 28.0
330 33 120.0 0 0.0 66.7 50 M 0.952857 0.96443 1 2.5 33.0

2.3 Creating a Pumas Population

Below is the creation of a population object in Pumas using read_pumas. Only the conc data are treated as the observation variable:

pop = read_pumas(
    warfarin_data_wide;
    id = :id,
    time = :time,
    amt = :amt,
    cmt = :cmt,
    evid = :evid,
    covariates = [:sex, :wtbl, :FSZV, :FSZCL],
    observations = [:conc],
)
Population
  Subjects: 32
  Covariates: sex, wtbl, FSZV, FSZCL
  Observations: conc
Note

The same data can contain multiple endpoints or PD observations. In this tutorial, the focus is solely on PK fitting. PKPD modeling on this warfarin dataset will be introduced later.

2.4 Checking Model-Data Compatibility

Before performing any fit, it is recommended to verify whether the defined model is consistent with the provided dataset. Pumas offers functions such as loglikelihood and findinfluential for these checks.

2.4.1 The loglikelihood Function

The loglikelihood function computes the log-likelihood of the model given data and parameters. In Pumas, the signature typically looks like:

loglikelihood(model, population, params, approx)

where:

  • model: The Pumas model definition (e.g., warfarin_pk_model).
  • population: A Pumas population object (e.g., pop).
  • params: A named tuple or dictionary containing parameter values.
  • approx: The approximation method to use. Common options include FOCE(), FO(), LaplaceI(), etc.

For example, one might write:

# A named tuple of parameter values
param_vals = (
    pop_CL = 0.134,
    pop_Vc = 8.11,
    pop_tabs = 0.523,
    pop_lag = 0.1,
    pk_Ω = Diagonal([0.01, 0.01, 0.01]),
    σ_prop = 0.00752,
    σ_add = 0.0661,
)

ll_value = loglikelihood(warfarin_pk_model, pop, param_vals, FOCE())
-12090.636664027372

The initial loglikelihood of the warfarin PK model given the data and parameter values is `LL = -12090.636664027372 `.

If the model and data are incompatible (e.g., missing doses for compartments, or out-of-range parameter values), loglikelihood might return an error or produce a warning. A successful computation is a good sign that the model can handle the data.

2.4.2 The findinfluential Function

The findinfluential function helps identify observations that disproportionately influence the fit. It can be used before or after fitting, but even with initial guesses, it can highlight potentially problematic data points. The most notable case is when the loglikelihood cannot be evaluated in which case the data returned for the problematic subject does not return a finite value.

Typical usage is:

findinfluential(warfarin_pk_model, pop, params; approx)

This will list the individual loglikelihoods for each subject. Readers can then inspect these observations further, potentially re-checking the data quality or adjusting the model assumptions if needed for each subject.

fi = findinfluential(warfarin_pk_model, pop, param_vals, FOCE())
32-element Vector{@NamedTuple{id::String, nll::Float64}}:
 (id = "12", nll = 3179.162620631005)
 (id = "14", nll = 2113.146717564112)
 (id = "13", nll = 1627.4380743700433)
 (id = "9", nll = 1050.6701195571484)
 (id = "2", nll = 780.9109417396234)
 (id = "1", nll = 699.9267530243993)
 (id = "7", nll = 512.7488860179473)
 (id = "11", nll = 470.0423353858021)
 (id = "6", nll = 311.9666456226985)
 (id = "3", nll = 192.83907815907168)
 ⋮
 (id = "16", nll = 22.434033734725986)
 (id = "5", nll = 18.475981181576064)
 (id = "32", nll = 17.87557264312157)
 (id = "17", nll = 17.341505598982042)
 (id = "26", nll = 8.191625356662307)
 (id = "24", nll = 6.4001549755272755)
 (id = "31", nll = 6.363630970170725)
 (id = "23", nll = 5.48597914877157)
 (id = "29", nll = 4.343343958293616)

The default output is a vector of NamedTuples, which can easily be converted into a DataFrame for further analysis. One can then plot the distribution of the individual loglikelihoods to identify any potential problematic subjects.

fidf = DataFrame(fi)
hist(
    fidf.nll,
    axis = (xlabel = "Log-likelihood", ylabel = "Frequency"),
    color = (:blue, 0.5),
)

The plot above shows that there are some subjects with very high initial log-likelihoods, which might be worth investigating. Notice, that a high initial log-likelihood does not necessarily imply a model or data problem. It can simply be that for the chosen initial population parameters the model does not fit the observations well for the subject in question.

2.5 Getting Good Initial Estimates Using Naive Pooled

A reliable starting point for nonlinear mixed-effects modeling is to get reasonable initial parameter estimates. Pumas provides approaches such as:

  • Naive Pooled: Treats all data as if there were no inter-individual variability.
  • NCA (Non-compartmental Analysis): Uses PK metrics (CL, V, etc.) from a non-compartmental approach to seed the parameter values (not showcased in this tutorial).

Here, the Naive Pooled approach is demonstrated. This method estimates population parameters by ignoring inter-individual differences, effectively pooling data as if from a single “super-subject.”

Below is the signature and documentation for the Pumas fit function. For more details, see the Fitting in Pumas Documentation.

fit(
  model::PumasModel,
  population::Population,
  param::NamedTuple,
  approx::Union{LikelihoodApproximation, MAP};
  optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
  optim_options::NamedTuple = NamedTuple(),
  optimize_fn = nothing,
  constantcoef::Tuple = (),
  ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
  checkidentification = true,
  diffeq_options = NamedTuple(),
  verbose = true,
  ignore_numerical_error = true,
)

Fit the Pumas model model to the dataset population with starting values param using the estimation method approx. Currently supported values for the `approx` argument are `FO`, `FOCE`,
`LaplaceI`, `NaivePooled`, and `BayesMCMC`. See the online documentation for more details about the different methods.

To control the optimization procedure for finding population parameters (fixed effects), use the `optim_alg` and `optim_options` keyword arguments. In previous versions of Pumas the argument
`optimize_fn` was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization of all `approx` methods except BayesMCMC. The default
optimization function uses the quasi-Newton routine BFGS method from the `Optim` package. It can be changed by setting the `optim_alg` to an algorithm implemented in Optim.jl as long as it
does not use second order derivatives. Optimization specific options can be passed in using the `optim_options` keyword and has to be a `NamedTuple` with names and values that match the
`Optim.Options` type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing `optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)`
to fit. See [Optim](https://docs.pumas.ai/stable/basics/estimation/#Optimization-option) for more defails.

It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the `constantcoef` argument with elements corresponding to the names of the fixed parameters, e.g.
`constantcoef=(:σ,)`.

When models include an `@random` block and fitting with NaivePooled is requested, it is required that the user sets all variability parameters to zero with `constantcoef` such that these can
be ignored in the optimization, e.g. `constantcoef = (:Ω,)` while overwriting the corresponding values in `param` with `(; init_params..., Ω = zeros(3, 3))`.


Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:

    •  `EnsembleThreads()`: the default. Accelerate fits by using multiple threads.

    •  `EnsembleSerial()`: fit using a single thread.

    •  `EnsembleDistributed()`: fit by using multiple worker processes.

The `fit` function will check if any gradients and throw an exception if any of the elements are exactly zero unless `checkidentification` is set to false.

Further keyword arguments can be passed via the `diffeq_options` argument. This allows for passing arguments to the differential equations solver such as `alg`, `abstol`, and `reltol`.
See the [documentation](https://docs.pumas.ai/stable/basics/simulation/#Options-and-settings-for-simobs) for more details.

The keyword `verbose` controls if info statements about initial evaluations of the loglikelihood function and gradient should be printed or not. Defaults to true.

Since the numerical optimization routine can sometimes visit extreme regions of the parameter space during it's exploration we automatically handle the situation where the input
parameters result in errors when solving the dynamical system. This allows the algorithm to recover and continue in many situations that would have otherwise stalled early. Sometimes, it
is useful to turn this error handling off when debugging a model fit, and this can be done by setting `ignore_numerical_error = false`.

Below is how to run a Naive Pooled analysis:

naive_fit = fit(warfarin_pk_model, pop, param_vals, NaivePooled(); constantcoef = (:pk_Ω,))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     6.698433e+04     7.339547e+04
 * time: 0.06017494201660156
     1     1.608646e+04     2.976762e+04
 * time: 2.2393579483032227
     2     5.868548e+03     1.035750e+04
 * time: 2.244560956954956
     3     2.939718e+03     4.142057e+03
 * time: 2.250001907348633
     4     1.911162e+03     2.672629e+03
 * time: 2.2557849884033203
     5     1.258125e+03     1.666208e+03
 * time: 2.2610299587249756
     6     8.872473e+02     1.004002e+03
 * time: 2.266676902770996
     7     6.854678e+02     5.653206e+02
 * time: 2.2718749046325684
     8     5.981939e+02     4.298111e+02
 * time: 2.2770140171051025
     9     5.697119e+02     3.298714e+02
 * time: 2.282541036605835
    10     5.630895e+02     2.782014e+02
 * time: 2.288094997406006
    11     5.610542e+02     2.546132e+02
 * time: 2.293267011642456
    12     5.583559e+02     2.306961e+02
 * time: 2.2986090183258057
    13     5.519849e+02     1.916256e+02
 * time: 2.3038928508758545
    14     5.385646e+02     1.280939e+02
 * time: 2.308781862258911
    15     5.213242e+02     7.353540e+01
 * time: 2.313570976257324
    16     5.171523e+02     5.151365e+01
 * time: 2.3178958892822266
    17     5.151262e+02     5.020215e+01
 * time: 2.3218770027160645
    18     5.145245e+02     4.127464e+01
 * time: 2.325556993484497
    19     5.144783e+02     4.250454e+01
 * time: 2.328809976577759
    20     5.144258e+02     4.182999e+01
 * time: 2.3319668769836426
    21     5.141981e+02     3.496010e+01
 * time: 2.334972858428955
    22     5.138854e+02     2.245596e+01
 * time: 2.337779998779297
    23     5.135343e+02     1.780783e+01
 * time: 2.3405239582061768
    24     5.133723e+02     1.971138e+01
 * time: 2.3432159423828125
    25     5.133304e+02     2.298999e+01
 * time: 2.3457839488983154
    26     5.133150e+02     2.217028e+01
 * time: 2.3482730388641357
    27     5.132848e+02     2.023817e+01
 * time: 2.3507699966430664
    28     5.132093e+02     1.788705e+01
 * time: 2.3532509803771973
    29     5.130231e+02     1.853999e+01
 * time: 2.3556630611419678
    30     5.126118e+02     1.979296e+01
 * time: 2.358229875564575
    31     5.119112e+02     2.368559e+01
 * time: 2.483628034591675
    32     5.112562e+02     1.621839e+01
 * time: 2.4863829612731934
    33     5.110240e+02     5.757384e+00
 * time: 2.489081859588623
    34     5.109963e+02     4.298973e+00
 * time: 2.491896867752075
    35     5.109950e+02     4.288314e+00
 * time: 2.4945068359375
    36     5.109943e+02     4.286157e+00
 * time: 2.497144937515259
    37     5.109913e+02     4.284737e+00
 * time: 2.499763011932373
    38     5.109845e+02     4.289628e+00
 * time: 2.5023670196533203
    39     5.109654e+02     4.935513e+00
 * time: 2.505002975463867
    40     5.109155e+02     9.365998e+00
 * time: 2.507643938064575
    41     5.107757e+02     1.725988e+01
 * time: 2.5102880001068115
    42     5.103371e+02     3.350312e+01
 * time: 2.512946844100952
    43     5.075433e+02     8.627240e+01
 * time: 2.5156538486480713
    44     5.050172e+02     8.799170e+01
 * time: 2.5184099674224854
    45     4.962069e+02     9.509664e+01
 * time: 2.521340847015381
    46     4.761013e+02     3.635622e+01
 * time: 2.5245959758758545
    47     4.755510e+02     1.838628e+02
 * time: 2.5272159576416016
    48     4.667561e+02     3.725269e+01
 * time: 2.5300989151000977
    49     4.655530e+02     3.789792e+01
 * time: 2.532845973968506
    50     4.625171e+02     6.336759e+01
 * time: 2.5354208946228027
    51     4.601300e+02     9.460813e+01
 * time: 2.538033962249756
    52     4.564463e+02     2.519378e+01
 * time: 2.5407068729400635
    53     4.543494e+02     8.101728e+00
 * time: 2.5433130264282227
    54     4.514042e+02     3.041037e+01
 * time: 2.5459718704223633
    55     4.504885e+02     3.540669e+01
 * time: 2.549142837524414
    56     4.484865e+02     4.508032e+01
 * time: 2.5523550510406494
    57     4.480667e+02     4.948631e+01
 * time: 2.555572032928467
    58     4.471799e+02     8.417188e+01
 * time: 2.558276891708374
    59     4.461134e+02     4.529769e+01
 * time: 2.5609519481658936
    60     4.452253e+02     7.551449e+00
 * time: 2.5638089179992676
    61     4.447828e+02     1.762000e+01
 * time: 2.5665059089660645
    62     4.446123e+02     4.642462e+01
 * time: 2.5692059993743896
    63     4.443932e+02     2.755456e+01
 * time: 2.571923017501831
    64     4.439630e+02     1.675038e+01
 * time: 2.574863910675049
    65     4.434601e+02     1.160119e+02
 * time: 2.5778119564056396
    66     4.432802e+02     7.713017e+00
 * time: 2.5805599689483643
    67     4.428348e+02     7.635308e+00
 * time: 2.5833799839019775
    68     4.424368e+02     8.110696e+01
 * time: 2.5862159729003906
    69     4.420727e+02     9.342020e+01
 * time: 2.5890719890594482
    70     4.403707e+02     1.405464e+02
 * time: 2.5921130180358887
    71     4.400903e+02     6.770072e+01
 * time: 2.597245931625366
    72     4.398476e+02     2.660505e+02
 * time: 2.6008689403533936
    73     4.394886e+02     9.746305e+01
 * time: 2.6039068698883057
    74     4.392992e+02     3.934368e+02
 * time: 2.607301950454712
    75     4.390273e+02     6.220876e+01
 * time: 2.6103639602661133
    76     4.384455e+02     1.200034e+02
 * time: 2.613656997680664
    77     4.381655e+02     7.536095e+02
 * time: 2.6177029609680176
    78     4.376569e+02     2.716359e+03
 * time: 2.620971918106079
    79     4.371396e+02     1.427975e+02
 * time: 2.624842882156372
    80     4.365253e+02     2.291738e+01
 * time: 2.6280970573425293
    81     4.362967e+02     3.650541e+02
 * time: 2.632037878036499
    82     4.360851e+02     9.683767e+02
 * time: 2.6359338760375977
    83     4.359698e+02     1.188451e+03
 * time: 2.6398298740386963
    84     4.356883e+02     1.529329e+03
 * time: 2.643765926361084
    85     4.353129e+02     2.477125e+03
 * time: 2.6478519439697266
    86     4.347641e+02     4.445794e+02
 * time: 2.6512279510498047
    87     4.346491e+02     3.346309e+03
 * time: 2.6554529666900635
    88     4.340737e+02     5.621341e+03
 * time: 2.6593289375305176
    89     4.330218e+02     5.253844e+03
 * time: 2.663037061691284
    90     4.329277e+02     1.466953e+04
 * time: 2.667430877685547
    91     4.321405e+02     1.988316e+03
 * time: 2.6709489822387695
    92     4.316377e+02     1.175474e+04
 * time: 2.6744589805603027
    93     4.315243e+02     1.457175e+04
 * time: 2.6786739826202393
    94     4.313393e+02     1.434983e+04
 * time: 2.6830079555511475
    95     4.311457e+02     1.466406e+04
 * time: 2.6874358654022217
    96     4.309173e+02     1.397546e+04
 * time: 2.6921398639678955
    97     4.306791e+02     1.484925e+04
 * time: 2.6965198516845703
    98     4.303810e+02     1.223827e+04
 * time: 2.701066017150879
    99     4.300841e+02     2.606666e+04
 * time: 2.7054338455200195
   100     4.296818e+02     3.263127e+04
 * time: 2.709894895553589
   101     4.294220e+02     2.953363e+04
 * time: 2.7144980430603027
   102     4.291854e+02     3.241563e+04
 * time: 2.7190909385681152
   103     4.288485e+02     2.199741e+04
 * time: 2.7236740589141846
   104     4.285033e+02     7.169887e+04
 * time: 2.7283239364624023
   105     4.282675e+02     8.209757e+04
 * time: 2.733043909072876
   106     4.278118e+02     6.496780e+04
 * time: 2.737596035003662
   107     4.276005e+02     7.412336e+04
 * time: 2.742374897003174
   108     4.272582e+02     5.876273e+04
 * time: 2.7470669746398926
   109     4.269661e+02     9.679311e+04
 * time: 2.7516539096832275
   110     4.265732e+02     1.131204e+05
 * time: 2.7564189434051514
   111     4.262809e+02     1.100267e+05
 * time: 2.7611310482025146
   112     4.259916e+02     1.131859e+05
 * time: 2.765972852706909
   113     4.256448e+02     9.590546e+04
 * time: 2.7710609436035156
   114     4.252941e+02     2.497391e+05
 * time: 2.776190996170044
   115     4.247396e+02     3.667522e+05
 * time: 2.781247854232788
   116     4.244417e+02     3.177088e+05
 * time: 2.786289930343628
   117     4.242009e+02     3.549387e+05
 * time: 2.7911880016326904
   118     4.238293e+02     2.317690e+05
 * time: 2.796009063720703
   119     4.234645e+02     5.444075e+05
 * time: 2.80102801322937
   120     4.231832e+02     6.626479e+05
 * time: 2.80600905418396
   121     4.229000e+02     6.248714e+05
 * time: 2.8110108375549316
   122     4.226437e+02     6.202141e+05
 * time: 2.816074848175049
   123     4.223476e+02     5.763202e+05
 * time: 2.8214099407196045
   124     4.220353e+02     6.722184e+05
 * time: 2.8266918659210205
   125     4.215553e+02     3.868167e+05
 * time: 2.8320279121398926
   126     4.211613e+02     2.917101e+06
 * time: 2.8372650146484375
   127     4.204823e+02     3.685050e+06
 * time: 2.8422398567199707
   128     4.187401e+02     6.951196e+06
 * time: 2.846544027328491
   129     4.174679e+02     5.788368e+06
 * time: 2.852602005004883
   130     4.171014e+02     1.511670e+07
 * time: 2.8598480224609375
   131     4.166690e+02     2.943324e+06
 * time: 2.865466833114624
   132     4.157380e+02     4.686169e+05
 * time: 2.8703229427337646
   133     4.154194e+02     9.348694e+06
 * time: 2.8756120204925537
   134     4.152062e+02     1.421900e+07
 * time: 2.880849838256836
   135     4.149796e+02     1.671737e+07
 * time: 2.886137008666992
   136     4.147019e+02     1.802229e+07
 * time: 2.8916890621185303
   137     4.144039e+02     1.847006e+07
 * time: 2.8971290588378906
   138     4.140881e+02     1.850972e+07
 * time: 2.9025139808654785
   139     4.137488e+02     1.861463e+07
 * time: 2.9078729152679443
   140     4.133741e+02     1.860858e+07
 * time: 2.913383960723877
   141     4.129603e+02     1.907969e+07
 * time: 2.9188079833984375
   142     4.124802e+02     1.741585e+07
 * time: 2.9244840145111084
   143     4.119884e+02     4.218883e+07
 * time: 2.9300780296325684
   144     4.109091e+02     1.714223e+08
 * time: 2.9356749057769775
   145     4.104487e+02     1.441888e+08
 * time: 2.941215991973877
   146     4.096514e+02     1.763079e+08
 * time: 2.9467859268188477
   147     4.091312e+02     5.946881e+08
 * time: 2.9515340328216553
   148     4.052097e+02     3.624908e+09
 * time: 2.9577958583831787
   149     4.046037e+02     5.109422e+08
 * time: 2.964761972427368
   150     4.040337e+02     5.926690e+08
 * time: 2.969679832458496
   151     4.032925e+02     1.529026e+08
 * time: 2.974529981613159
   152     4.029895e+02     1.133947e+09
 * time: 2.9844210147857666
   153     4.028606e+02     2.150461e+09
 * time: 2.989366054534912
   154     4.023712e+02     2.847341e+09
 * time: 2.9943580627441406
   155     4.013024e+02     3.636985e+09
 * time: 3.0005059242248535
   156     3.988076e+02     9.859466e+09
 * time: 3.0057640075683594
   157     3.976552e+02     1.714441e+10
 * time: 3.0123229026794434
   158     3.955430e+02     2.256291e+10
 * time: 3.0189950466156006
   159     3.940546e+02     1.193260e+10
 * time: 3.0278470516204834
   160     3.937247e+02     9.073932e+09
 * time: 3.0369369983673096
   161     3.936886e+02     4.858347e+10
 * time: 3.042590856552124
   162     3.929759e+02     1.467714e+10
 * time: 3.0481739044189453
   163     3.917377e+02     6.528655e+10
 * time: 3.053915023803711
   164     3.911035e+02     8.289130e+10
 * time: 3.060699939727783
   165     3.888147e+02     1.098190e+12
 * time: 3.066859006881714
   166     3.862471e+02     8.637045e+10
 * time: 3.0739660263061523
   167     3.847153e+02     6.632016e+10
 * time: 3.081164836883545
   168     3.847153e+02     6.632016e+10
 * time: 3.1059038639068604
   169     3.847153e+02     6.632016e+10
 * time: 3.1302578449249268
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:              NoObjectiveChange
Log-likelihood value:                   -384.71534

------------------------
             Estimate
------------------------
  pop_CL     0.12659
  pop_Vc     8.7367
  pop_tabs   1.4476e-11
  pop_lag    1.0
† pk_Ω₁,₁    0.01
† pk_Ω₂,₂    0.01
† pk_Ω₃,₃    0.01
  σ_prop     0.25974
  σ_add      2.8554e-9
------------------------
† indicates constant parameters

As you can see, the constantcoef argument is used to fix the inter-individual variability parameters to zero. The parameter estimates from the naive pooled fit seem out of place for some parameters such as pop_tabs. Perhaps the initial parameter values are not close to the true values and some adjustments are needed. It is also possible that the data cannot support a model with lags and first order absorption if some subjects very quickly have large concentration values or we really need inter-individual variability to fit the data. In Pumas, you can rerun the fit with new initial parameter values very easily. Suppose, we want to run the naive pooled fit again with 4 new initial parameter values that are within 10% of the original values.

We can use the power of the Julia language to generate 4 new parameter values that are within 10% of the original values.

# Generate parameter sets by scaling original values by ±10%
param_sets = [
    NamedTuple(k => v * scale for (k, v) in pairs(param_vals)) for
    scale in [0.9, 0.95, 1.05, 1.1]
]
4-element Vector{@NamedTuple{pop_CL::Float64, pop_Vc::Float64, pop_tabs::Float64, pop_lag::Float64, pk_Ω::Diagonal{Float64, Vector{Float64}}, σ_prop::Float64, σ_add::Float64}}:
 (pop_CL = 0.12060000000000001, pop_Vc = 7.2989999999999995, pop_tabs = 0.4707, pop_lag = 0.09000000000000001, pk_Ω = [0.009000000000000001 0.0 0.0; 0.0 0.009000000000000001 0.0; 0.0 0.0 0.009000000000000001], σ_prop = 0.006768, σ_add = 0.05949000000000001)
 (pop_CL = 0.1273, pop_Vc = 7.7044999999999995, pop_tabs = 0.49685, pop_lag = 0.095, pk_Ω = [0.0095 0.0 0.0; 0.0 0.0095 0.0; 0.0 0.0 0.0095], σ_prop = 0.007143999999999999, σ_add = 0.062795)
 (pop_CL = 0.14070000000000002, pop_Vc = 8.5155, pop_tabs = 0.54915, pop_lag = 0.10500000000000001, pk_Ω = [0.0105 0.0 0.0; 0.0 0.0105 0.0; 0.0 0.0 0.0105], σ_prop = 0.007896, σ_add = 0.06940500000000001)
 (pop_CL = 0.14740000000000003, pop_Vc = 8.921, pop_tabs = 0.5753, pop_lag = 0.11000000000000001, pk_Ω = [0.011000000000000001 0.0 0.0; 0.0 0.011000000000000001 0.0; 0.0 0.0 0.011000000000000001], σ_prop = 0.008272, σ_add = 0.07271000000000001)

Now, we can run the fit again with the new parameter values.

new_fits = [
    fit(warfarin_pk_model, pop, param_set, NaivePooled(), constantcoef = (:pk_Ω,)) for
    param_set in param_sets
];

You can compare the results of the new fits with perturbed initial parameter values.

compare_estimates(;
    p1 = new_fits[1],
    p2 = new_fits[2],
    p3 = new_fits[3],
    p4 = new_fits[4],
    original = naive_fit,
)
9×6 DataFrame
Row parameter p1 p2 p3 p4 original
String Float64? Float64? Float64? Float64? Float64?
1 pop_CL 0.123625 0.192774 0.123403 0.0235632 0.126594
2 pop_Vc 9.28085 16.1102 9.13101 49.454 8.73673
3 pop_tabs 4.10644e-12 0.60404 0.972573 0.154961 1.44759e-11
4 pop_lag 1.0 0.706171 0.5 0.648555 1.0
5 pk_Ω₁,₁ 0.009 0.0095 0.0105 0.011 0.01
6 pk_Ω₂,₂ 0.009 0.0095 0.0105 0.011 0.01
7 pk_Ω₃,₃ 0.009 0.0095 0.0105 0.011 0.01
8 σ_prop 0.293987 0.778574 0.283517 2.15052 0.259744
9 σ_add 1.25063e-85 1.6711e-173 4.76955e-12 1.57173e-162 2.8554e-9

The compare_estimates function is a convenience function that is part of the PumasUtilities package. It is used to compare the estimates of the new fits with the original fit. The p3 results seem to be reasonable amongst all the fits and these results can be taken forward for the FOCE() fit.

2.6 Estimating Parameters via Maximum Likelihood (FOCE)

After obtaining initial estimates, the next step is fitting the model to data using a maximum likelihood approach. One of the most commonly used methods in Pumas is the First-Order Conditional Estimation (FOCE) method. This is invoked by specifying FOCE() as the estimation algorithm using the initial parameter values specified in the model definition:

foce_fit = fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE())

The init_params function is a convenience function that extracts the initial parameter values from the model definition.

Note

By default, the fit function will use all the cores of your machine to parallelize the optimization. You can control this using the ensemblealg argument. For more details, see the fit options section in the Pumas Documentation.

2.7 Fixing Parameters with constantcoef

Sometimes, it is desirable to hold certain parameters constant during estimation. For instance, suppose one wants to fix tabs at a particular value:

foce_fit_fixedtabs = fit(
    warfarin_pk_model,
    pop,
    (; param_vals..., pop_tabs = 0.5),
    FOCE();
    constantcoef = (:pop_tabs,),
);   # This fixes the pop_tabs parameter
foce_fit_fixedtabs
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                      NoXChange
Log-likelihood value:                   -350.54228

----------------------
             Estimate
----------------------
  pop_CL     0.13455
  pop_Vc     8.0517
† pop_tabs   0.5
  pop_lag    0.8751
  pk_Ω₁,₁    0.07066
  pk_Ω₂,₂    0.01826
  pk_Ω₃,₃    0.95054
  σ_prop     0.089942
  σ_add      0.39217
----------------------
† indicates constant parameters

This approach is useful for sensitivity analysis, exploring different values, or simplifying the model.

2.8 Changing Tolerance During Fitting

The fit function accepts additional keyword arguments to tweak the fitting process, such as tolerances for convergence:

foce_fit_fixedtabs = fit(
    warfarin_pk_model,
    pop,
    (; param_vals..., pop_tabs = 0.5),
    FOCE();
    constantcoef = (:pop_tabs,),
    optim_options = (; x_reltol = 1e-6, x_abstol = 1e-8, iterations = 200),
);

These arguments (x_reltol, x_abstol, iterations) help fine-tune the solver’s stopping criteria. For more details, see the Optimization Options during Fitting.

2.9 Interpreting the Printed Results

Once the model fit is complete, Pumas will print a summary of the fit to the console. This typically includes:

  1. Objective Function Value (OFV): The minimized negative log-likelihood (or similar objective). Lower is better.
  2. Convergence Status: An indication of whether the solver found a local optimum or if it ran into issues (e.g., maximum iterations reached).
  3. Parameter Estimates: A table listing final estimates of fixed effects, random effects (variances or standard deviations), and residual variability components.
  4. Standard Errors (if computed): By default, these might not be computed until using infer (discussed in the next section).

The printed output might look like:

julia> foce_fit_fixedtabs
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                    -320.2235
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  1              8
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

-----------------------
              Estimate
-----------------------
pop_CL        0.13091
pop_Vc        8.08
pop_tabs      0.5
pop_lag       0.91842
pk_Ω₁,₁       0.050929
pk_Ω₂,₂       0.019161
pk_Ω₃,₃       1.056
σ_prop        0.090974
σ_add         0.33189
-----------------------

We can break down the output of the fit function into the following components:

2.9.1 FittedPumasModel

FittedPumasModel

The FittedPumasModel object contains the following fields:

  • model: The original model definition
foce_fit_fixedtabs.model
PumasModel
  Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
  Random effects: pk_η
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Matrix exponential
  Derived: conc
  Observed: conc
  • data: The population data
foce_fit_fixedtabs.data
Population
  Subjects: 32
  Covariates: sex, wtbl, FSZV, FSZCL
  Observations: conc
  • optim: The optimization results
foce_fit_fixedtabs.optim
 * Status: success

 * Candidate solution
    Final objective value:     3.505425e+02

 * Found with
    Algorithm:     BFGS

 * Convergence measures
    |x - x'|               = 1.38e-06 ≰ 1.0e-08
    |x - x'|/|x'|          = 3.46e-07 ≤ 1.0e-06
    |f(x) - f(x')|         = 5.58e-07 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.59e-09 ≰ 0.0e+00
    |g(x)|                 = 6.41e-02 ≰ 1.0e-03

 * Work counters
    Seconds run:   6  (vs limit Inf)
    Iterations:    74
    f(x) calls:    82
    ∇f(x) calls:   75
  • approx: The likelihood approximation method
foce_fit_fixedtabs.approx
FOCE
  • kwargs: Additional keyword arguments
foce_fit_fixedtabs.kwargs
  • fixedparamset: The fixed parameter set
foce_fit_fixedtabs.fixedparamset
  • optim_state: The optimization state
foce_fit_fixedtabs.optim_state

2.9.2 Minimization Status

Successful minimization:                      true

The Successful minimization field indicates whether the minimization was successful. It depends on the convergence of the optimizer and rules set in the optim_options argument, for example x_reltol and x_abstol.

2.9.3 Likelihood Approximation

  Likelihood approximation:                     FOCE

The Likelihood approximation field indicates the likelihood approximation method used.

2.9.4 Likelihood Optimizer

  Likelihood Optimizer:                         BFGS

The Likelihood Optimizer field indicates the optimizer used which by default is BFGS from the Optim package. A user can change the optimizer by setting the optim_alg argument in the fit function.

2.9.5 Dynamical system type

  Dynamical system type:          Matrix exponential

The Dynamical system type field indicates the type of dynamical system used. In the example here, even though the model is a differential equation model, the Matrix exponential indicates that the model is solved using a matrix exponential solver as the system has been deemed linear. The user can check the linear nature of the system as follows: foce_fit.model.prob which results in this case to Pumas.LinearODE(). If the user does not want the automatic check for linearity, they can turn it off in the @options block of the model definition.

@model begin
    @param begin
        ...
    end
    @random begin
        ...
    end
    @dynamics begin
        ...
    end
    @pre begin
        ...
    end
    @options begin
        checklinear = false
    end
end

The checklinear option is a boolean that determines whether the solver should check if the system defined in the @dynamics block is linear. If the system is linear, setting this option to true (default) enables calculating the solution through matrix exponentials. If it is set to false or time (t) appears in the @pre block, this optimization is disabled. This option can be useful when the matrix exponential solver is not superior to general numerical integrators or for debugging purposes.

2.9.6 Log-likelihood value

  Log-likelihood value:                    -320.2235

The Log-likelihood value field indicates the value of the log-likelihood function at the maximum likelihood estimate.

2.9.7 Number of subjects

  Number of subjects:                             31

The Number of subjects field indicates the number of subjects from the population data used in the fit.

2.9.8 Number of parameters

  Number of parameters:         Fixed      Optimized
                                  1              8

The Number of parameters field indicates the number of fixed and optimized parameters. The Fixed column indicates the number of parameters that were fixed using the constantcoef argument during the fit and the Optimized column indicates the number of parameters that were optimized.

2.9.9 Observation records

  Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

The Observation records field indicates the number of active and missing observations in the population data. The Active column indicates the number of observations that were used in the fit and the Missing column indicates the number of observations that were missing and not used in the fit. missing in this case is due to the missingness of the conc observations, which is the collection of all the conc observations from all the subjects.

2.9.10 Parameter estimates

-----------------------
              Estimate
-----------------------
pop_CL        0.13091
pop_Vc        8.08
pop_tabs      0.5
pop_lag       0.91842
pk_Ω₁,₁       0.050929
pk_Ω₂,₂       0.019161
pk_Ω₃,₃       1.056
σ_prop        0.090974
σ_add         0.33189
-----------------------

The Parameter estimates field indicates the final parameter estimates. The Estimate column indicates the maximum likelihood estimates of the parameter.

2.10 Computing Parameter Precision with infer

The infer function in Pumas estimates the uncertainty (precision) of parameter estimates. Depending on the chosen method, infer can provide standard errors, confidence intervals, and correlation matrices.

The signature for infer often looks like:

infer(
    fpm::FittedPumasModel;
    level = 0.95,
    rethrow_error::Bool = false,
    sandwich_estimator::Bool = true,
)

where:

  • fpm::FittedPumasModel: The result of fit (e.g., foce_fit).
  • level: The confidence interval level (e.g., 0.95). The confidence intervals are calculated as the (1-level)/2 and (1+level)/2 quantiles of the estimated parameters
  • rethrow_error: If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.
  • sandwich_estimator: Whether to use the sandwich estimator. If set to true (the default value), the sandwich estimator will be used. If set to false, the standard error will be calculated using the inverse of the Hessian matrix.

An example usage:

inference_results = infer(foce_fit_fixedtabs; level = 0.95)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   SmallXChange
Log-likelihood value:                   -350.54254

----------------------------------------------------------
             Estimate   SE          95.0% C.I.
----------------------------------------------------------
  pop_CL     0.13455    0.0065675   [ 0.12168  ; 0.14742]
  pop_Vc     8.0517     0.21972     [ 7.6211   ; 8.4824 ]
† pop_tabs   0.5        NaN         [ NaN      ; NaN    ]
  pop_lag    0.87498    0.051763    [ 0.77353  ; 0.97644]
  pk_Ω₁,₁    0.070958   0.024781    [ 0.022387 ; 0.11953]
  pk_Ω₂,₂    0.01827    0.0051531   [ 0.0081698; 0.02837]
  pk_Ω₃,₃    0.94384    0.38455     [ 0.19013  ; 1.6975 ]
  σ_prop     0.089942   0.014455    [ 0.061611 ; 0.11827]
  σ_add      0.39221    0.066849    [ 0.26119  ; 0.52323]
----------------------------------------------------------
† indicates constant parameters

We can use the sandwich_estimator argument to get a more robust estimate of the standard errors.

inference_results = infer(foce_fit_fixedtabs; level = 0.95, sandwich_estimator = false)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using negative inverse Hessian

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   SmallXChange
Log-likelihood value:                   -350.54254

-----------------------------------------------------------
             Estimate   SE          95.0% C.I.
-----------------------------------------------------------
  pop_CL     0.13455    0.0064882   [ 0.12183  ; 0.14727 ]
  pop_Vc     8.0517     0.22744     [ 7.606    ; 8.4975  ]
† pop_tabs   0.5        NaN         [ NaN      ; NaN     ]
  pop_lag    0.87498    0.038082    [ 0.80034  ; 0.94962 ]
  pk_Ω₁,₁    0.070958   0.018798    [ 0.034114 ; 0.1078  ]
  pk_Ω₂,₂    0.01827    0.0062287   [ 0.0060618; 0.030478]
  pk_Ω₃,₃    0.94384    0.4181      [ 0.12438  ; 1.7633  ]
  σ_prop     0.089942   0.0089122   [ 0.072474 ; 0.10741 ]
  σ_add      0.39221    0.046786    [ 0.30051  ; 0.48391 ]
-----------------------------------------------------------
† indicates constant parameters

This result above indicates that both with the sandwich estimator and the inverse of the Hessian matrix on the fixed parameter, the inference worked. We will pursue other methods to obtain parameter precision in later tutorials, such as bootstrap and SIR.

3 Concluding Remarks

This tutorial showcased a typical Pumas workflow for PK model fitting using a Warfarin dataset:

  1. Model Definition and Data Preparation.
  2. Checking Compatibility via loglikelihood and findinfluential.
  3. Initial Parameter Estimation using Naive Pooled.
  4. Fitting with FOCE, demonstrating how to fix parameters or change solver tolerances.
  5. Interpreting Fit Results, exploring the components of a fittedpumasmodel.
  6. Computing Precision of estimates with infer using different methods.

Readers are encouraged to refine their understanding by:

  • Performing thorough Exploratory Data Analysis prior to modeling.
  • Exploring Bootstrap or SIR methods for deeper uncertainty quantification (covered in later tutorials).
  • Validating the final model through visual diagnostics (e.g., VPCs, GOF plots).

More advanced topics can be found throughout the Pumas Documentation and Pumas Tutorials.

Tip

Real-world workflows are iterative. Analysts often revisit model assumptions, re-check data, and explore alternative parameterizations before finalizing any one model as “best.”