Fitting and Inferring a Compartmental PK Model

Author

Vijay Ivaturi

1 Introduction

A typical workflow for fitting a Pumas model and deriving parameter precision typically involves:

  1. Preparing the data and the model.
  2. Checking model-data compatibility.
  3. Obtaining initial parameter estimates.
  4. Fitting the model via a chosen estimation method.
  5. Interpreting the fit results.
  6. Computing parameter precision.
  7. (Optionally) proceeding with more advanced techniques like bootstrapping or SIR for robust uncertainty quantification.

The following sections will walk through these steps using a one-compartment PK model for Warfarin, focusing on the PK aspects only. Exploratory data analysis (EDA), although extremely important, is out of scope here. Readers interested in EDA are encouraged to consult other tutorials.

2 Model and Data

2.1 Model Definition

Below is the PK model, named warfarin_pk_model, defined in Pumas. This model contains:

  • Fixed effects (population parameters): pop_CL, pop_Vc, pop_tabs, pop_lag
  • Inter-individual variability (IIV) components: pk_Ω
  • Residual error model parameters: σ_prop, σ_add
  • Covariates for scaling: FSZCL and FSZV
  • Differential equations describing the PK behavior in the compartments.
using Pumas
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics
using PumasUtilities
using Markdown
warfarin_pk_model = @model begin

    @metadata begin
        desc = "Warfarin 1-compartment PK model (PD removed)"
        timeu = u"hr"
    end

    @param begin
        # PK parameters
        """
        Clearance (L/hr)
        """
        pop_CL  RealDomain(lower = 0.0, init = 0.134)
        """
        Central volume (L)
        """
        pop_Vc  RealDomain(lower = 0.0, init = 8.11)
        """
        Absorption lag time (hr)
        """
        pop_tabs  RealDomain(lower = 0.0, init = 0.523)
        """
        Lag time (hr)
        """
        pop_lag  RealDomain(lower = 0.0, init = 0.1)

        # Inter-individual variability
        """
          - ΩCL: Clearance
          - ΩVc: Central volume
          - Ωtabs: Absorption lag time
        """
        pk_Ω  PDiagDomain([0.01, 0.01, 0.01])
        # Residual variability
        """
        σ_prop: Proportional error
        """
        σ_prop  RealDomain(lower = 0.0, init = 0.00752)
        """
        σ_add: Additive error
        """
        σ_add  RealDomain(lower = 0.0, init = 0.0661)
    end

    @random begin
        pk_η ~ MvNormal(pk_Ω)    # mean = 0, covariance = pk_Ω
    end

    @covariates begin
        """
        FSZCL: Clearance scaling factor
        """
        FSZCL
        """
        FSZV: Volume scaling factor
        """
        FSZV
    end

    @pre begin
        CL = FSZCL * pop_CL * exp(pk_η[1])
        Vc = FSZV * pop_Vc * exp(pk_η[2])

        tabs = pop_tabs * exp(pk_η[3])
        Ka = log(2) / tabs
    end

    @dosecontrol begin
        lags = (Depot = pop_lag,)
    end

    @vars begin
        cp := Central / Vc
    end

    @dynamics begin
        Depot' = -Ka * Depot
        Central' = Ka * Depot - (CL / Vc) * Central
    end

    @derived begin
        """
        Concentration (ng/mL)
        """
        conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
    end
end
PumasModel
  Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
  Random effects: pk_η
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Matrix exponential
  Derived: conc
  Observed: conc

2.2 Data Preparation

The Warfarin data used in this tutorial is pulled from PharmaDatasets for demonstration purposes. Note how the code reshapes and prepares the data in “wide” format for reading into Pumas. Only the conc column is treated as observations for the PK model.

warfarin_data = dataset("pumas/warfarin_nonmem")

# Step 2: Fix Duplicate Time Points
# -------------------------------
# Some subjects have duplicate time points for dvid = 1
# For this dataset, the triple (id, time, dvid) should define
# a row uniquely, but
nrow(warfarin_data)
nrow(unique(warfarin_data, ["id", "time", "dvid"]))

# We can identify the problematic rows by grouping on the index variables
@chain warfarin_data begin
    @groupby :id :time :dvid
    @transform :tmp = length(:id)
    @rsubset :tmp > 1
end

# Transform the data in a single chain of operations
warfarin_data_wide = @chain warfarin_data begin
    @rtransform begin
        # Scaling factors
        :FSZV = :wtbl / 70            # volume scaling
        :FSZCL = (:wtbl / 70)^0.75     # clearance scaling (allometric)
        # Column name for the dv
        :dvname = "dv$(:dvid)"
        # Dosing indicator columns
        :cmt = ismissing(:amt) ? missing : 1
        :evid = iszero(:amt) ? 0 : 1
    end
    unstack(Not([:dvid, :dvname, :dv]), :dvname, :dv)
    rename!(:dv1 => :conc, :dv2 => :pca)
    select!(Not([:dv0]))
end
330×12 DataFrame
305 rows omitted
Row id time evid amt wtbl age sex FSZV FSZCL cmt conc pca
Int64 Float64 Int64 Float64 Float64 Int64 String1 Float64 Float64 Int64 Float64? Float64?
1 1 0.0 1 100.0 66.7 50 M 0.952857 0.96443 1 missing missing
2 1 0.5 0 0.0 66.7 50 M 0.952857 0.96443 1 0.0 missing
3 1 1.0 0 0.0 66.7 50 M 0.952857 0.96443 1 1.9 missing
4 1 2.0 0 0.0 66.7 50 M 0.952857 0.96443 1 3.3 missing
5 1 3.0 0 0.0 66.7 50 M 0.952857 0.96443 1 6.6 missing
6 1 6.0 0 0.0 66.7 50 M 0.952857 0.96443 1 9.1 missing
7 1 9.0 0 0.0 66.7 50 M 0.952857 0.96443 1 10.8 missing
8 1 12.0 0 0.0 66.7 50 M 0.952857 0.96443 1 8.6 missing
9 1 24.0 0 0.0 66.7 50 M 0.952857 0.96443 1 5.6 44.0
10 1 36.0 0 0.0 66.7 50 M 0.952857 0.96443 1 4.0 27.0
11 1 48.0 0 0.0 66.7 50 M 0.952857 0.96443 1 2.7 28.0
12 1 72.0 0 0.0 66.7 50 M 0.952857 0.96443 1 0.8 31.0
13 1 96.0 0 0.0 66.7 50 M 0.952857 0.96443 1 missing 60.0
319 32 48.0 0 0.0 62.0 21 M 0.885714 0.912999 1 6.9 24.0
320 32 72.0 0 0.0 62.0 21 M 0.885714 0.912999 1 4.4 23.0
321 32 96.0 0 0.0 62.0 21 M 0.885714 0.912999 1 3.5 20.0
322 32 120.0 0 0.0 62.0 21 M 0.885714 0.912999 1 2.5 22.0
323 33 0.0 1 100.0 66.7 50 M 0.952857 0.96443 1 missing missing
324 33 0.0 0 0.0 66.7 50 M 0.952857 0.96443 1 missing 100.0
325 33 24.0 0 0.0 66.7 50 M 0.952857 0.96443 1 9.2 49.0
326 33 36.0 0 0.0 66.7 50 M 0.952857 0.96443 1 8.5 32.0
327 33 48.0 0 0.0 66.7 50 M 0.952857 0.96443 1 6.4 26.0
328 33 72.0 0 0.0 66.7 50 M 0.952857 0.96443 1 4.8 22.0
329 33 96.0 0 0.0 66.7 50 M 0.952857 0.96443 1 3.1 28.0
330 33 120.0 0 0.0 66.7 50 M 0.952857 0.96443 1 2.5 33.0

2.3 Creating a Pumas Population

Below is the creation of a population object in Pumas using read_pumas. Only the conc data are treated as the observation variable:

pop = read_pumas(
    warfarin_data_wide;
    id = :id,
    time = :time,
    amt = :amt,
    cmt = :cmt,
    evid = :evid,
    covariates = [:sex, :wtbl, :FSZV, :FSZCL],
    observations = [:conc],
)
Population
  Subjects: 32
  Covariates: sex, wtbl, FSZV, FSZCL
  Observations: conc
Note

The same data can contain multiple endpoints or PD observations. In this tutorial, the focus is solely on PK fitting. PKPD modeling on this warfarin dataset will be introduced later.

2.4 Checking Model-Data Compatibility

Before performing any fit, it is recommended to verify whether the defined model is consistent with the provided dataset. Pumas offers functions such as loglikelihood and findinfluential for these checks.

2.4.1 The loglikelihood Function

The loglikelihood function computes the log-likelihood of the model given data and parameters. In Pumas, the signature typically looks like:

loglikelihood(model, population, params, approx)

where:

  • model: The Pumas model definition (e.g., warfarin_pk_model).
  • population: A Pumas population object (e.g., pop).
  • params: A named tuple or dictionary containing parameter values.
  • approx: The approximation method to use. Common options include FOCE(), FO(), LaplaceI(), etc.

For example, one might write:

# A named tuple of parameter values
param_vals = (
    pop_CL = 0.134,
    pop_Vc = 8.11,
    pop_tabs = 0.523,
    pop_lag = 0.1,
    pk_Ω = Diagonal([0.01, 0.01, 0.01]),
    σ_prop = 0.00752,
    σ_add = 0.0661,
)

ll_value = loglikelihood(warfarin_pk_model, pop, param_vals, FOCE())
-12090.636664027372

The initial loglikelihood of the warfarin PK model given the data and parameter values is `LL = -12090.636664027372 `.

If the model and data are incompatible (e.g., missing doses for compartments, or out-of-range parameter values), loglikelihood might return an error or produce a warning. A successful computation is a good sign that the model can handle the data.

2.4.2 The findinfluential Function

The findinfluential function helps identify observations that disproportionately influence the fit. It can be used before or after fitting, but even with initial guesses, it can highlight potentially problematic data points. The most notable case is when the loglikelihood cannot be evaluated in which case the data returned for the problematic subject does not return a finite value.

Typical usage is:

findinfluential(warfarin_pk_model, pop, params; approx)

This will list the individual loglikelihoods for each subject. Readers can then inspect these observations further, potentially re-checking the data quality or adjusting the model assumptions if needed for each subject.

fi = findinfluential(warfarin_pk_model, pop, param_vals, FOCE())
32-element Vector{@NamedTuple{id::String, nll::Float64}}:
 (id = "12", nll = 3179.162620631005)
 (id = "14", nll = 2113.146717564112)
 (id = "13", nll = 1627.4380743700433)
 (id = "9", nll = 1050.6701195571484)
 (id = "2", nll = 780.9109417396234)
 (id = "1", nll = 699.9267530243993)
 (id = "7", nll = 512.7488860179473)
 (id = "11", nll = 470.0423353858021)
 (id = "6", nll = 311.9666456226985)
 (id = "3", nll = 192.83907815907168)
 ⋮
 (id = "16", nll = 22.434033734725986)
 (id = "5", nll = 18.475981181576064)
 (id = "32", nll = 17.87557264312157)
 (id = "17", nll = 17.341505598982042)
 (id = "26", nll = 8.191625356662307)
 (id = "24", nll = 6.4001549755272755)
 (id = "31", nll = 6.363630970170725)
 (id = "23", nll = 5.48597914877157)
 (id = "29", nll = 4.343343958293616)

The default output is a vector of NamedTuples, which can easily be converted into a DataFrame for further analysis. One can then plot the distribution of the individual loglikelihoods to identify any potential problematic subjects.

fidf = DataFrame(fi)
hist(
    fidf.nll,
    axis = (xlabel = "Log-likelihood", ylabel = "Frequency"),
    color = (:blue, 0.5),
)

The plot above shows that there are some subjects with very high initial log-likelihoods, which might be worth investigating. Notice, that a high initial log-likelihood does not necessarily imply a model or data problem. It can simply be that for the chosen initial population parameters the model does not fit the observations well for the subject in question.

2.5 Getting Good Initial Estimates Using Naive Pooled

A reliable starting point for nonlinear mixed-effects modeling is to get reasonable initial parameter estimates. Pumas provides approaches such as:

  • Naive Pooled: Treats all data as if there were no inter-individual variability.
  • NCA (Non-compartmental Analysis): Uses PK metrics (CL, V, etc.) from a non-compartmental approach to seed the parameter values (not showcased in this tutorial).

Here, the Naive Pooled approach is demonstrated. This method estimates population parameters by ignoring inter-individual differences, effectively pooling data as if from a single “super-subject.”

Below is the signature and documentation for the Pumas fit function. For more details, see the Fitting in Pumas Documentation.

fit(
  model::PumasModel,
  population::Population,
  param::NamedTuple,
  approx::Union{LikelihoodApproximation, MAP};
  optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
  optim_options::NamedTuple = NamedTuple(),
  optimize_fn = nothing,
  constantcoef::Tuple = (),
  ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
  checkidentification = true,
  diffeq_options = NamedTuple(),
  verbose = true,
  ignore_numerical_error = true,
)

Fit the Pumas model model to the dataset population with starting values param using the estimation method approx. Currently supported values for the `approx` argument are `FO`, `FOCE`,
`LaplaceI`, `NaivePooled`, and `BayesMCMC`. See the online documentation for more details about the different methods.

To control the optimization procedure for finding population parameters (fixed effects), use the `optim_alg` and `optim_options` keyword arguments. In previous versions of Pumas the argument
`optimize_fn` was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization of all `approx` methods except BayesMCMC. The default
optimization function uses the quasi-Newton routine BFGS method from the `Optim` package. It can be changed by setting the `optim_alg` to an algorithm implemented in Optim.jl as long as it
does not use second order derivatives. Optimization specific options can be passed in using the `optim_options` keyword and has to be a `NamedTuple` with names and values that match the
`Optim.Options` type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing `optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)`
to fit. See [Optim](https://docs.pumas.ai/stable/basics/estimation/#Optimization-option) for more defails.

It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the `constantcoef` argument with elements corresponding to the names of the fixed parameters, e.g.
`constantcoef=(:σ,)`.

When models include an `@random` block and fitting with NaivePooled is requested, it is required that the user sets all variability parameters to zero with `constantcoef` such that these can
be ignored in the optimization, e.g. `constantcoef = (:Ω,)` while overwriting the corresponding values in `param` with `(; init_params..., Ω = zeros(3, 3))`.


Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:

    •  `EnsembleThreads()`: the default. Accelerate fits by using multiple threads.

    •  `EnsembleSerial()`: fit using a single thread.

    •  `EnsembleDistributed()`: fit by using multiple worker processes.

The `fit` function will check if any gradients and throw an exception if any of the elements are exactly zero unless `checkidentification` is set to false.

Further keyword arguments can be passed via the `diffeq_options` argument. This allows for passing arguments to the differential equations solver such as `alg`, `abstol`, and `reltol`. The
default values for these are `AutoVern7(Rodas5P(autodiff=true)), 1e-12, and 1e-8` respectively. See the [documentation](https://docs.pumas.ai/stable/basics/simulation/#Options-and-settings-for-simobs) for more details.

The keyword `verbose` controls if info statements about initial evaluations of the loglikelihood function and gradient should be printed or not. Defaults to true.

Since the numerical optimization routine can sometimes visit extreme regions of the parameter space during it's exploration we automatically handle the situation where the input
parameters result in errors when solving the dynamical system. This allows the algorithm to recover and continue in many situations that would have otherwise stalled early. Sometimes, it
is useful to turn this error handling off when debugging a model fit, and this can be done by setting `ignore_numerical_error = false`.

Below is how to run a Naive Pooled analysis:

naive_fit = fit(warfarin_pk_model, pop, param_vals, NaivePooled(); constantcoef = (:pk_Ω,))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     6.698433e+04     7.339547e+04
 * time: 0.06300187110900879
     1     1.608646e+04     2.976762e+04
 * time: 2.3654468059539795
     2     5.868548e+03     1.035750e+04
 * time: 2.368582010269165
     3     2.939718e+03     4.142057e+03
 * time: 2.3716118335723877
     4     1.911162e+03     2.672629e+03
 * time: 2.374521017074585
     5     1.258125e+03     1.666208e+03
 * time: 2.377405881881714
     6     8.872473e+02     1.004002e+03
 * time: 2.380250930786133
     7     6.854678e+02     5.653206e+02
 * time: 2.3832578659057617
     8     5.981939e+02     4.298111e+02
 * time: 2.3860819339752197
     9     5.697119e+02     3.298714e+02
 * time: 2.3888649940490723
    10     5.630895e+02     2.782014e+02
 * time: 2.391618013381958
    11     5.610542e+02     2.546132e+02
 * time: 2.3944058418273926
    12     5.583559e+02     2.306961e+02
 * time: 2.3971920013427734
    13     5.519849e+02     1.916256e+02
 * time: 2.4000489711761475
    14     5.385646e+02     1.280939e+02
 * time: 2.4029488563537598
    15     5.213242e+02     7.353540e+01
 * time: 2.405838966369629
    16     5.171523e+02     5.151365e+01
 * time: 2.408723831176758
    17     5.151262e+02     5.020215e+01
 * time: 2.411641836166382
    18     5.145245e+02     4.127464e+01
 * time: 2.4145379066467285
    19     5.144783e+02     4.250454e+01
 * time: 2.417534828186035
    20     5.144258e+02     4.182999e+01
 * time: 2.421085834503174
    21     5.141981e+02     3.496010e+01
 * time: 2.4257609844207764
    22     5.138854e+02     2.245596e+01
 * time: 2.4293408393859863
    23     5.135343e+02     1.780783e+01
 * time: 2.4326329231262207
    24     5.133723e+02     1.971138e+01
 * time: 2.4358808994293213
    25     5.133304e+02     2.298999e+01
 * time: 2.43913197517395
    26     5.133150e+02     2.217028e+01
 * time: 2.4423508644104004
    27     5.132848e+02     2.023817e+01
 * time: 2.445596933364868
    28     5.132093e+02     1.788705e+01
 * time: 2.448824882507324
    29     5.130231e+02     1.853999e+01
 * time: 2.4520599842071533
    30     5.126118e+02     1.979296e+01
 * time: 2.455293893814087
    31     5.119112e+02     2.368559e+01
 * time: 2.4585280418395996
    32     5.112562e+02     1.621839e+01
 * time: 2.4617810249328613
    33     5.110240e+02     5.757384e+00
 * time: 2.464829921722412
    34     5.109963e+02     4.298973e+00
 * time: 2.46807599067688
    35     5.109950e+02     4.288314e+00
 * time: 2.4713070392608643
    36     5.109943e+02     4.286157e+00
 * time: 2.47442889213562
    37     5.109913e+02     4.284737e+00
 * time: 2.477496862411499
    38     5.109845e+02     4.289628e+00
 * time: 2.480592966079712
    39     5.109654e+02     4.935513e+00
 * time: 2.483675003051758
    40     5.109155e+02     9.365998e+00
 * time: 2.486764907836914
    41     5.107757e+02     1.725988e+01
 * time: 2.490056037902832
    42     5.103371e+02     3.350312e+01
 * time: 2.493114948272705
    43     5.075433e+02     8.627240e+01
 * time: 2.496354818344116
    44     5.050172e+02     8.799170e+01
 * time: 2.4996488094329834
    45     4.962069e+02     9.509664e+01
 * time: 2.5038368701934814
    46     4.761013e+02     3.635622e+01
 * time: 2.5097458362579346
    47     4.755510e+02     1.838628e+02
 * time: 2.513279914855957
    48     4.667561e+02     3.725269e+01
 * time: 2.5165798664093018
    49     4.655530e+02     3.789792e+01
 * time: 2.519847869873047
    50     4.625171e+02     6.336759e+01
 * time: 2.5231118202209473
    51     4.601300e+02     9.460813e+01
 * time: 2.526366949081421
    52     4.564463e+02     2.519378e+01
 * time: 2.5295088291168213
    53     4.543494e+02     8.101728e+00
 * time: 2.532634973526001
    54     4.514042e+02     3.041037e+01
 * time: 2.535784959793091
    55     4.504885e+02     3.540669e+01
 * time: 2.5395588874816895
    56     4.484865e+02     4.508032e+01
 * time: 2.543351888656616
    57     4.480667e+02     4.948631e+01
 * time: 2.5471458435058594
    58     4.471799e+02     8.417188e+01
 * time: 2.5510239601135254
    59     4.461134e+02     4.529769e+01
 * time: 2.556380033493042
    60     4.452253e+02     7.551449e+00
 * time: 2.561663866043091
    61     4.447828e+02     1.762000e+01
 * time: 2.5670628547668457
    62     4.446123e+02     4.642462e+01
 * time: 2.572300910949707
    63     4.443932e+02     2.755456e+01
 * time: 2.577713966369629
    64     4.439630e+02     1.675038e+01
 * time: 2.5830068588256836
    65     4.434601e+02     1.160119e+02
 * time: 2.5877888202667236
    66     4.432802e+02     7.713017e+00
 * time: 2.5912039279937744
    67     4.428348e+02     7.635308e+00
 * time: 2.595047950744629
    68     4.424368e+02     8.110696e+01
 * time: 2.5985050201416016
    69     4.420727e+02     9.342020e+01
 * time: 2.6019489765167236
    70     4.403707e+02     1.405464e+02
 * time: 2.6055989265441895
    71     4.400903e+02     6.770072e+01
 * time: 2.6117918491363525
    72     4.398476e+02     2.660505e+02
 * time: 2.615924835205078
    73     4.394886e+02     9.746305e+01
 * time: 2.619365930557251
    74     4.392992e+02     3.934368e+02
 * time: 2.622999906539917
    75     4.390273e+02     6.220876e+01
 * time: 2.6264748573303223
    76     4.384455e+02     1.200034e+02
 * time: 2.6301510334014893
    77     4.381655e+02     7.536095e+02
 * time: 2.635066032409668
    78     4.376569e+02     2.716359e+03
 * time: 2.638692855834961
    79     4.371396e+02     1.427975e+02
 * time: 2.779093027114868
    80     4.365253e+02     2.291738e+01
 * time: 2.782742977142334
    81     4.362967e+02     3.650541e+02
 * time: 2.7888548374176025
    82     4.360851e+02     9.683767e+02
 * time: 2.793313980102539
    83     4.359698e+02     1.188451e+03
 * time: 2.7976009845733643
    84     4.356883e+02     1.529329e+03
 * time: 2.8018739223480225
    85     4.353129e+02     2.477125e+03
 * time: 2.806185007095337
    86     4.347641e+02     4.445794e+02
 * time: 2.8098440170288086
    87     4.346491e+02     3.346309e+03
 * time: 2.814093828201294
    88     4.340737e+02     5.621341e+03
 * time: 2.81768798828125
    89     4.330218e+02     5.253844e+03
 * time: 2.821431875228882
    90     4.329277e+02     1.466953e+04
 * time: 2.825824022293091
    91     4.321405e+02     1.988316e+03
 * time: 2.8294918537139893
    92     4.316377e+02     1.175474e+04
 * time: 2.833195924758911
    93     4.315243e+02     1.457175e+04
 * time: 2.8376200199127197
    94     4.313393e+02     1.434983e+04
 * time: 2.84212589263916
    95     4.311457e+02     1.466406e+04
 * time: 2.846637010574341
    96     4.309173e+02     1.397546e+04
 * time: 2.8513028621673584
    97     4.306791e+02     1.484925e+04
 * time: 2.8574960231781006
    98     4.303810e+02     1.223827e+04
 * time: 2.862381935119629
    99     4.300841e+02     2.606666e+04
 * time: 2.86742901802063
   100     4.296818e+02     3.263127e+04
 * time: 2.8725388050079346
   101     4.294220e+02     2.953363e+04
 * time: 2.877699851989746
   102     4.291854e+02     3.241563e+04
 * time: 2.8827898502349854
   103     4.288485e+02     2.199741e+04
 * time: 2.8882369995117188
   104     4.285033e+02     7.169887e+04
 * time: 2.893751859664917
   105     4.282675e+02     8.209757e+04
 * time: 2.8990888595581055
   106     4.278118e+02     6.496780e+04
 * time: 2.904326915740967
   107     4.276005e+02     7.412336e+04
 * time: 2.9094719886779785
   108     4.272582e+02     5.876273e+04
 * time: 2.9145798683166504
   109     4.269661e+02     9.679311e+04
 * time: 2.919595956802368
   110     4.265732e+02     1.131204e+05
 * time: 2.9247310161590576
   111     4.262809e+02     1.100267e+05
 * time: 2.929771900177002
   112     4.259916e+02     1.131859e+05
 * time: 2.9348440170288086
   113     4.256448e+02     9.590546e+04
 * time: 2.940081834793091
   114     4.252941e+02     2.497391e+05
 * time: 2.945194959640503
   115     4.247396e+02     3.667522e+05
 * time: 2.950361967086792
   116     4.244417e+02     3.177088e+05
 * time: 2.9575328826904297
   117     4.242009e+02     3.549387e+05
 * time: 2.9635260105133057
   118     4.238293e+02     2.317690e+05
 * time: 2.969058036804199
   119     4.234645e+02     5.444075e+05
 * time: 2.9746909141540527
   120     4.231832e+02     6.626479e+05
 * time: 2.9803709983825684
   121     4.229000e+02     6.248714e+05
 * time: 2.9857609272003174
   122     4.226437e+02     6.202141e+05
 * time: 2.990994930267334
   123     4.223476e+02     5.763202e+05
 * time: 2.9965310096740723
   124     4.220353e+02     6.722184e+05
 * time: 3.00197696685791
   125     4.215553e+02     3.868167e+05
 * time: 3.007438898086548
   126     4.211613e+02     2.917101e+06
 * time: 3.0128488540649414
   127     4.204823e+02     3.685050e+06
 * time: 3.0185089111328125
   128     4.187401e+02     6.951196e+06
 * time: 3.023772954940796
   129     4.174679e+02     5.788368e+06
 * time: 3.033639907836914
   130     4.171014e+02     1.511670e+07
 * time: 3.042224884033203
   131     4.166690e+02     2.943324e+06
 * time: 3.04762601852417
   132     4.157380e+02     4.686169e+05
 * time: 3.05302095413208
   133     4.154194e+02     9.348694e+06
 * time: 3.059499979019165
   134     4.152062e+02     1.421900e+07
 * time: 3.0659890174865723
   135     4.149796e+02     1.671737e+07
 * time: 3.072237968444824
   136     4.147019e+02     1.802229e+07
 * time: 3.0784168243408203
   137     4.144039e+02     1.847006e+07
 * time: 3.0855748653411865
   138     4.140881e+02     1.850972e+07
 * time: 3.0940539836883545
   139     4.137488e+02     1.861463e+07
 * time: 3.1003499031066895
   140     4.133741e+02     1.860858e+07
 * time: 3.1069040298461914
   141     4.129603e+02     1.907969e+07
 * time: 3.113184928894043
   142     4.124802e+02     1.741585e+07
 * time: 3.119328022003174
   143     4.119884e+02     4.218883e+07
 * time: 3.1252989768981934
   144     4.109091e+02     1.714223e+08
 * time: 3.1314330101013184
   145     4.104487e+02     1.441888e+08
 * time: 3.137526035308838
   146     4.096514e+02     1.763079e+08
 * time: 3.1436610221862793
   147     4.091312e+02     5.946881e+08
 * time: 3.148981809616089
   148     4.052097e+02     3.624908e+09
 * time: 3.1557059288024902
   149     4.046037e+02     5.109422e+08
 * time: 3.1651828289031982
   150     4.040337e+02     5.926690e+08
 * time: 3.1714208126068115
   151     4.032925e+02     1.529026e+08
 * time: 3.1771860122680664
   152     4.029895e+02     1.133947e+09
 * time: 3.188459873199463
   153     4.028606e+02     2.150461e+09
 * time: 3.1940479278564453
   154     4.023712e+02     2.847341e+09
 * time: 3.19952392578125
   155     4.013024e+02     3.636985e+09
 * time: 3.2060868740081787
   156     3.988076e+02     9.859466e+09
 * time: 3.211897850036621
   157     3.976552e+02     1.714441e+10
 * time: 3.2188289165496826
   158     3.955430e+02     2.256291e+10
 * time: 3.225735902786255
   159     3.940546e+02     1.193260e+10
 * time: 3.2344160079956055
   160     3.937247e+02     9.073932e+09
 * time: 3.2455058097839355
   161     3.936886e+02     4.858347e+10
 * time: 3.251490831375122
   162     3.929759e+02     1.467714e+10
 * time: 3.2573838233947754
   163     3.917377e+02     6.528655e+10
 * time: 3.263216018676758
   164     3.911035e+02     8.289130e+10
 * time: 3.2699899673461914
   165     3.888147e+02     1.098190e+12
 * time: 3.2760119438171387
   166     3.862471e+02     8.637045e+10
 * time: 3.2829949855804443
   167     3.847153e+02     6.632016e+10
 * time: 3.290367841720581
   168     3.847153e+02     6.632016e+10
 * time: 3.317539930343628
   169     3.847153e+02     6.632016e+10
 * time: 3.34230899810791
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:              NoObjectiveChange
Log-likelihood value:                   -384.71534

------------------------
             Estimate
------------------------
  pop_CL     0.12659
  pop_Vc     8.7367
  pop_tabs   1.4476e-11
  pop_lag    1.0
† pk_Ω₁,₁    0.01
† pk_Ω₂,₂    0.01
† pk_Ω₃,₃    0.01
  σ_prop     0.25974
  σ_add      2.8554e-9
------------------------
† indicates constant parameters

As you can see, the constantcoef argument is used to fix the inter-individual variability parameters to zero. The parameter estimates from the naive pooled fit seem out of place for some parameters such as pop_tabs. Perhaps the initial parameter values are not close to the true values and some adjustments are needed. It is also possible that the data cannot support a model with lags and first order absorption if some subjects very quickly have large concentration values or we really need inter-individual variability to fit the data. In Pumas, you can rerun the fit with new initial parameter values very easily. Suppose, we want to run the naive pooled fit again with 4 new initial parameter values that are within 10% of the original values.

We can use the power of the Julia language to generate 4 new parameter values that are within 10% of the original values.

# Generate parameter sets by scaling original values by ±10%
param_sets = [
    NamedTuple(k => v * scale for (k, v) in pairs(param_vals)) for
    scale in [0.9, 0.95, 1.05, 1.1]
]
4-element Vector{@NamedTuple{pop_CL::Float64, pop_Vc::Float64, pop_tabs::Float64, pop_lag::Float64, pk_Ω::Diagonal{Float64, Vector{Float64}}, σ_prop::Float64, σ_add::Float64}}:
 (pop_CL = 0.12060000000000001, pop_Vc = 7.2989999999999995, pop_tabs = 0.4707, pop_lag = 0.09000000000000001, pk_Ω = [0.009000000000000001 0.0 0.0; 0.0 0.009000000000000001 0.0; 0.0 0.0 0.009000000000000001], σ_prop = 0.006768, σ_add = 0.05949000000000001)
 (pop_CL = 0.1273, pop_Vc = 7.7044999999999995, pop_tabs = 0.49685, pop_lag = 0.095, pk_Ω = [0.0095 0.0 0.0; 0.0 0.0095 0.0; 0.0 0.0 0.0095], σ_prop = 0.007143999999999999, σ_add = 0.062795)
 (pop_CL = 0.14070000000000002, pop_Vc = 8.5155, pop_tabs = 0.54915, pop_lag = 0.10500000000000001, pk_Ω = [0.0105 0.0 0.0; 0.0 0.0105 0.0; 0.0 0.0 0.0105], σ_prop = 0.007896, σ_add = 0.06940500000000001)
 (pop_CL = 0.14740000000000003, pop_Vc = 8.921, pop_tabs = 0.5753, pop_lag = 0.11000000000000001, pk_Ω = [0.011000000000000001 0.0 0.0; 0.0 0.011000000000000001 0.0; 0.0 0.0 0.011000000000000001], σ_prop = 0.008272, σ_add = 0.07271000000000001)

Now, we can run the fit again with the new parameter values.

new_fits = [
    fit(warfarin_pk_model, pop, param_set, NaivePooled(), constantcoef = (:pk_Ω,)) for
    param_set in param_sets
];

You can compare the results of the new fits with perturbed initial parameter values.

compare_estimates(;
    p1 = new_fits[1],
    p2 = new_fits[2],
    p3 = new_fits[3],
    p4 = new_fits[4],
    original = naive_fit,
)
9×6 DataFrame
Row parameter p1 p2 p3 p4 original
String Float64? Float64? Float64? Float64? Float64?
1 pop_CL 0.123625 0.192774 0.123403 0.0235632 0.126594
2 pop_Vc 9.28085 16.1102 9.13101 49.454 8.73673
3 pop_tabs 4.10644e-12 0.60404 0.972573 0.154961 1.44759e-11
4 pop_lag 1.0 0.706171 0.5 0.648555 1.0
5 pk_Ω₁,₁ 0.009 0.0095 0.0105 0.011 0.01
6 pk_Ω₂,₂ 0.009 0.0095 0.0105 0.011 0.01
7 pk_Ω₃,₃ 0.009 0.0095 0.0105 0.011 0.01
8 σ_prop 0.293987 0.778574 0.283517 2.15052 0.259744
9 σ_add 1.25063e-85 1.6711e-173 4.76955e-12 1.57173e-162 2.8554e-9

The compare_estimates function is a convenience function that is part of the PumasUtilities package. It is used to compare the estimates of the new fits with the original fit. The p3 results seem to be reasonable amongst all the fits and these results can be taken forward for the FOCE() fit.

2.6 Estimating Parameters via Maximum Likelihood (FOCE)

After obtaining initial estimates, the next step is fitting the model to data using a maximum likelihood approach. One of the most commonly used methods in Pumas is the First-Order Conditional Estimation (FOCE) method. This is invoked by specifying FOCE() as the estimation algorithm using the initial parameter values specified in the model definition:

foce_fit = fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE())

The init_params function is a convenience function that extracts the initial parameter values from the model definition.

Note

By default, the fit function will use all the cores of your machine to parallelize the optimization. You can control this using the ensemblealg argument. For more details, see the fit options section in the Pumas Documentation.

2.7 Fixing Parameters with constantcoef

Sometimes, it is desirable to hold certain parameters constant during estimation. For instance, suppose one wants to fix tabs at a particular value:

foce_fit_fixedtabs = fit(
    warfarin_pk_model,
    pop,
    (; param_vals..., pop_tabs = 0.5),
    FOCE();
    constantcoef = (:pop_tabs,),
);   # This fixes the pop_tabs parameter
foce_fit_fixedtabs
FittedPumasModel

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                      NoXChange
Log-likelihood value:                   -350.54228

----------------------
             Estimate
----------------------
  pop_CL     0.13455
  pop_Vc     8.0517
† pop_tabs   0.5
  pop_lag    0.8751
  pk_Ω₁,₁    0.07066
  pk_Ω₂,₂    0.01826
  pk_Ω₃,₃    0.95054
  σ_prop     0.089942
  σ_add      0.39217
----------------------
† indicates constant parameters

This approach is useful for sensitivity analysis, exploring different values, or simplifying the model.

2.8 Changing Tolerance During Fitting

The fit function accepts additional keyword arguments to tweak the fitting process, such as tolerances for convergence:

foce_fit_fixedtabs = fit(
    warfarin_pk_model,
    pop,
    (; param_vals..., pop_tabs = 0.5),
    FOCE();
    constantcoef = (:pop_tabs,),
    optim_options = (; x_reltol = 1e-6, x_abstol = 1e-8, iterations = 200),
);

These arguments (x_reltol, x_abstol, iterations) help fine-tune the solver’s stopping criteria. For more details, see the Optimization Options during Fitting.

2.9 Interpreting the Printed Results

Once the model fit is complete, Pumas will print a summary of the fit to the console. This typically includes:

  1. Objective Function Value (OFV): The minimized negative log-likelihood (or similar objective). Lower is better.
  2. Convergence Status: An indication of whether the solver found a local optimum or if it ran into issues (e.g., maximum iterations reached).
  3. Parameter Estimates: A table listing final estimates of fixed effects, random effects (variances or standard deviations), and residual variability components.
  4. Standard Errors (if computed): By default, these might not be computed until using infer (discussed in the next section).

The printed output might look like:

julia> foce_fit_fixedtabs
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                    -320.2235
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  1              8
Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

-----------------------
              Estimate
-----------------------
pop_CL        0.13091
pop_Vc        8.08
pop_tabs      0.5
pop_lag       0.91842
pk_Ω₁,₁       0.050929
pk_Ω₂,₂       0.019161
pk_Ω₃,₃       1.056
σ_prop        0.090974
σ_add         0.33189
-----------------------

We can break down the output of the fit function into the following components:

2.9.1 FittedPumasModel

FittedPumasModel

The FittedPumasModel object contains the following fields:

  • model: The original model definition
foce_fit_fixedtabs.model
PumasModel
  Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
  Random effects: pk_η
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Matrix exponential
  Derived: conc
  Observed: conc
  • data: The population data
foce_fit_fixedtabs.data
Population
  Subjects: 32
  Covariates: sex, wtbl, FSZV, FSZCL
  Observations: conc
  • optim: The optimization results
foce_fit_fixedtabs.optim
 * Status: success

 * Candidate solution
    Final objective value:     3.505425e+02

 * Found with
    Algorithm:     BFGS

 * Convergence measures
    |x - x'|               = 1.38e-06 ≰ 1.0e-08
    |x - x'|/|x'|          = 3.46e-07 ≤ 1.0e-06
    |f(x) - f(x')|         = 5.58e-07 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.59e-09 ≰ 0.0e+00
    |g(x)|                 = 6.41e-02 ≰ 1.0e-03

 * Work counters
    Seconds run:   6  (vs limit Inf)
    Iterations:    74
    f(x) calls:    82
    ∇f(x) calls:   75
  • approx: The likelihood approximation method
foce_fit_fixedtabs.approx
FOCE
  • kwargs: Additional keyword arguments
foce_fit_fixedtabs.kwargs
  • fixedparamset: The fixed parameter set
foce_fit_fixedtabs.fixedparamset
  • optim_state: The optimization state
foce_fit_fixedtabs.optim_state

2.9.2 Minimization Status

Successful minimization:                      true

The Successful minimization field indicates whether the minimization was successful. It depends on the convergence of the optimizer and rules set in the optim_options argument, for example x_reltol and x_abstol.

2.9.3 Likelihood Approximation

  Likelihood approximation:                     FOCE

The Likelihood approximation field indicates the likelihood approximation method used.

2.9.4 Likelihood Optimizer

  Likelihood Optimizer:                         BFGS

The Likelihood Optimizer field indicates the optimizer used which by default is BFGS from the Optim package. A user can change the optimizer by setting the optim_alg argument in the fit function.

2.9.5 Dynamical system type

  Dynamical system type:          Matrix exponential

The Dynamical system type field indicates the type of dynamical system used. In the example here, even though the model is a differential equation model, the Matrix exponential indicates that the model is solved using a matrix exponential solver as the system has been deemed linear. The user can check the linear nature of the system as follows: foce_fit.model.prob which results in this case to Pumas.LinearODE(). If the user does not want the automatic check for linearity, they can turn it off in the @options block of the model definition.

@model begin
    @param begin
        ...
    end
    @random begin
        ...
    end
    @dynamics begin
        ...
    end
    @pre begin
        ...
    end
    @options begin
        checklinear = false
    end
end

The checklinear option is a boolean that determines whether the solver should check if the system defined in the @dynamics block is linear. If the system is linear, setting this option to true (default) enables calculating the solution through matrix exponentials. If it is set to false or time (t) appears in the @pre block, this optimization is disabled. This option can be useful when the matrix exponential solver is not superior to general numerical integrators or for debugging purposes.

2.9.6 Log-likelihood value

  Log-likelihood value:                    -320.2235

The Log-likelihood value field indicates the value of the log-likelihood function at the maximum likelihood estimate.

2.9.7 Number of subjects

  Number of subjects:                             31

The Number of subjects field indicates the number of subjects from the population data used in the fit.

2.9.8 Number of parameters

  Number of parameters:         Fixed      Optimized
                                  1              8

The Number of parameters field indicates the number of fixed and optimized parameters. The Fixed column indicates the number of parameters that were fixed using the constantcoef argument during the fit and the Optimized column indicates the number of parameters that were optimized.

2.9.9 Observation records

  Observation records:         Active        Missing
    conc:                       239             47
    Total:                      239             47

The Observation records field indicates the number of active and missing observations in the population data. The Active column indicates the number of observations that were used in the fit and the Missing column indicates the number of observations that were missing and not used in the fit. missing in this case is due to the missingness of the conc observations, which is the collection of all the conc observations from all the subjects.

2.9.10 Parameter estimates

-----------------------
              Estimate
-----------------------
pop_CL        0.13091
pop_Vc        8.08
pop_tabs      0.5
pop_lag       0.91842
pk_Ω₁,₁       0.050929
pk_Ω₂,₂       0.019161
pk_Ω₃,₃       1.056
σ_prop        0.090974
σ_add         0.33189
-----------------------

The Parameter estimates field indicates the final parameter estimates. The Estimate column indicates the maximum likelihood estimates of the parameter.

2.10 Computing Parameter Precision with infer

The infer function in Pumas estimates the uncertainty (precision) of parameter estimates. Depending on the chosen method, infer can provide standard errors, confidence intervals, and correlation matrices.

The signature for infer often looks like:

infer(
    fpm::FittedPumasModel;
    level = 0.95,
    rethrow_error::Bool = false,
    sandwich_estimator::Bool = true,
)

where:

  • fpm::FittedPumasModel: The result of fit (e.g., foce_fit).
  • level: The confidence interval level (e.g., 0.95). The confidence intervals are calculated as the (1-level)/2 and (1+level)/2 quantiles of the estimated parameters
  • rethrow_error: If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.
  • sandwich_estimator: Whether to use the sandwich estimator. If set to true (the default value), the sandwich estimator will be used. If set to false, the standard error will be calculated using the inverse of the Hessian matrix.

An example usage:

inference_results = infer(foce_fit_fixedtabs; level = 0.95)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   SmallXChange
Log-likelihood value:                   -350.54254

----------------------------------------------------------
             Estimate   SE          95.0% C.I.
----------------------------------------------------------
  pop_CL     0.13455    0.0065675   [ 0.12168  ; 0.14742]
  pop_Vc     8.0517     0.21972     [ 7.6211   ; 8.4824 ]
† pop_tabs   0.5        NaN         [ NaN      ; NaN    ]
  pop_lag    0.87498    0.051763    [ 0.77353  ; 0.97644]
  pk_Ω₁,₁    0.070958   0.024781    [ 0.022387 ; 0.11953]
  pk_Ω₂,₂    0.01827    0.0051531   [ 0.0081698; 0.02837]
  pk_Ω₃,₃    0.94384    0.38455     [ 0.19013  ; 1.6975 ]
  σ_prop     0.089942   0.014455    [ 0.061611 ; 0.11827]
  σ_add      0.39221    0.066849    [ 0.26119  ; 0.52323]
----------------------------------------------------------
† indicates constant parameters

We can use the sandwich_estimator argument to get a more robust estimate of the standard errors.

inference_results = infer(foce_fit_fixedtabs; level = 0.95, sandwich_estimator = false)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using negative inverse Hessian

Dynamical system type:          Matrix exponential

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  1              8

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   SmallXChange
Log-likelihood value:                   -350.54254

-----------------------------------------------------------
             Estimate   SE          95.0% C.I.
-----------------------------------------------------------
  pop_CL     0.13455    0.0064882   [ 0.12183  ; 0.14727 ]
  pop_Vc     8.0517     0.22744     [ 7.606    ; 8.4975  ]
† pop_tabs   0.5        NaN         [ NaN      ; NaN     ]
  pop_lag    0.87498    0.038082    [ 0.80034  ; 0.94962 ]
  pk_Ω₁,₁    0.070958   0.018798    [ 0.034114 ; 0.1078  ]
  pk_Ω₂,₂    0.01827    0.0062287   [ 0.0060618; 0.030478]
  pk_Ω₃,₃    0.94384    0.4181      [ 0.12438  ; 1.7633  ]
  σ_prop     0.089942   0.0089122   [ 0.072474 ; 0.10741 ]
  σ_add      0.39221    0.046786    [ 0.30051  ; 0.48391 ]
-----------------------------------------------------------
† indicates constant parameters

This result above indicates that both with the sandwich estimator and the inverse of the Hessian matrix on the fixed parameter, the inference worked. We will pursue other methods to obtain parameter precision in later tutorials, such as bootstrap and SIR.

3 Concluding Remarks

This tutorial showcased a typical Pumas workflow for PK model fitting using a Warfarin dataset:

  1. Model Definition and Data Preparation.
  2. Checking Compatibility via loglikelihood and findinfluential.
  3. Initial Parameter Estimation using Naive Pooled.
  4. Fitting with FOCE, demonstrating how to fix parameters or change solver tolerances.
  5. Interpreting Fit Results, exploring the components of a fittedpumasmodel.
  6. Computing Precision of estimates with infer using different methods.

Readers are encouraged to refine their understanding by:

  • Performing thorough Exploratory Data Analysis prior to modeling.
  • Exploring Bootstrap or SIR methods for deeper uncertainty quantification (covered in later tutorials).
  • Validating the final model through visual diagnostics (e.g., VPCs, GOF plots).

More advanced topics can be found throughout the Pumas Documentation and Pumas Tutorials.

Tip

Real-world workflows are iterative. Analysts often revisit model assumptions, re-check data, and explore alternative parameterizations before finalizing any one model as “best.”