using Pumas
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics
using PumasUtilities
using Markdown
Fitting and Inferring a Compartmental PK Model
1 Introduction
A typical workflow for fitting a Pumas model and deriving parameter precision typically involves:
- Preparing the data and the model.
- Checking model-data compatibility.
- Obtaining initial parameter estimates.
- Fitting the model via a chosen estimation method.
- Interpreting the fit results.
- Computing parameter precision.
- (Optionally) proceeding with more advanced techniques like bootstrapping or SIR for robust uncertainty quantification.
The following sections will walk through these steps using a one-compartment PK model for Warfarin, focusing on the PK aspects only. Exploratory data analysis (EDA), although extremely important, is out of scope here. Readers interested in EDA are encouraged to consult other tutorials.
2 Model and Data
2.1 Model Definition
Below is the PK model, named warfarin_pk_model
, defined in Pumas. This model contains:
- Fixed effects (population parameters):
pop_CL, pop_Vc, pop_tabs, pop_lag
- Inter-individual variability (IIV) components:
pk_Ω
- Residual error model parameters:
σ_prop, σ_add
- Covariates for scaling:
FSZCL
andFSZV
- Differential equations describing the PK behavior in the compartments.
= @model begin
warfarin_pk_model
@metadata begin
= "Warfarin 1-compartment PK model (PD removed)"
desc = u"hr"
timeu end
@param begin
# PK parameters
"""
Clearance (L/hr)
"""
∈ RealDomain(lower = 0.0, init = 0.134)
pop_CL """
Central volume (L)
"""
∈ RealDomain(lower = 0.0, init = 8.11)
pop_Vc """
Absorption lag time (hr)
"""
∈ RealDomain(lower = 0.0, init = 0.523)
pop_tabs """
Lag time (hr)
"""
∈ RealDomain(lower = 0.0, init = 0.1)
pop_lag
# Inter-individual variability
"""
- ΩCL: Clearance
- ΩVc: Central volume
- Ωtabs: Absorption lag time
"""
∈ PDiagDomain([0.01, 0.01, 0.01])
pk_Ω # Residual variability
"""
σ_prop: Proportional error
"""
∈ RealDomain(lower = 0.0, init = 0.00752)
σ_prop """
σ_add: Additive error
"""
∈ RealDomain(lower = 0.0, init = 0.0661)
σ_add end
@random begin
~ MvNormal(pk_Ω) # mean = 0, covariance = pk_Ω
pk_η end
@covariates begin
"""
FSZCL: Clearance scaling factor
"""
FSZCL"""
FSZV: Volume scaling factor
"""
FSZVend
@pre begin
= FSZCL * pop_CL * exp(pk_η[1])
CL = FSZV * pop_Vc * exp(pk_η[2])
Vc
= pop_tabs * exp(pk_η[3])
tabs = log(2) / tabs
Ka end
@dosecontrol begin
= (Depot = pop_lag,)
lags end
@vars begin
:= Central / Vc
cp end
@dynamics begin
' = -Ka * Depot
Depot' = Ka * Depot - (CL / Vc) * Central
Centralend
@derived begin
"""
Concentration (ng/mL)
"""
~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
conc end
end
PumasModel
Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
Random effects: pk_η
Covariates: FSZCL, FSZV
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: conc
Observed: conc
2.2 Data Preparation
The Warfarin data used in this tutorial is pulled from PharmaDatasets
for demonstration purposes. Note how the code reshapes and prepares the data in “wide” format for reading into Pumas. Only the conc
column is treated as observations for the PK model.
= dataset("pumas/warfarin_nonmem")
warfarin_data
# Step 2: Fix Duplicate Time Points
# -------------------------------
# Some subjects have duplicate time points for dvid = 1
# For this dataset, the triple (id, time, dvid) should define
# a row uniquely, but
nrow(warfarin_data)
nrow(unique(warfarin_data, ["id", "time", "dvid"]))
# We can identify the problematic rows by grouping on the index variables
@chain warfarin_data begin
@groupby :id :time :dvid
@transform :tmp = length(:id)
@rsubset :tmp > 1
end
# Transform the data in a single chain of operations
= @chain warfarin_data begin
warfarin_data_wide @rtransform begin
# Scaling factors
:FSZV = :wtbl / 70 # volume scaling
:FSZCL = (:wtbl / 70)^0.75 # clearance scaling (allometric)
# Column name for the dv
:dvname = "dv$(:dvid)"
# Dosing indicator columns
:cmt = ismissing(:amt) ? missing : 1
:evid = iszero(:amt) ? 0 : 1
end
unstack(Not([:dvid, :dvname, :dv]), :dvname, :dv)
rename!(:dv1 => :conc, :dv2 => :pca)
select!(Not([:dv0]))
end
Row | id | time | evid | amt | wtbl | age | sex | FSZV | FSZCL | cmt | conc | pca |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Int64 | Float64 | Int64 | Float64 | Float64 | Int64 | String1 | Float64 | Float64 | Int64 | Float64? | Float64? | |
1 | 1 | 0.0 | 1 | 100.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | missing |
2 | 1 | 0.5 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 0.0 | missing |
3 | 1 | 1.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 1.9 | missing |
4 | 1 | 2.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 3.3 | missing |
5 | 1 | 3.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 6.6 | missing |
6 | 1 | 6.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 9.1 | missing |
7 | 1 | 9.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 10.8 | missing |
8 | 1 | 12.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 8.6 | missing |
9 | 1 | 24.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 5.6 | 44.0 |
10 | 1 | 36.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 4.0 | 27.0 |
11 | 1 | 48.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 2.7 | 28.0 |
12 | 1 | 72.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 0.8 | 31.0 |
13 | 1 | 96.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | 60.0 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
319 | 32 | 48.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 6.9 | 24.0 |
320 | 32 | 72.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 4.4 | 23.0 |
321 | 32 | 96.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 3.5 | 20.0 |
322 | 32 | 120.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 2.5 | 22.0 |
323 | 33 | 0.0 | 1 | 100.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | missing |
324 | 33 | 0.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | 100.0 |
325 | 33 | 24.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 9.2 | 49.0 |
326 | 33 | 36.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 8.5 | 32.0 |
327 | 33 | 48.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 6.4 | 26.0 |
328 | 33 | 72.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 4.8 | 22.0 |
329 | 33 | 96.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 3.1 | 28.0 |
330 | 33 | 120.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 2.5 | 33.0 |
2.3 Creating a Pumas Population
Below is the creation of a population object in Pumas using read_pumas
. Only the conc
data are treated as the observation variable:
= read_pumas(
pop
warfarin_data_wide;= :id,
id = :time,
time = :amt,
amt = :cmt,
cmt = :evid,
evid = [:sex, :wtbl, :FSZV, :FSZCL],
covariates = [:conc],
observations )
Population
Subjects: 32
Covariates: sex, wtbl, FSZV, FSZCL
Observations: conc
The same data can contain multiple endpoints or PD observations. In this tutorial, the focus is solely on PK fitting. PKPD modeling on this warfarin dataset will be introduced later.
2.4 Checking Model-Data Compatibility
Before performing any fit, it is recommended to verify whether the defined model is consistent with the provided dataset. Pumas offers functions such as loglikelihood
and findinfluential
for these checks.
2.4.1 The loglikelihood
Function
The loglikelihood
function computes the log-likelihood of the model given data and parameters. In Pumas, the signature typically looks like:
loglikelihood(model, population, params, approx)
where:
model
: The Pumas model definition (e.g.,warfarin_pk_model
).population
: A Pumas population object (e.g.,pop
).params
: A named tuple or dictionary containing parameter values.approx
: The approximation method to use. Common options includeFOCE()
,FO()
,LaplaceI()
, etc.
For example, one might write:
# A named tuple of parameter values
= (
param_vals = 0.134,
pop_CL = 8.11,
pop_Vc = 0.523,
pop_tabs = 0.1,
pop_lag = Diagonal([0.01, 0.01, 0.01]),
pk_Ω = 0.00752,
σ_prop = 0.0661,
σ_add
)
= loglikelihood(warfarin_pk_model, pop, param_vals, FOCE()) ll_value
-12090.636664027372
The initial loglikelihood of the warfarin PK model given the data and parameter values is `LL = -12090.636664027372
`.
If the model and data are incompatible (e.g., missing doses for compartments, or out-of-range parameter values), loglikelihood
might return an error or produce a warning. A successful computation is a good sign that the model can handle the data.
2.4.2 The findinfluential
Function
The findinfluential
function helps identify observations that disproportionately influence the fit. It can be used before or after fitting, but even with initial guesses, it can highlight potentially problematic data points. The most notable case is when the loglikelihood cannot be evaluated in which case the data returned for the problematic subject does not return a finite value.
Typical usage is:
findinfluential(warfarin_pk_model, pop, params; approx)
This will list the individual loglikelihoods for each subject. Readers can then inspect these observations further, potentially re-checking the data quality or adjusting the model assumptions if needed for each subject.
= findinfluential(warfarin_pk_model, pop, param_vals, FOCE()) fi
32-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "12", nll = 3179.162620631005)
(id = "14", nll = 2113.146717564112)
(id = "13", nll = 1627.4380743700433)
(id = "9", nll = 1050.6701195571484)
(id = "2", nll = 780.9109417396234)
(id = "1", nll = 699.9267530243993)
(id = "7", nll = 512.7488860179473)
(id = "11", nll = 470.0423353858021)
(id = "6", nll = 311.9666456226985)
(id = "3", nll = 192.83907815907168)
⋮
(id = "16", nll = 22.434033734725986)
(id = "5", nll = 18.475981181576064)
(id = "32", nll = 17.87557264312157)
(id = "17", nll = 17.341505598982042)
(id = "26", nll = 8.191625356662307)
(id = "24", nll = 6.4001549755272755)
(id = "31", nll = 6.363630970170725)
(id = "23", nll = 5.48597914877157)
(id = "29", nll = 4.343343958293616)
The default output is a vector of NamedTuples, which can easily be converted into a DataFrame for further analysis. One can then plot the distribution of the individual loglikelihoods to identify any potential problematic subjects.
= DataFrame(fi)
fidf hist(
fidf.nll,= (xlabel = "Log-likelihood", ylabel = "Frequency"),
axis = (:blue, 0.5),
color )
The plot above shows that there are some subjects with very high initial log-likelihoods, which might be worth investigating. Notice, that a high initial log-likelihood does not necessarily imply a model or data problem. It can simply be that for the chosen initial population parameters the model does not fit the observations well for the subject in question.
2.5 Getting Good Initial Estimates Using Naive Pooled
A reliable starting point for nonlinear mixed-effects modeling is to get reasonable initial parameter estimates. Pumas provides approaches such as:
- Naive Pooled: Treats all data as if there were no inter-individual variability.
- NCA (Non-compartmental Analysis): Uses PK metrics (CL, V, etc.) from a non-compartmental approach to seed the parameter values (not showcased in this tutorial).
Here, the Naive Pooled approach is demonstrated. This method estimates population parameters by ignoring inter-individual differences, effectively pooling data as if from a single “super-subject.”
Below is the signature and documentation for the Pumas fit
function. For more details, see the Fitting in Pumas Documentation.
fit(
model::PumasModel,
population::Population,
param::NamedTuple,
approx::Union{LikelihoodApproximation, MAP};
optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
optim_options::NamedTuple = NamedTuple(),
optimize_fn = nothing,
constantcoef::Tuple = (),
ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
checkidentification = true,
diffeq_options = NamedTuple(),
verbose = true,
ignore_numerical_error = true,
)
Fit the Pumas model model to the dataset population with starting values param using the estimation method approx. Currently supported values for the `approx` argument are `FO`, `FOCE`,
`LaplaceI`, `NaivePooled`, and `BayesMCMC`. See the online documentation for more details about the different methods.
To control the optimization procedure for finding population parameters (fixed effects), use the `optim_alg` and `optim_options` keyword arguments. In previous versions of Pumas the argument
`optimize_fn` was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization of all `approx` methods except BayesMCMC. The default
optimization function uses the quasi-Newton routine BFGS method from the `Optim` package. It can be changed by setting the `optim_alg` to an algorithm implemented in Optim.jl as long as it
does not use second order derivatives. Optimization specific options can be passed in using the `optim_options` keyword and has to be a `NamedTuple` with names and values that match the
`Optim.Options` type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing `optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)`
to fit. See [Optim](https://docs.pumas.ai/stable/basics/estimation/#Optimization-option) for more defails.
It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the `constantcoef` argument with elements corresponding to the names of the fixed parameters, e.g.
`constantcoef=(:σ,)`.
When models include an `@random` block and fitting with NaivePooled is requested, it is required that the user sets all variability parameters to zero with `constantcoef` such that these can
be ignored in the optimization, e.g. `constantcoef = (:Ω,)` while overwriting the corresponding values in `param` with `(; init_params..., Ω = zeros(3, 3))`.
Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:
• `EnsembleThreads()`: the default. Accelerate fits by using multiple threads.
• `EnsembleSerial()`: fit using a single thread.
• `EnsembleDistributed()`: fit by using multiple worker processes.
The `fit` function will check if any gradients and throw an exception if any of the elements are exactly zero unless `checkidentification` is set to false.
Further keyword arguments can be passed via the `diffeq_options` argument. This allows for passing arguments to the differential equations solver such as `alg`, `abstol`, and `reltol`. The
default values for these are `AutoVern7(Rodas5P(autodiff=true)), 1e-12, and 1e-8` respectively. See the [documentation](https://docs.pumas.ai/stable/basics/simulation/#Options-and-settings-for-simobs) for more details.
The keyword `verbose` controls if info statements about initial evaluations of the loglikelihood function and gradient should be printed or not. Defaults to true.
Since the numerical optimization routine can sometimes visit extreme regions of the parameter space during it's exploration we automatically handle the situation where the input
parameters result in errors when solving the dynamical system. This allows the algorithm to recover and continue in many situations that would have otherwise stalled early. Sometimes, it
is useful to turn this error handling off when debugging a model fit, and this can be done by setting `ignore_numerical_error = false`.
Below is how to run a Naive Pooled analysis:
= fit(warfarin_pk_model, pop, param_vals, NaivePooled(); constantcoef = (:pk_Ω,)) naive_fit
[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 6.698433e+04 7.339547e+04 * time: 0.06300187110900879 1 1.608646e+04 2.976762e+04 * time: 2.3654468059539795 2 5.868548e+03 1.035750e+04 * time: 2.368582010269165 3 2.939718e+03 4.142057e+03 * time: 2.3716118335723877 4 1.911162e+03 2.672629e+03 * time: 2.374521017074585 5 1.258125e+03 1.666208e+03 * time: 2.377405881881714 6 8.872473e+02 1.004002e+03 * time: 2.380250930786133 7 6.854678e+02 5.653206e+02 * time: 2.3832578659057617 8 5.981939e+02 4.298111e+02 * time: 2.3860819339752197 9 5.697119e+02 3.298714e+02 * time: 2.3888649940490723 10 5.630895e+02 2.782014e+02 * time: 2.391618013381958 11 5.610542e+02 2.546132e+02 * time: 2.3944058418273926 12 5.583559e+02 2.306961e+02 * time: 2.3971920013427734 13 5.519849e+02 1.916256e+02 * time: 2.4000489711761475 14 5.385646e+02 1.280939e+02 * time: 2.4029488563537598 15 5.213242e+02 7.353540e+01 * time: 2.405838966369629 16 5.171523e+02 5.151365e+01 * time: 2.408723831176758 17 5.151262e+02 5.020215e+01 * time: 2.411641836166382 18 5.145245e+02 4.127464e+01 * time: 2.4145379066467285 19 5.144783e+02 4.250454e+01 * time: 2.417534828186035 20 5.144258e+02 4.182999e+01 * time: 2.421085834503174 21 5.141981e+02 3.496010e+01 * time: 2.4257609844207764 22 5.138854e+02 2.245596e+01 * time: 2.4293408393859863 23 5.135343e+02 1.780783e+01 * time: 2.4326329231262207 24 5.133723e+02 1.971138e+01 * time: 2.4358808994293213 25 5.133304e+02 2.298999e+01 * time: 2.43913197517395 26 5.133150e+02 2.217028e+01 * time: 2.4423508644104004 27 5.132848e+02 2.023817e+01 * time: 2.445596933364868 28 5.132093e+02 1.788705e+01 * time: 2.448824882507324 29 5.130231e+02 1.853999e+01 * time: 2.4520599842071533 30 5.126118e+02 1.979296e+01 * time: 2.455293893814087 31 5.119112e+02 2.368559e+01 * time: 2.4585280418395996 32 5.112562e+02 1.621839e+01 * time: 2.4617810249328613 33 5.110240e+02 5.757384e+00 * time: 2.464829921722412 34 5.109963e+02 4.298973e+00 * time: 2.46807599067688 35 5.109950e+02 4.288314e+00 * time: 2.4713070392608643 36 5.109943e+02 4.286157e+00 * time: 2.47442889213562 37 5.109913e+02 4.284737e+00 * time: 2.477496862411499 38 5.109845e+02 4.289628e+00 * time: 2.480592966079712 39 5.109654e+02 4.935513e+00 * time: 2.483675003051758 40 5.109155e+02 9.365998e+00 * time: 2.486764907836914 41 5.107757e+02 1.725988e+01 * time: 2.490056037902832 42 5.103371e+02 3.350312e+01 * time: 2.493114948272705 43 5.075433e+02 8.627240e+01 * time: 2.496354818344116 44 5.050172e+02 8.799170e+01 * time: 2.4996488094329834 45 4.962069e+02 9.509664e+01 * time: 2.5038368701934814 46 4.761013e+02 3.635622e+01 * time: 2.5097458362579346 47 4.755510e+02 1.838628e+02 * time: 2.513279914855957 48 4.667561e+02 3.725269e+01 * time: 2.5165798664093018 49 4.655530e+02 3.789792e+01 * time: 2.519847869873047 50 4.625171e+02 6.336759e+01 * time: 2.5231118202209473 51 4.601300e+02 9.460813e+01 * time: 2.526366949081421 52 4.564463e+02 2.519378e+01 * time: 2.5295088291168213 53 4.543494e+02 8.101728e+00 * time: 2.532634973526001 54 4.514042e+02 3.041037e+01 * time: 2.535784959793091 55 4.504885e+02 3.540669e+01 * time: 2.5395588874816895 56 4.484865e+02 4.508032e+01 * time: 2.543351888656616 57 4.480667e+02 4.948631e+01 * time: 2.5471458435058594 58 4.471799e+02 8.417188e+01 * time: 2.5510239601135254 59 4.461134e+02 4.529769e+01 * time: 2.556380033493042 60 4.452253e+02 7.551449e+00 * time: 2.561663866043091 61 4.447828e+02 1.762000e+01 * time: 2.5670628547668457 62 4.446123e+02 4.642462e+01 * time: 2.572300910949707 63 4.443932e+02 2.755456e+01 * time: 2.577713966369629 64 4.439630e+02 1.675038e+01 * time: 2.5830068588256836 65 4.434601e+02 1.160119e+02 * time: 2.5877888202667236 66 4.432802e+02 7.713017e+00 * time: 2.5912039279937744 67 4.428348e+02 7.635308e+00 * time: 2.595047950744629 68 4.424368e+02 8.110696e+01 * time: 2.5985050201416016 69 4.420727e+02 9.342020e+01 * time: 2.6019489765167236 70 4.403707e+02 1.405464e+02 * time: 2.6055989265441895 71 4.400903e+02 6.770072e+01 * time: 2.6117918491363525 72 4.398476e+02 2.660505e+02 * time: 2.615924835205078 73 4.394886e+02 9.746305e+01 * time: 2.619365930557251 74 4.392992e+02 3.934368e+02 * time: 2.622999906539917 75 4.390273e+02 6.220876e+01 * time: 2.6264748573303223 76 4.384455e+02 1.200034e+02 * time: 2.6301510334014893 77 4.381655e+02 7.536095e+02 * time: 2.635066032409668 78 4.376569e+02 2.716359e+03 * time: 2.638692855834961 79 4.371396e+02 1.427975e+02 * time: 2.779093027114868 80 4.365253e+02 2.291738e+01 * time: 2.782742977142334 81 4.362967e+02 3.650541e+02 * time: 2.7888548374176025 82 4.360851e+02 9.683767e+02 * time: 2.793313980102539 83 4.359698e+02 1.188451e+03 * time: 2.7976009845733643 84 4.356883e+02 1.529329e+03 * time: 2.8018739223480225 85 4.353129e+02 2.477125e+03 * time: 2.806185007095337 86 4.347641e+02 4.445794e+02 * time: 2.8098440170288086 87 4.346491e+02 3.346309e+03 * time: 2.814093828201294 88 4.340737e+02 5.621341e+03 * time: 2.81768798828125 89 4.330218e+02 5.253844e+03 * time: 2.821431875228882 90 4.329277e+02 1.466953e+04 * time: 2.825824022293091 91 4.321405e+02 1.988316e+03 * time: 2.8294918537139893 92 4.316377e+02 1.175474e+04 * time: 2.833195924758911 93 4.315243e+02 1.457175e+04 * time: 2.8376200199127197 94 4.313393e+02 1.434983e+04 * time: 2.84212589263916 95 4.311457e+02 1.466406e+04 * time: 2.846637010574341 96 4.309173e+02 1.397546e+04 * time: 2.8513028621673584 97 4.306791e+02 1.484925e+04 * time: 2.8574960231781006 98 4.303810e+02 1.223827e+04 * time: 2.862381935119629 99 4.300841e+02 2.606666e+04 * time: 2.86742901802063 100 4.296818e+02 3.263127e+04 * time: 2.8725388050079346 101 4.294220e+02 2.953363e+04 * time: 2.877699851989746 102 4.291854e+02 3.241563e+04 * time: 2.8827898502349854 103 4.288485e+02 2.199741e+04 * time: 2.8882369995117188 104 4.285033e+02 7.169887e+04 * time: 2.893751859664917 105 4.282675e+02 8.209757e+04 * time: 2.8990888595581055 106 4.278118e+02 6.496780e+04 * time: 2.904326915740967 107 4.276005e+02 7.412336e+04 * time: 2.9094719886779785 108 4.272582e+02 5.876273e+04 * time: 2.9145798683166504 109 4.269661e+02 9.679311e+04 * time: 2.919595956802368 110 4.265732e+02 1.131204e+05 * time: 2.9247310161590576 111 4.262809e+02 1.100267e+05 * time: 2.929771900177002 112 4.259916e+02 1.131859e+05 * time: 2.9348440170288086 113 4.256448e+02 9.590546e+04 * time: 2.940081834793091 114 4.252941e+02 2.497391e+05 * time: 2.945194959640503 115 4.247396e+02 3.667522e+05 * time: 2.950361967086792 116 4.244417e+02 3.177088e+05 * time: 2.9575328826904297 117 4.242009e+02 3.549387e+05 * time: 2.9635260105133057 118 4.238293e+02 2.317690e+05 * time: 2.969058036804199 119 4.234645e+02 5.444075e+05 * time: 2.9746909141540527 120 4.231832e+02 6.626479e+05 * time: 2.9803709983825684 121 4.229000e+02 6.248714e+05 * time: 2.9857609272003174 122 4.226437e+02 6.202141e+05 * time: 2.990994930267334 123 4.223476e+02 5.763202e+05 * time: 2.9965310096740723 124 4.220353e+02 6.722184e+05 * time: 3.00197696685791 125 4.215553e+02 3.868167e+05 * time: 3.007438898086548 126 4.211613e+02 2.917101e+06 * time: 3.0128488540649414 127 4.204823e+02 3.685050e+06 * time: 3.0185089111328125 128 4.187401e+02 6.951196e+06 * time: 3.023772954940796 129 4.174679e+02 5.788368e+06 * time: 3.033639907836914 130 4.171014e+02 1.511670e+07 * time: 3.042224884033203 131 4.166690e+02 2.943324e+06 * time: 3.04762601852417 132 4.157380e+02 4.686169e+05 * time: 3.05302095413208 133 4.154194e+02 9.348694e+06 * time: 3.059499979019165 134 4.152062e+02 1.421900e+07 * time: 3.0659890174865723 135 4.149796e+02 1.671737e+07 * time: 3.072237968444824 136 4.147019e+02 1.802229e+07 * time: 3.0784168243408203 137 4.144039e+02 1.847006e+07 * time: 3.0855748653411865 138 4.140881e+02 1.850972e+07 * time: 3.0940539836883545 139 4.137488e+02 1.861463e+07 * time: 3.1003499031066895 140 4.133741e+02 1.860858e+07 * time: 3.1069040298461914 141 4.129603e+02 1.907969e+07 * time: 3.113184928894043 142 4.124802e+02 1.741585e+07 * time: 3.119328022003174 143 4.119884e+02 4.218883e+07 * time: 3.1252989768981934 144 4.109091e+02 1.714223e+08 * time: 3.1314330101013184 145 4.104487e+02 1.441888e+08 * time: 3.137526035308838 146 4.096514e+02 1.763079e+08 * time: 3.1436610221862793 147 4.091312e+02 5.946881e+08 * time: 3.148981809616089 148 4.052097e+02 3.624908e+09 * time: 3.1557059288024902 149 4.046037e+02 5.109422e+08 * time: 3.1651828289031982 150 4.040337e+02 5.926690e+08 * time: 3.1714208126068115 151 4.032925e+02 1.529026e+08 * time: 3.1771860122680664 152 4.029895e+02 1.133947e+09 * time: 3.188459873199463 153 4.028606e+02 2.150461e+09 * time: 3.1940479278564453 154 4.023712e+02 2.847341e+09 * time: 3.19952392578125 155 4.013024e+02 3.636985e+09 * time: 3.2060868740081787 156 3.988076e+02 9.859466e+09 * time: 3.211897850036621 157 3.976552e+02 1.714441e+10 * time: 3.2188289165496826 158 3.955430e+02 2.256291e+10 * time: 3.225735902786255 159 3.940546e+02 1.193260e+10 * time: 3.2344160079956055 160 3.937247e+02 9.073932e+09 * time: 3.2455058097839355 161 3.936886e+02 4.858347e+10 * time: 3.251490831375122 162 3.929759e+02 1.467714e+10 * time: 3.2573838233947754 163 3.917377e+02 6.528655e+10 * time: 3.263216018676758 164 3.911035e+02 8.289130e+10 * time: 3.2699899673461914 165 3.888147e+02 1.098190e+12 * time: 3.2760119438171387 166 3.862471e+02 8.637045e+10 * time: 3.2829949855804443 167 3.847153e+02 6.632016e+10 * time: 3.290367841720581 168 3.847153e+02 6.632016e+10 * time: 3.317539930343628 169 3.847153e+02 6.632016e+10 * time: 3.34230899810791
FittedPumasModel
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: -384.71534
------------------------
Estimate
------------------------
pop_CL 0.12659
pop_Vc 8.7367
pop_tabs 1.4476e-11
pop_lag 1.0
† pk_Ω₁,₁ 0.01
† pk_Ω₂,₂ 0.01
† pk_Ω₃,₃ 0.01
σ_prop 0.25974
σ_add 2.8554e-9
------------------------
† indicates constant parameters
As you can see, the constantcoef
argument is used to fix the inter-individual variability parameters to zero. The parameter estimates from the naive pooled fit seem out of place for some parameters such as pop_tabs
. Perhaps the initial parameter values are not close to the true values and some adjustments are needed. It is also possible that the data cannot support a model with lags and first order absorption if some subjects very quickly have large concentration values or we really need inter-individual variability to fit the data. In Pumas, you can rerun the fit with new initial parameter values very easily. Suppose, we want to run the naive pooled fit again with 4 new initial parameter values that are within 10% of the original values.
We can use the power of the Julia language to generate 4 new parameter values that are within 10% of the original values.
# Generate parameter sets by scaling original values by ±10%
= [
param_sets NamedTuple(k => v * scale for (k, v) in pairs(param_vals)) for
in [0.9, 0.95, 1.05, 1.1]
scale ]
4-element Vector{@NamedTuple{pop_CL::Float64, pop_Vc::Float64, pop_tabs::Float64, pop_lag::Float64, pk_Ω::Diagonal{Float64, Vector{Float64}}, σ_prop::Float64, σ_add::Float64}}:
(pop_CL = 0.12060000000000001, pop_Vc = 7.2989999999999995, pop_tabs = 0.4707, pop_lag = 0.09000000000000001, pk_Ω = [0.009000000000000001 0.0 0.0; 0.0 0.009000000000000001 0.0; 0.0 0.0 0.009000000000000001], σ_prop = 0.006768, σ_add = 0.05949000000000001)
(pop_CL = 0.1273, pop_Vc = 7.7044999999999995, pop_tabs = 0.49685, pop_lag = 0.095, pk_Ω = [0.0095 0.0 0.0; 0.0 0.0095 0.0; 0.0 0.0 0.0095], σ_prop = 0.007143999999999999, σ_add = 0.062795)
(pop_CL = 0.14070000000000002, pop_Vc = 8.5155, pop_tabs = 0.54915, pop_lag = 0.10500000000000001, pk_Ω = [0.0105 0.0 0.0; 0.0 0.0105 0.0; 0.0 0.0 0.0105], σ_prop = 0.007896, σ_add = 0.06940500000000001)
(pop_CL = 0.14740000000000003, pop_Vc = 8.921, pop_tabs = 0.5753, pop_lag = 0.11000000000000001, pk_Ω = [0.011000000000000001 0.0 0.0; 0.0 0.011000000000000001 0.0; 0.0 0.0 0.011000000000000001], σ_prop = 0.008272, σ_add = 0.07271000000000001)
Now, we can run the fit again with the new parameter values.
= [
new_fits fit(warfarin_pk_model, pop, param_set, NaivePooled(), constantcoef = (:pk_Ω,)) for
in param_sets
param_set ];
You can compare the results of the new fits with perturbed initial parameter values.
compare_estimates(;
= new_fits[1],
p1 = new_fits[2],
p2 = new_fits[3],
p3 = new_fits[4],
p4 = naive_fit,
original )
Row | parameter | p1 | p2 | p3 | p4 | original |
---|---|---|---|---|---|---|
String | Float64? | Float64? | Float64? | Float64? | Float64? | |
1 | pop_CL | 0.123625 | 0.192774 | 0.123403 | 0.0235632 | 0.126594 |
2 | pop_Vc | 9.28085 | 16.1102 | 9.13101 | 49.454 | 8.73673 |
3 | pop_tabs | 4.10644e-12 | 0.60404 | 0.972573 | 0.154961 | 1.44759e-11 |
4 | pop_lag | 1.0 | 0.706171 | 0.5 | 0.648555 | 1.0 |
5 | pk_Ω₁,₁ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
6 | pk_Ω₂,₂ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
7 | pk_Ω₃,₃ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
8 | σ_prop | 0.293987 | 0.778574 | 0.283517 | 2.15052 | 0.259744 |
9 | σ_add | 1.25063e-85 | 1.6711e-173 | 4.76955e-12 | 1.57173e-162 | 2.8554e-9 |
The compare_estimates
function is a convenience function that is part of the PumasUtilities
package. It is used to compare the estimates of the new fits with the original fit. The p3
results seem to be reasonable amongst all the fits and these results can be taken forward for the FOCE()
fit.
2.6 Estimating Parameters via Maximum Likelihood (FOCE)
After obtaining initial estimates, the next step is fitting the model to data using a maximum likelihood approach. One of the most commonly used methods in Pumas is the First-Order Conditional Estimation (FOCE) method. This is invoked by specifying FOCE()
as the estimation algorithm using the initial parameter values specified in the model definition:
= fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE()) foce_fit
The init_params
function is a convenience function that extracts the initial parameter values from the model definition.
By default, the fit
function will use all the cores of your machine to parallelize the optimization. You can control this using the ensemblealg
argument. For more details, see the fit options section in the Pumas Documentation.
2.7 Fixing Parameters with constantcoef
Sometimes, it is desirable to hold certain parameters constant during estimation. For instance, suppose one wants to fix tabs
at a particular value:
= fit(
foce_fit_fixedtabs
warfarin_pk_model,
pop,..., pop_tabs = 0.5),
(; param_valsFOCE();
= (:pop_tabs,),
constantcoef # This fixes the pop_tabs parameter );
foce_fit_fixedtabs
FittedPumasModel
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoXChange
Log-likelihood value: -350.54228
----------------------
Estimate
----------------------
pop_CL 0.13455
pop_Vc 8.0517
† pop_tabs 0.5
pop_lag 0.8751
pk_Ω₁,₁ 0.07066
pk_Ω₂,₂ 0.01826
pk_Ω₃,₃ 0.95054
σ_prop 0.089942
σ_add 0.39217
----------------------
† indicates constant parameters
This approach is useful for sensitivity analysis, exploring different values, or simplifying the model.
2.8 Changing Tolerance During Fitting
The fit
function accepts additional keyword arguments to tweak the fitting process, such as tolerances for convergence:
= fit(
foce_fit_fixedtabs
warfarin_pk_model,
pop,..., pop_tabs = 0.5),
(; param_valsFOCE();
= (:pop_tabs,),
constantcoef = (; x_reltol = 1e-6, x_abstol = 1e-8, iterations = 200),
optim_options );
These arguments (x_reltol
, x_abstol
, iterations
) help fine-tune the solver’s stopping criteria. For more details, see the Optimization Options during Fitting.
2.9 Interpreting the Printed Results
Once the model fit is complete, Pumas will print a summary of the fit to the console. This typically includes:
- Objective Function Value (OFV): The minimized negative log-likelihood (or similar objective). Lower is better.
- Convergence Status: An indication of whether the solver found a local optimum or if it ran into issues (e.g., maximum iterations reached).
- Parameter Estimates: A table listing final estimates of fixed effects, random effects (variances or standard deviations), and residual variability components.
- Standard Errors (if computed): By default, these might not be computed until using
infer
(discussed in the next section).
The printed output might look like:
julia> foce_fit_fixedtabs
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Matrix exponential
Log-likelihood value: -320.2235
Number of subjects: 31
Number of parameters: Fixed Optimized
1 8
Observation records: Active Missing
conc: 239 47
Total: 239 47
-----------------------
Estimate
-----------------------
pop_CL 0.13091
pop_Vc 8.08
pop_tabs 0.5
pop_lag 0.91842
pk_Ω₁,₁ 0.050929
pk_Ω₂,₂ 0.019161
pk_Ω₃,₃ 1.056
σ_prop 0.090974
σ_add 0.33189
-----------------------
We can break down the output of the fit
function into the following components:
2.9.1 FittedPumasModel
FittedPumasModel
The FittedPumasModel
object contains the following fields:
model
: The original model definition
foce_fit_fixedtabs.model
PumasModel
Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
Random effects: pk_η
Covariates: FSZCL, FSZV
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: conc
Observed: conc
data
: The population data
foce_fit_fixedtabs.data
Population
Subjects: 32
Covariates: sex, wtbl, FSZV, FSZCL
Observations: conc
optim
: The optimization results
foce_fit_fixedtabs.optim
* Status: success
* Candidate solution
Final objective value: 3.505425e+02
* Found with
Algorithm: BFGS
* Convergence measures
|x - x'| = 1.38e-06 ≰ 1.0e-08
|x - x'|/|x'| = 3.46e-07 ≤ 1.0e-06
|f(x) - f(x')| = 5.58e-07 ≰ 0.0e+00
|f(x) - f(x')|/|f(x')| = 1.59e-09 ≰ 0.0e+00
|g(x)| = 6.41e-02 ≰ 1.0e-03
* Work counters
Seconds run: 6 (vs limit Inf)
Iterations: 74
f(x) calls: 82
∇f(x) calls: 75
approx
: The likelihood approximation method
foce_fit_fixedtabs.approx
FOCE
kwargs
: Additional keyword arguments
foce_fit_fixedtabs.kwargs
fixedparamset
: The fixed parameter set
foce_fit_fixedtabs.fixedparamset
optim_state
: The optimization state
foce_fit_fixedtabs.optim_state
2.9.2 Minimization Status
Successful minimization: true
The Successful minimization
field indicates whether the minimization was successful. It depends on the convergence of the optimizer and rules set in the optim_options
argument, for example x_reltol
and x_abstol
.
2.9.3 Likelihood Approximation
Likelihood approximation: FOCE
The Likelihood approximation
field indicates the likelihood approximation method used.
2.9.4 Likelihood Optimizer
Likelihood Optimizer: BFGS
The Likelihood Optimizer
field indicates the optimizer used which by default is BFGS
from the Optim
package. A user can change the optimizer by setting the optim_alg
argument in the fit
function.
2.9.5 Dynamical system type
Dynamical system type: Matrix exponential
The Dynamical system type
field indicates the type of dynamical system used. In the example here, even though the model is a differential equation model, the Matrix exponential
indicates that the model is solved using a matrix exponential solver as the system has been deemed linear. The user can check the linear
nature of the system as follows: foce_fit.model.prob
which results in this case to Pumas.LinearODE(). If the user does not want the automatic check for linearity, they can turn it off in the @options
block of the model definition.
@model begin
@param begin
...
end
@random begin
...
end
@dynamics begin
...
end
@pre begin
...
end
@options begin
= false
checklinear end
end
The checklinear
option is a boolean that determines whether the solver should check if the system defined in the @dynamics
block is linear. If the system is linear, setting this option to true
(default) enables calculating the solution through matrix exponentials. If it is set to false
or time (t) appears in the @pre
block, this optimization is disabled. This option can be useful when the matrix exponential solver is not superior to general numerical integrators or for debugging purposes.
2.9.6 Log-likelihood value
Log-likelihood value: -320.2235
The Log-likelihood value
field indicates the value of the log-likelihood function at the maximum likelihood estimate.
2.9.7 Number of subjects
Number of subjects: 31
The Number of subjects
field indicates the number of subjects from the population data used in the fit.
2.9.8 Number of parameters
Number of parameters: Fixed Optimized
1 8
The Number of parameters
field indicates the number of fixed and optimized parameters. The Fixed
column indicates the number of parameters that were fixed using the constantcoef
argument during the fit and the Optimized
column indicates the number of parameters that were optimized.
2.9.9 Observation records
Observation records: Active Missing
conc: 239 47
Total: 239 47
The Observation records
field indicates the number of active and missing observations in the population data. The Active
column indicates the number of observations that were used in the fit and the Missing
column indicates the number of observations that were missing
and not used in the fit. missing
in this case is due to the missingness of the conc
observations, which is the collection of all the conc
observations from all the subjects.
2.9.10 Parameter estimates
-----------------------
Estimate
-----------------------
pop_CL 0.13091
pop_Vc 8.08
pop_tabs 0.5
pop_lag 0.91842
pk_Ω₁,₁ 0.050929
pk_Ω₂,₂ 0.019161
pk_Ω₃,₃ 1.056
σ_prop 0.090974
σ_add 0.33189
-----------------------
The Parameter estimates
field indicates the final parameter estimates. The Estimate
column indicates the maximum likelihood estimates of the parameter.
2.10 Computing Parameter Precision with infer
The infer
function in Pumas estimates the uncertainty (precision) of parameter estimates. Depending on the chosen method, infer
can provide standard errors, confidence intervals, and correlation matrices.
The signature for infer
often looks like:
infer(
::FittedPumasModel;
fpm= 0.95,
level ::Bool = false,
rethrow_error::Bool = true,
sandwich_estimator )
where:
fpm::FittedPumasModel
: The result offit
(e.g.,foce_fit
).level
: The confidence interval level (e.g., 0.95). The confidence intervals are calculated as the(1-level)/2
and(1+level)/2
quantiles of the estimated parametersrethrow_error
: If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.sandwich_estimator
: Whether to use the sandwich estimator. If set totrue
(the default value), the sandwich estimator will be used. If set tofalse
, the standard error will be calculated using the inverse of the Hessian matrix.
An example usage:
= infer(foce_fit_fixedtabs; level = 0.95) inference_results
[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using sandwich estimator
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: SmallXChange
Log-likelihood value: -350.54254
----------------------------------------------------------
Estimate SE 95.0% C.I.
----------------------------------------------------------
pop_CL 0.13455 0.0065675 [ 0.12168 ; 0.14742]
pop_Vc 8.0517 0.21972 [ 7.6211 ; 8.4824 ]
† pop_tabs 0.5 NaN [ NaN ; NaN ]
pop_lag 0.87498 0.051763 [ 0.77353 ; 0.97644]
pk_Ω₁,₁ 0.070958 0.024781 [ 0.022387 ; 0.11953]
pk_Ω₂,₂ 0.01827 0.0051531 [ 0.0081698; 0.02837]
pk_Ω₃,₃ 0.94384 0.38455 [ 0.19013 ; 1.6975 ]
σ_prop 0.089942 0.014455 [ 0.061611 ; 0.11827]
σ_add 0.39221 0.066849 [ 0.26119 ; 0.52323]
----------------------------------------------------------
† indicates constant parameters
We can use the sandwich_estimator
argument to get a more robust estimate of the standard errors.
= infer(foce_fit_fixedtabs; level = 0.95, sandwich_estimator = false) inference_results
[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using negative inverse Hessian
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: SmallXChange
Log-likelihood value: -350.54254
-----------------------------------------------------------
Estimate SE 95.0% C.I.
-----------------------------------------------------------
pop_CL 0.13455 0.0064882 [ 0.12183 ; 0.14727 ]
pop_Vc 8.0517 0.22744 [ 7.606 ; 8.4975 ]
† pop_tabs 0.5 NaN [ NaN ; NaN ]
pop_lag 0.87498 0.038082 [ 0.80034 ; 0.94962 ]
pk_Ω₁,₁ 0.070958 0.018798 [ 0.034114 ; 0.1078 ]
pk_Ω₂,₂ 0.01827 0.0062287 [ 0.0060618; 0.030478]
pk_Ω₃,₃ 0.94384 0.4181 [ 0.12438 ; 1.7633 ]
σ_prop 0.089942 0.0089122 [ 0.072474 ; 0.10741 ]
σ_add 0.39221 0.046786 [ 0.30051 ; 0.48391 ]
-----------------------------------------------------------
† indicates constant parameters
This result above indicates that both with the sandwich estimator and the inverse of the Hessian matrix on the fixed parameter, the inference worked. We will pursue other methods to obtain parameter precision in later tutorials, such as bootstrap and SIR.
3 Concluding Remarks
This tutorial showcased a typical Pumas workflow for PK model fitting using a Warfarin dataset:
- Model Definition and Data Preparation.
- Checking Compatibility via
loglikelihood
andfindinfluential
. - Initial Parameter Estimation using Naive Pooled.
- Fitting with FOCE, demonstrating how to fix parameters or change solver tolerances.
- Interpreting Fit Results, exploring the components of a
fittedpumasmodel
. - Computing Precision of estimates with
infer
using different methods.
Readers are encouraged to refine their understanding by:
- Performing thorough Exploratory Data Analysis prior to modeling.
- Exploring Bootstrap or SIR methods for deeper uncertainty quantification (covered in later tutorials).
- Validating the final model through visual diagnostics (e.g., VPCs, GOF plots).
More advanced topics can be found throughout the Pumas Documentation and Pumas Tutorials.
Real-world workflows are iterative. Analysts often revisit model assumptions, re-check data, and explore alternative parameterizations before finalizing any one model as “best.”