using Pumas
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics
using PumasUtilities
using MarkdownFitting and Inferring a Compartmental PK Model
1 Introduction
A typical workflow for fitting a Pumas model and deriving parameter precision typically involves:
- Preparing the data and the model.
- Checking model-data compatibility.
- Obtaining initial parameter estimates.
- Fitting the model via a chosen estimation method.
- Interpreting the fit results.
- Computing parameter precision.
- (Optionally) proceeding with more advanced techniques like bootstrapping or SIR for robust uncertainty quantification.
The following sections will walk through these steps using a one-compartment PK model for Warfarin, focusing on the PK aspects only. Exploratory data analysis (EDA), although extremely important, is out of scope here. Readers interested in EDA are encouraged to consult other tutorials.
2 Model and Data
2.1 Model Definition
Below is the PK model, named warfarin_pk_model, defined in Pumas. This model contains:
- Fixed effects (population parameters):
pop_CL, pop_Vc, pop_tabs, pop_lag - Inter-individual variability (IIV) components:
pk_Ω - Residual error model parameters:
σ_prop, σ_add - Covariates for scaling:
FSZCLandFSZV - Differential equations describing the PK behavior in the compartments.
warfarin_pk_model = @model begin
@metadata begin
desc = "Warfarin 1-compartment PK model (PD removed)"
timeu = u"hr"
end
@param begin
# PK parameters
"""
Clearance (L/hr)
"""
pop_CL ∈ RealDomain(lower = 0.0, init = 0.134)
"""
Central volume (L)
"""
pop_Vc ∈ RealDomain(lower = 0.0, init = 8.11)
"""
Absorption lag time (hr)
"""
pop_tabs ∈ RealDomain(lower = 0.0, init = 0.523)
"""
Lag time (hr)
"""
pop_lag ∈ RealDomain(lower = 0.0, init = 0.1)
# Inter-individual variability
"""
- ΩCL: Clearance
- ΩVc: Central volume
- Ωtabs: Absorption lag time
"""
pk_Ω ∈ PDiagDomain([0.01, 0.01, 0.01])
# Residual variability
"""
σ_prop: Proportional error
"""
σ_prop ∈ RealDomain(lower = 0.0, init = 0.00752)
"""
σ_add: Additive error
"""
σ_add ∈ RealDomain(lower = 0.0, init = 0.0661)
end
@random begin
pk_η ~ MvNormal(pk_Ω) # mean = 0, covariance = pk_Ω
end
@covariates begin
"""
FSZCL: Clearance scaling factor
"""
FSZCL
"""
FSZV: Volume scaling factor
"""
FSZV
end
@pre begin
CL = FSZCL * pop_CL * exp(pk_η[1])
Vc = FSZV * pop_Vc * exp(pk_η[2])
tabs = pop_tabs * exp(pk_η[3])
Ka = log(2) / tabs
end
@dosecontrol begin
lags = (Depot = pop_lag,)
end
@vars begin
cp := Central / Vc
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL / Vc) * Central
end
@derived begin
"""
Concentration (ng/mL)
"""
conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
end
endPumasModel
Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
Random effects: pk_η
Covariates: FSZCL, FSZV
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: conc
Observed: conc
2.2 Data Preparation
The Warfarin data used in this tutorial is pulled from PharmaDatasets for demonstration purposes. Note how the code reshapes and prepares the data in “wide” format for reading into Pumas. Only the conc column is treated as observations for the PK model.
warfarin_data = dataset("pumas/warfarin_nonmem")
# Step 2: Fix Duplicate Time Points
# -------------------------------
# Some subjects have duplicate time points for dvid = 1
# For this dataset, the triple (id, time, dvid) should define
# a row uniquely, but
nrow(warfarin_data)
nrow(unique(warfarin_data, ["id", "time", "dvid"]))
# We can identify the problematic rows by grouping on the index variables
@chain warfarin_data begin
@groupby :id :time :dvid
@transform :tmp = length(:id)
@rsubset :tmp > 1
end
# Transform the data in a single chain of operations
warfarin_data_wide = @chain warfarin_data begin
@rtransform begin
# Scaling factors
:FSZV = :wtbl / 70 # volume scaling
:FSZCL = (:wtbl / 70)^0.75 # clearance scaling (allometric)
# Column name for the dv
:dvname = "dv$(:dvid)"
# Dosing indicator columns
:cmt = ismissing(:amt) ? missing : 1
:evid = iszero(:amt) ? 0 : 1
end
unstack(Not([:dvid, :dvname, :dv]), :dvname, :dv)
rename!(:dv1 => :conc, :dv2 => :pca)
select!(Not([:dv0]))
end| Row | id | time | evid | amt | wtbl | age | sex | FSZV | FSZCL | cmt | conc | pca |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Int64 | Float64 | Int64 | Float64 | Float64 | Int64 | String1 | Float64 | Float64 | Int64 | Float64? | Float64? | |
| 1 | 1 | 0.0 | 1 | 100.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | missing |
| 2 | 1 | 0.5 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 0.0 | missing |
| 3 | 1 | 1.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 1.9 | missing |
| 4 | 1 | 2.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 3.3 | missing |
| 5 | 1 | 3.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 6.6 | missing |
| 6 | 1 | 6.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 9.1 | missing |
| 7 | 1 | 9.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 10.8 | missing |
| 8 | 1 | 12.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 8.6 | missing |
| 9 | 1 | 24.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 5.6 | 44.0 |
| 10 | 1 | 36.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 4.0 | 27.0 |
| 11 | 1 | 48.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 2.7 | 28.0 |
| 12 | 1 | 72.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 0.8 | 31.0 |
| 13 | 1 | 96.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | 60.0 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 319 | 32 | 48.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 6.9 | 24.0 |
| 320 | 32 | 72.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 4.4 | 23.0 |
| 321 | 32 | 96.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 3.5 | 20.0 |
| 322 | 32 | 120.0 | 0 | 0.0 | 62.0 | 21 | M | 0.885714 | 0.912999 | 1 | 2.5 | 22.0 |
| 323 | 33 | 0.0 | 1 | 100.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | missing |
| 324 | 33 | 0.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | missing | 100.0 |
| 325 | 33 | 24.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 9.2 | 49.0 |
| 326 | 33 | 36.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 8.5 | 32.0 |
| 327 | 33 | 48.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 6.4 | 26.0 |
| 328 | 33 | 72.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 4.8 | 22.0 |
| 329 | 33 | 96.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 3.1 | 28.0 |
| 330 | 33 | 120.0 | 0 | 0.0 | 66.7 | 50 | M | 0.952857 | 0.96443 | 1 | 2.5 | 33.0 |
2.3 Creating a Pumas Population
Below is the creation of a population object in Pumas using read_pumas. Only the conc data are treated as the observation variable:
pop = read_pumas(
warfarin_data_wide;
id = :id,
time = :time,
amt = :amt,
cmt = :cmt,
evid = :evid,
covariates = [:sex, :wtbl, :FSZV, :FSZCL],
observations = [:conc],
)Population
Subjects: 32
Covariates: sex, wtbl, FSZV, FSZCL
Observations: conc
The same data can contain multiple endpoints or PD observations. In this tutorial, the focus is solely on PK fitting. PKPD modeling on this warfarin dataset will be introduced later.
2.4 Checking Model-Data Compatibility
Before performing any fit, it is recommended to verify whether the defined model is consistent with the provided dataset. Pumas offers functions such as loglikelihood and findinfluential for these checks.
2.4.1 The loglikelihood Function
The loglikelihood function computes the log-likelihood of the model given data and parameters. In Pumas, the signature typically looks like:
loglikelihood(model, population, params, approx)where:
model: The Pumas model definition (e.g.,warfarin_pk_model).population: A Pumas population object (e.g.,pop).params: A named tuple or dictionary containing parameter values.approx: The approximation method to use. Common options includeFOCE(),FO(),LaplaceI(), etc.
For example, one might write:
# A named tuple of parameter values
param_vals = (
pop_CL = 0.134,
pop_Vc = 8.11,
pop_tabs = 0.523,
pop_lag = 0.1,
pk_Ω = Diagonal([0.01, 0.01, 0.01]),
σ_prop = 0.00752,
σ_add = 0.0661,
)
ll_value = loglikelihood(warfarin_pk_model, pop, param_vals, FOCE())-12090.636664027372
The initial loglikelihood of the warfarin PK model given the data and parameter values is `LL = -12090.636664027372 `.
If the model and data are incompatible (e.g., missing doses for compartments, or out-of-range parameter values), loglikelihood might return an error or produce a warning. A successful computation is a good sign that the model can handle the data.
2.4.2 The findinfluential Function
The findinfluential function helps identify observations that disproportionately influence the fit. It can be used before or after fitting, but even with initial guesses, it can highlight potentially problematic data points. The most notable case is when the loglikelihood cannot be evaluated in which case the data returned for the problematic subject does not return a finite value.
Typical usage is:
findinfluential(warfarin_pk_model, pop, params; approx)This will list the individual loglikelihoods for each subject. Readers can then inspect these observations further, potentially re-checking the data quality or adjusting the model assumptions if needed for each subject.
fi = findinfluential(warfarin_pk_model, pop, param_vals, FOCE())32-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "12", nll = 3179.162620631005)
(id = "14", nll = 2113.146717564112)
(id = "13", nll = 1627.4380743700433)
(id = "9", nll = 1050.6701195571484)
(id = "2", nll = 780.9109417396234)
(id = "1", nll = 699.9267530243993)
(id = "7", nll = 512.7488860179473)
(id = "11", nll = 470.0423353858021)
(id = "6", nll = 311.9666456226985)
(id = "3", nll = 192.83907815907168)
⋮
(id = "16", nll = 22.434033734725986)
(id = "5", nll = 18.475981181576064)
(id = "32", nll = 17.87557264312157)
(id = "17", nll = 17.341505598982042)
(id = "26", nll = 8.191625356662307)
(id = "24", nll = 6.4001549755272755)
(id = "31", nll = 6.363630970170725)
(id = "23", nll = 5.48597914877157)
(id = "29", nll = 4.343343958293616)
The default output is a vector of NamedTuples, which can easily be converted into a DataFrame for further analysis. One can then plot the distribution of the individual loglikelihoods to identify any potential problematic subjects.
fidf = DataFrame(fi)
hist(
fidf.nll,
axis = (xlabel = "Log-likelihood", ylabel = "Frequency"),
color = (:blue, 0.5),
)The plot above shows that there are some subjects with very high initial log-likelihoods, which might be worth investigating. Notice, that a high initial log-likelihood does not necessarily imply a model or data problem. It can simply be that for the chosen initial population parameters the model does not fit the observations well for the subject in question.
2.5 Getting Good Initial Estimates Using Naive Pooled
A reliable starting point for nonlinear mixed-effects modeling is to get reasonable initial parameter estimates. Pumas provides approaches such as:
- Naive Pooled: Treats all data as if there were no inter-individual variability.
- NCA (Non-compartmental Analysis): Uses PK metrics (CL, V, etc.) from a non-compartmental approach to seed the parameter values (not showcased in this tutorial).
Here, the Naive Pooled approach is demonstrated. This method estimates population parameters by ignoring inter-individual differences, effectively pooling data as if from a single “super-subject.”
Below is the signature and documentation for the Pumas fit function. For more details, see the Fitting in Pumas Documentation.
fit(
model::PumasModel,
population::Population,
param::NamedTuple,
approx::Union{LikelihoodApproximation, MAP};
optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
optim_options::NamedTuple = NamedTuple(),
optimize_fn = nothing,
constantcoef::Tuple = (),
ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
checkidentification = true,
diffeq_options = NamedTuple(),
verbose = true,
ignore_numerical_error = true,
)
Fit the Pumas model model to the dataset population with starting values param using the estimation method approx. Currently supported values for the `approx` argument are `FO`, `FOCE`,
`LaplaceI`, `NaivePooled`, and `BayesMCMC`. See the online documentation for more details about the different methods.
To control the optimization procedure for finding population parameters (fixed effects), use the `optim_alg` and `optim_options` keyword arguments. In previous versions of Pumas the argument
`optimize_fn` was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization of all `approx` methods except BayesMCMC. The default
optimization function uses the quasi-Newton routine BFGS method from the `Optim` package. It can be changed by setting the `optim_alg` to an algorithm implemented in Optim.jl as long as it
does not use second order derivatives. Optimization specific options can be passed in using the `optim_options` keyword and has to be a `NamedTuple` with names and values that match the
`Optim.Options` type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing `optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)`
to fit. See [Optim](https://docs.pumas.ai/stable/basics/estimation/#Optimization-option) for more defails.
It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the `constantcoef` argument with elements corresponding to the names of the fixed parameters, e.g.
`constantcoef=(:σ,)`.
When models include an `@random` block and fitting with NaivePooled is requested, it is required that the user sets all variability parameters to zero with `constantcoef` such that these can
be ignored in the optimization, e.g. `constantcoef = (:Ω,)` while overwriting the corresponding values in `param` with `(; init_params..., Ω = zeros(3, 3))`.
Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:
• `EnsembleThreads()`: the default. Accelerate fits by using multiple threads.
• `EnsembleSerial()`: fit using a single thread.
• `EnsembleDistributed()`: fit by using multiple worker processes.
The `fit` function will check if any gradients and throw an exception if any of the elements are exactly zero unless `checkidentification` is set to false.
Further keyword arguments can be passed via the `diffeq_options` argument. This allows for passing arguments to the differential equations solver such as `alg`, `abstol`, and `reltol`.
See the [documentation](https://docs.pumas.ai/stable/basics/simulation/#Options-and-settings-for-simobs) for more details.
The keyword `verbose` controls if info statements about initial evaluations of the loglikelihood function and gradient should be printed or not. Defaults to true.
Since the numerical optimization routine can sometimes visit extreme regions of the parameter space during it's exploration we automatically handle the situation where the input
parameters result in errors when solving the dynamical system. This allows the algorithm to recover and continue in many situations that would have otherwise stalled early. Sometimes, it
is useful to turn this error handling off when debugging a model fit, and this can be done by setting `ignore_numerical_error = false`.
Below is how to run a Naive Pooled analysis:
naive_fit = fit(warfarin_pk_model, pop, param_vals, NaivePooled(); constantcoef = (:pk_Ω,))[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 6.698433e+04 7.339547e+04 * time: 0.06017494201660156 1 1.608646e+04 2.976762e+04 * time: 2.2393579483032227 2 5.868548e+03 1.035750e+04 * time: 2.244560956954956 3 2.939718e+03 4.142057e+03 * time: 2.250001907348633 4 1.911162e+03 2.672629e+03 * time: 2.2557849884033203 5 1.258125e+03 1.666208e+03 * time: 2.2610299587249756 6 8.872473e+02 1.004002e+03 * time: 2.266676902770996 7 6.854678e+02 5.653206e+02 * time: 2.2718749046325684 8 5.981939e+02 4.298111e+02 * time: 2.2770140171051025 9 5.697119e+02 3.298714e+02 * time: 2.282541036605835 10 5.630895e+02 2.782014e+02 * time: 2.288094997406006 11 5.610542e+02 2.546132e+02 * time: 2.293267011642456 12 5.583559e+02 2.306961e+02 * time: 2.2986090183258057 13 5.519849e+02 1.916256e+02 * time: 2.3038928508758545 14 5.385646e+02 1.280939e+02 * time: 2.308781862258911 15 5.213242e+02 7.353540e+01 * time: 2.313570976257324 16 5.171523e+02 5.151365e+01 * time: 2.3178958892822266 17 5.151262e+02 5.020215e+01 * time: 2.3218770027160645 18 5.145245e+02 4.127464e+01 * time: 2.325556993484497 19 5.144783e+02 4.250454e+01 * time: 2.328809976577759 20 5.144258e+02 4.182999e+01 * time: 2.3319668769836426 21 5.141981e+02 3.496010e+01 * time: 2.334972858428955 22 5.138854e+02 2.245596e+01 * time: 2.337779998779297 23 5.135343e+02 1.780783e+01 * time: 2.3405239582061768 24 5.133723e+02 1.971138e+01 * time: 2.3432159423828125 25 5.133304e+02 2.298999e+01 * time: 2.3457839488983154 26 5.133150e+02 2.217028e+01 * time: 2.3482730388641357 27 5.132848e+02 2.023817e+01 * time: 2.3507699966430664 28 5.132093e+02 1.788705e+01 * time: 2.3532509803771973 29 5.130231e+02 1.853999e+01 * time: 2.3556630611419678 30 5.126118e+02 1.979296e+01 * time: 2.358229875564575 31 5.119112e+02 2.368559e+01 * time: 2.483628034591675 32 5.112562e+02 1.621839e+01 * time: 2.4863829612731934 33 5.110240e+02 5.757384e+00 * time: 2.489081859588623 34 5.109963e+02 4.298973e+00 * time: 2.491896867752075 35 5.109950e+02 4.288314e+00 * time: 2.4945068359375 36 5.109943e+02 4.286157e+00 * time: 2.497144937515259 37 5.109913e+02 4.284737e+00 * time: 2.499763011932373 38 5.109845e+02 4.289628e+00 * time: 2.5023670196533203 39 5.109654e+02 4.935513e+00 * time: 2.505002975463867 40 5.109155e+02 9.365998e+00 * time: 2.507643938064575 41 5.107757e+02 1.725988e+01 * time: 2.5102880001068115 42 5.103371e+02 3.350312e+01 * time: 2.512946844100952 43 5.075433e+02 8.627240e+01 * time: 2.5156538486480713 44 5.050172e+02 8.799170e+01 * time: 2.5184099674224854 45 4.962069e+02 9.509664e+01 * time: 2.521340847015381 46 4.761013e+02 3.635622e+01 * time: 2.5245959758758545 47 4.755510e+02 1.838628e+02 * time: 2.5272159576416016 48 4.667561e+02 3.725269e+01 * time: 2.5300989151000977 49 4.655530e+02 3.789792e+01 * time: 2.532845973968506 50 4.625171e+02 6.336759e+01 * time: 2.5354208946228027 51 4.601300e+02 9.460813e+01 * time: 2.538033962249756 52 4.564463e+02 2.519378e+01 * time: 2.5407068729400635 53 4.543494e+02 8.101728e+00 * time: 2.5433130264282227 54 4.514042e+02 3.041037e+01 * time: 2.5459718704223633 55 4.504885e+02 3.540669e+01 * time: 2.549142837524414 56 4.484865e+02 4.508032e+01 * time: 2.5523550510406494 57 4.480667e+02 4.948631e+01 * time: 2.555572032928467 58 4.471799e+02 8.417188e+01 * time: 2.558276891708374 59 4.461134e+02 4.529769e+01 * time: 2.5609519481658936 60 4.452253e+02 7.551449e+00 * time: 2.5638089179992676 61 4.447828e+02 1.762000e+01 * time: 2.5665059089660645 62 4.446123e+02 4.642462e+01 * time: 2.5692059993743896 63 4.443932e+02 2.755456e+01 * time: 2.571923017501831 64 4.439630e+02 1.675038e+01 * time: 2.574863910675049 65 4.434601e+02 1.160119e+02 * time: 2.5778119564056396 66 4.432802e+02 7.713017e+00 * time: 2.5805599689483643 67 4.428348e+02 7.635308e+00 * time: 2.5833799839019775 68 4.424368e+02 8.110696e+01 * time: 2.5862159729003906 69 4.420727e+02 9.342020e+01 * time: 2.5890719890594482 70 4.403707e+02 1.405464e+02 * time: 2.5921130180358887 71 4.400903e+02 6.770072e+01 * time: 2.597245931625366 72 4.398476e+02 2.660505e+02 * time: 2.6008689403533936 73 4.394886e+02 9.746305e+01 * time: 2.6039068698883057 74 4.392992e+02 3.934368e+02 * time: 2.607301950454712 75 4.390273e+02 6.220876e+01 * time: 2.6103639602661133 76 4.384455e+02 1.200034e+02 * time: 2.613656997680664 77 4.381655e+02 7.536095e+02 * time: 2.6177029609680176 78 4.376569e+02 2.716359e+03 * time: 2.620971918106079 79 4.371396e+02 1.427975e+02 * time: 2.624842882156372 80 4.365253e+02 2.291738e+01 * time: 2.6280970573425293 81 4.362967e+02 3.650541e+02 * time: 2.632037878036499 82 4.360851e+02 9.683767e+02 * time: 2.6359338760375977 83 4.359698e+02 1.188451e+03 * time: 2.6398298740386963 84 4.356883e+02 1.529329e+03 * time: 2.643765926361084 85 4.353129e+02 2.477125e+03 * time: 2.6478519439697266 86 4.347641e+02 4.445794e+02 * time: 2.6512279510498047 87 4.346491e+02 3.346309e+03 * time: 2.6554529666900635 88 4.340737e+02 5.621341e+03 * time: 2.6593289375305176 89 4.330218e+02 5.253844e+03 * time: 2.663037061691284 90 4.329277e+02 1.466953e+04 * time: 2.667430877685547 91 4.321405e+02 1.988316e+03 * time: 2.6709489822387695 92 4.316377e+02 1.175474e+04 * time: 2.6744589805603027 93 4.315243e+02 1.457175e+04 * time: 2.6786739826202393 94 4.313393e+02 1.434983e+04 * time: 2.6830079555511475 95 4.311457e+02 1.466406e+04 * time: 2.6874358654022217 96 4.309173e+02 1.397546e+04 * time: 2.6921398639678955 97 4.306791e+02 1.484925e+04 * time: 2.6965198516845703 98 4.303810e+02 1.223827e+04 * time: 2.701066017150879 99 4.300841e+02 2.606666e+04 * time: 2.7054338455200195 100 4.296818e+02 3.263127e+04 * time: 2.709894895553589 101 4.294220e+02 2.953363e+04 * time: 2.7144980430603027 102 4.291854e+02 3.241563e+04 * time: 2.7190909385681152 103 4.288485e+02 2.199741e+04 * time: 2.7236740589141846 104 4.285033e+02 7.169887e+04 * time: 2.7283239364624023 105 4.282675e+02 8.209757e+04 * time: 2.733043909072876 106 4.278118e+02 6.496780e+04 * time: 2.737596035003662 107 4.276005e+02 7.412336e+04 * time: 2.742374897003174 108 4.272582e+02 5.876273e+04 * time: 2.7470669746398926 109 4.269661e+02 9.679311e+04 * time: 2.7516539096832275 110 4.265732e+02 1.131204e+05 * time: 2.7564189434051514 111 4.262809e+02 1.100267e+05 * time: 2.7611310482025146 112 4.259916e+02 1.131859e+05 * time: 2.765972852706909 113 4.256448e+02 9.590546e+04 * time: 2.7710609436035156 114 4.252941e+02 2.497391e+05 * time: 2.776190996170044 115 4.247396e+02 3.667522e+05 * time: 2.781247854232788 116 4.244417e+02 3.177088e+05 * time: 2.786289930343628 117 4.242009e+02 3.549387e+05 * time: 2.7911880016326904 118 4.238293e+02 2.317690e+05 * time: 2.796009063720703 119 4.234645e+02 5.444075e+05 * time: 2.80102801322937 120 4.231832e+02 6.626479e+05 * time: 2.80600905418396 121 4.229000e+02 6.248714e+05 * time: 2.8110108375549316 122 4.226437e+02 6.202141e+05 * time: 2.816074848175049 123 4.223476e+02 5.763202e+05 * time: 2.8214099407196045 124 4.220353e+02 6.722184e+05 * time: 2.8266918659210205 125 4.215553e+02 3.868167e+05 * time: 2.8320279121398926 126 4.211613e+02 2.917101e+06 * time: 2.8372650146484375 127 4.204823e+02 3.685050e+06 * time: 2.8422398567199707 128 4.187401e+02 6.951196e+06 * time: 2.846544027328491 129 4.174679e+02 5.788368e+06 * time: 2.852602005004883 130 4.171014e+02 1.511670e+07 * time: 2.8598480224609375 131 4.166690e+02 2.943324e+06 * time: 2.865466833114624 132 4.157380e+02 4.686169e+05 * time: 2.8703229427337646 133 4.154194e+02 9.348694e+06 * time: 2.8756120204925537 134 4.152062e+02 1.421900e+07 * time: 2.880849838256836 135 4.149796e+02 1.671737e+07 * time: 2.886137008666992 136 4.147019e+02 1.802229e+07 * time: 2.8916890621185303 137 4.144039e+02 1.847006e+07 * time: 2.8971290588378906 138 4.140881e+02 1.850972e+07 * time: 2.9025139808654785 139 4.137488e+02 1.861463e+07 * time: 2.9078729152679443 140 4.133741e+02 1.860858e+07 * time: 2.913383960723877 141 4.129603e+02 1.907969e+07 * time: 2.9188079833984375 142 4.124802e+02 1.741585e+07 * time: 2.9244840145111084 143 4.119884e+02 4.218883e+07 * time: 2.9300780296325684 144 4.109091e+02 1.714223e+08 * time: 2.9356749057769775 145 4.104487e+02 1.441888e+08 * time: 2.941215991973877 146 4.096514e+02 1.763079e+08 * time: 2.9467859268188477 147 4.091312e+02 5.946881e+08 * time: 2.9515340328216553 148 4.052097e+02 3.624908e+09 * time: 2.9577958583831787 149 4.046037e+02 5.109422e+08 * time: 2.964761972427368 150 4.040337e+02 5.926690e+08 * time: 2.969679832458496 151 4.032925e+02 1.529026e+08 * time: 2.974529981613159 152 4.029895e+02 1.133947e+09 * time: 2.9844210147857666 153 4.028606e+02 2.150461e+09 * time: 2.989366054534912 154 4.023712e+02 2.847341e+09 * time: 2.9943580627441406 155 4.013024e+02 3.636985e+09 * time: 3.0005059242248535 156 3.988076e+02 9.859466e+09 * time: 3.0057640075683594 157 3.976552e+02 1.714441e+10 * time: 3.0123229026794434 158 3.955430e+02 2.256291e+10 * time: 3.0189950466156006 159 3.940546e+02 1.193260e+10 * time: 3.0278470516204834 160 3.937247e+02 9.073932e+09 * time: 3.0369369983673096 161 3.936886e+02 4.858347e+10 * time: 3.042590856552124 162 3.929759e+02 1.467714e+10 * time: 3.0481739044189453 163 3.917377e+02 6.528655e+10 * time: 3.053915023803711 164 3.911035e+02 8.289130e+10 * time: 3.060699939727783 165 3.888147e+02 1.098190e+12 * time: 3.066859006881714 166 3.862471e+02 8.637045e+10 * time: 3.0739660263061523 167 3.847153e+02 6.632016e+10 * time: 3.081164836883545 168 3.847153e+02 6.632016e+10 * time: 3.1059038639068604 169 3.847153e+02 6.632016e+10 * time: 3.1302578449249268
FittedPumasModel
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: -384.71534
------------------------
Estimate
------------------------
pop_CL 0.12659
pop_Vc 8.7367
pop_tabs 1.4476e-11
pop_lag 1.0
† pk_Ω₁,₁ 0.01
† pk_Ω₂,₂ 0.01
† pk_Ω₃,₃ 0.01
σ_prop 0.25974
σ_add 2.8554e-9
------------------------
† indicates constant parameters
As you can see, the constantcoef argument is used to fix the inter-individual variability parameters to zero. The parameter estimates from the naive pooled fit seem out of place for some parameters such as pop_tabs. Perhaps the initial parameter values are not close to the true values and some adjustments are needed. It is also possible that the data cannot support a model with lags and first order absorption if some subjects very quickly have large concentration values or we really need inter-individual variability to fit the data. In Pumas, you can rerun the fit with new initial parameter values very easily. Suppose, we want to run the naive pooled fit again with 4 new initial parameter values that are within 10% of the original values.
We can use the power of the Julia language to generate 4 new parameter values that are within 10% of the original values.
# Generate parameter sets by scaling original values by ±10%
param_sets = [
NamedTuple(k => v * scale for (k, v) in pairs(param_vals)) for
scale in [0.9, 0.95, 1.05, 1.1]
]4-element Vector{@NamedTuple{pop_CL::Float64, pop_Vc::Float64, pop_tabs::Float64, pop_lag::Float64, pk_Ω::Diagonal{Float64, Vector{Float64}}, σ_prop::Float64, σ_add::Float64}}:
(pop_CL = 0.12060000000000001, pop_Vc = 7.2989999999999995, pop_tabs = 0.4707, pop_lag = 0.09000000000000001, pk_Ω = [0.009000000000000001 0.0 0.0; 0.0 0.009000000000000001 0.0; 0.0 0.0 0.009000000000000001], σ_prop = 0.006768, σ_add = 0.05949000000000001)
(pop_CL = 0.1273, pop_Vc = 7.7044999999999995, pop_tabs = 0.49685, pop_lag = 0.095, pk_Ω = [0.0095 0.0 0.0; 0.0 0.0095 0.0; 0.0 0.0 0.0095], σ_prop = 0.007143999999999999, σ_add = 0.062795)
(pop_CL = 0.14070000000000002, pop_Vc = 8.5155, pop_tabs = 0.54915, pop_lag = 0.10500000000000001, pk_Ω = [0.0105 0.0 0.0; 0.0 0.0105 0.0; 0.0 0.0 0.0105], σ_prop = 0.007896, σ_add = 0.06940500000000001)
(pop_CL = 0.14740000000000003, pop_Vc = 8.921, pop_tabs = 0.5753, pop_lag = 0.11000000000000001, pk_Ω = [0.011000000000000001 0.0 0.0; 0.0 0.011000000000000001 0.0; 0.0 0.0 0.011000000000000001], σ_prop = 0.008272, σ_add = 0.07271000000000001)
Now, we can run the fit again with the new parameter values.
new_fits = [
fit(warfarin_pk_model, pop, param_set, NaivePooled(), constantcoef = (:pk_Ω,)) for
param_set in param_sets
];You can compare the results of the new fits with perturbed initial parameter values.
compare_estimates(;
p1 = new_fits[1],
p2 = new_fits[2],
p3 = new_fits[3],
p4 = new_fits[4],
original = naive_fit,
)| Row | parameter | p1 | p2 | p3 | p4 | original |
|---|---|---|---|---|---|---|
| String | Float64? | Float64? | Float64? | Float64? | Float64? | |
| 1 | pop_CL | 0.123625 | 0.192774 | 0.123403 | 0.0235632 | 0.126594 |
| 2 | pop_Vc | 9.28085 | 16.1102 | 9.13101 | 49.454 | 8.73673 |
| 3 | pop_tabs | 4.10644e-12 | 0.60404 | 0.972573 | 0.154961 | 1.44759e-11 |
| 4 | pop_lag | 1.0 | 0.706171 | 0.5 | 0.648555 | 1.0 |
| 5 | pk_Ω₁,₁ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
| 6 | pk_Ω₂,₂ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
| 7 | pk_Ω₃,₃ | 0.009 | 0.0095 | 0.0105 | 0.011 | 0.01 |
| 8 | σ_prop | 0.293987 | 0.778574 | 0.283517 | 2.15052 | 0.259744 |
| 9 | σ_add | 1.25063e-85 | 1.6711e-173 | 4.76955e-12 | 1.57173e-162 | 2.8554e-9 |
The compare_estimates function is a convenience function that is part of the PumasUtilities package. It is used to compare the estimates of the new fits with the original fit. The p3 results seem to be reasonable amongst all the fits and these results can be taken forward for the FOCE() fit.
2.6 Estimating Parameters via Maximum Likelihood (FOCE)
After obtaining initial estimates, the next step is fitting the model to data using a maximum likelihood approach. One of the most commonly used methods in Pumas is the First-Order Conditional Estimation (FOCE) method. This is invoked by specifying FOCE() as the estimation algorithm using the initial parameter values specified in the model definition:
foce_fit = fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE())The init_params function is a convenience function that extracts the initial parameter values from the model definition.
By default, the fit function will use all the cores of your machine to parallelize the optimization. You can control this using the ensemblealg argument. For more details, see the fit options section in the Pumas Documentation.
2.7 Fixing Parameters with constantcoef
Sometimes, it is desirable to hold certain parameters constant during estimation. For instance, suppose one wants to fix tabs at a particular value:
foce_fit_fixedtabs = fit(
warfarin_pk_model,
pop,
(; param_vals..., pop_tabs = 0.5),
FOCE();
constantcoef = (:pop_tabs,),
); # This fixes the pop_tabs parameterfoce_fit_fixedtabsFittedPumasModel
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoXChange
Log-likelihood value: -350.54228
----------------------
Estimate
----------------------
pop_CL 0.13455
pop_Vc 8.0517
† pop_tabs 0.5
pop_lag 0.8751
pk_Ω₁,₁ 0.07066
pk_Ω₂,₂ 0.01826
pk_Ω₃,₃ 0.95054
σ_prop 0.089942
σ_add 0.39217
----------------------
† indicates constant parameters
This approach is useful for sensitivity analysis, exploring different values, or simplifying the model.
2.8 Changing Tolerance During Fitting
The fit function accepts additional keyword arguments to tweak the fitting process, such as tolerances for convergence:
foce_fit_fixedtabs = fit(
warfarin_pk_model,
pop,
(; param_vals..., pop_tabs = 0.5),
FOCE();
constantcoef = (:pop_tabs,),
optim_options = (; x_reltol = 1e-6, x_abstol = 1e-8, iterations = 200),
);These arguments (x_reltol, x_abstol, iterations) help fine-tune the solver’s stopping criteria. For more details, see the Optimization Options during Fitting.
2.9 Interpreting the Printed Results
Once the model fit is complete, Pumas will print a summary of the fit to the console. This typically includes:
- Objective Function Value (OFV): The minimized negative log-likelihood (or similar objective). Lower is better.
- Convergence Status: An indication of whether the solver found a local optimum or if it ran into issues (e.g., maximum iterations reached).
- Parameter Estimates: A table listing final estimates of fixed effects, random effects (variances or standard deviations), and residual variability components.
- Standard Errors (if computed): By default, these might not be computed until using
infer(discussed in the next section).
The printed output might look like:
julia> foce_fit_fixedtabs
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Matrix exponential
Log-likelihood value: -320.2235
Number of subjects: 31
Number of parameters: Fixed Optimized
1 8
Observation records: Active Missing
conc: 239 47
Total: 239 47
-----------------------
Estimate
-----------------------
pop_CL 0.13091
pop_Vc 8.08
pop_tabs 0.5
pop_lag 0.91842
pk_Ω₁,₁ 0.050929
pk_Ω₂,₂ 0.019161
pk_Ω₃,₃ 1.056
σ_prop 0.090974
σ_add 0.33189
-----------------------
We can break down the output of the fit function into the following components:
2.9.1 FittedPumasModel
FittedPumasModelThe FittedPumasModel object contains the following fields:
model: The original model definition
foce_fit_fixedtabs.modelPumasModel
Parameters: pop_CL, pop_Vc, pop_tabs, pop_lag, pk_Ω, σ_prop, σ_add
Random effects: pk_η
Covariates: FSZCL, FSZV
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: conc
Observed: conc
data: The population data
foce_fit_fixedtabs.dataPopulation
Subjects: 32
Covariates: sex, wtbl, FSZV, FSZCL
Observations: conc
optim: The optimization results
foce_fit_fixedtabs.optim * Status: success
* Candidate solution
Final objective value: 3.505425e+02
* Found with
Algorithm: BFGS
* Convergence measures
|x - x'| = 1.38e-06 ≰ 1.0e-08
|x - x'|/|x'| = 3.46e-07 ≤ 1.0e-06
|f(x) - f(x')| = 5.58e-07 ≰ 0.0e+00
|f(x) - f(x')|/|f(x')| = 1.59e-09 ≰ 0.0e+00
|g(x)| = 6.41e-02 ≰ 1.0e-03
* Work counters
Seconds run: 6 (vs limit Inf)
Iterations: 74
f(x) calls: 82
∇f(x) calls: 75
approx: The likelihood approximation method
foce_fit_fixedtabs.approxFOCE
kwargs: Additional keyword arguments
foce_fit_fixedtabs.kwargsfixedparamset: The fixed parameter set
foce_fit_fixedtabs.fixedparamsetoptim_state: The optimization state
foce_fit_fixedtabs.optim_state2.9.2 Minimization Status
Successful minimization: true
The Successful minimization field indicates whether the minimization was successful. It depends on the convergence of the optimizer and rules set in the optim_options argument, for example x_reltol and x_abstol.
2.9.3 Likelihood Approximation
Likelihood approximation: FOCE
The Likelihood approximation field indicates the likelihood approximation method used.
2.9.4 Likelihood Optimizer
Likelihood Optimizer: BFGS
The Likelihood Optimizer field indicates the optimizer used which by default is BFGS from the Optim package. A user can change the optimizer by setting the optim_alg argument in the fit function.
2.9.5 Dynamical system type
Dynamical system type: Matrix exponential
The Dynamical system type field indicates the type of dynamical system used. In the example here, even though the model is a differential equation model, the Matrix exponential indicates that the model is solved using a matrix exponential solver as the system has been deemed linear. The user can check the linear nature of the system as follows: foce_fit.model.prob which results in this case to Pumas.LinearODE(). If the user does not want the automatic check for linearity, they can turn it off in the @options block of the model definition.
@model begin
@param begin
...
end
@random begin
...
end
@dynamics begin
...
end
@pre begin
...
end
@options begin
checklinear = false
end
endThe checklinear option is a boolean that determines whether the solver should check if the system defined in the @dynamics block is linear. If the system is linear, setting this option to true (default) enables calculating the solution through matrix exponentials. If it is set to false or time (t) appears in the @pre block, this optimization is disabled. This option can be useful when the matrix exponential solver is not superior to general numerical integrators or for debugging purposes.
2.9.6 Log-likelihood value
Log-likelihood value: -320.2235
The Log-likelihood value field indicates the value of the log-likelihood function at the maximum likelihood estimate.
2.9.7 Number of subjects
Number of subjects: 31
The Number of subjects field indicates the number of subjects from the population data used in the fit.
2.9.8 Number of parameters
Number of parameters: Fixed Optimized
1 8
The Number of parameters field indicates the number of fixed and optimized parameters. The Fixed column indicates the number of parameters that were fixed using the constantcoef argument during the fit and the Optimized column indicates the number of parameters that were optimized.
2.9.9 Observation records
Observation records: Active Missing
conc: 239 47
Total: 239 47
The Observation records field indicates the number of active and missing observations in the population data. The Active column indicates the number of observations that were used in the fit and the Missing column indicates the number of observations that were missing and not used in the fit. missing in this case is due to the missingness of the conc observations, which is the collection of all the conc observations from all the subjects.
2.9.10 Parameter estimates
-----------------------
Estimate
-----------------------
pop_CL 0.13091
pop_Vc 8.08
pop_tabs 0.5
pop_lag 0.91842
pk_Ω₁,₁ 0.050929
pk_Ω₂,₂ 0.019161
pk_Ω₃,₃ 1.056
σ_prop 0.090974
σ_add 0.33189
-----------------------
The Parameter estimates field indicates the final parameter estimates. The Estimate column indicates the maximum likelihood estimates of the parameter.
2.10 Computing Parameter Precision with infer
The infer function in Pumas estimates the uncertainty (precision) of parameter estimates. Depending on the chosen method, infer can provide standard errors, confidence intervals, and correlation matrices.
The signature for infer often looks like:
infer(
fpm::FittedPumasModel;
level = 0.95,
rethrow_error::Bool = false,
sandwich_estimator::Bool = true,
)where:
fpm::FittedPumasModel: The result offit(e.g.,foce_fit).level: The confidence interval level (e.g., 0.95). The confidence intervals are calculated as the(1-level)/2and(1+level)/2quantiles of the estimated parametersrethrow_error: If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.sandwich_estimator: Whether to use the sandwich estimator. If set totrue(the default value), the sandwich estimator will be used. If set tofalse, the standard error will be calculated using the inverse of the Hessian matrix.
An example usage:
inference_results = infer(foce_fit_fixedtabs; level = 0.95)[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using sandwich estimator
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: SmallXChange
Log-likelihood value: -350.54254
----------------------------------------------------------
Estimate SE 95.0% C.I.
----------------------------------------------------------
pop_CL 0.13455 0.0065675 [ 0.12168 ; 0.14742]
pop_Vc 8.0517 0.21972 [ 7.6211 ; 8.4824 ]
† pop_tabs 0.5 NaN [ NaN ; NaN ]
pop_lag 0.87498 0.051763 [ 0.77353 ; 0.97644]
pk_Ω₁,₁ 0.070958 0.024781 [ 0.022387 ; 0.11953]
pk_Ω₂,₂ 0.01827 0.0051531 [ 0.0081698; 0.02837]
pk_Ω₃,₃ 0.94384 0.38455 [ 0.19013 ; 1.6975 ]
σ_prop 0.089942 0.014455 [ 0.061611 ; 0.11827]
σ_add 0.39221 0.066849 [ 0.26119 ; 0.52323]
----------------------------------------------------------
† indicates constant parameters
We can use the sandwich_estimator argument to get a more robust estimate of the standard errors.
inference_results = infer(foce_fit_fixedtabs; level = 0.95, sandwich_estimator = false)[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using negative inverse Hessian
Dynamical system type: Matrix exponential
Number of subjects: 32
Observation records: Active Missing
conc: 251 47
Total: 251 47
Number of parameters: Constant Optimized
1 8
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: SmallXChange
Log-likelihood value: -350.54254
-----------------------------------------------------------
Estimate SE 95.0% C.I.
-----------------------------------------------------------
pop_CL 0.13455 0.0064882 [ 0.12183 ; 0.14727 ]
pop_Vc 8.0517 0.22744 [ 7.606 ; 8.4975 ]
† pop_tabs 0.5 NaN [ NaN ; NaN ]
pop_lag 0.87498 0.038082 [ 0.80034 ; 0.94962 ]
pk_Ω₁,₁ 0.070958 0.018798 [ 0.034114 ; 0.1078 ]
pk_Ω₂,₂ 0.01827 0.0062287 [ 0.0060618; 0.030478]
pk_Ω₃,₃ 0.94384 0.4181 [ 0.12438 ; 1.7633 ]
σ_prop 0.089942 0.0089122 [ 0.072474 ; 0.10741 ]
σ_add 0.39221 0.046786 [ 0.30051 ; 0.48391 ]
-----------------------------------------------------------
† indicates constant parameters
This result above indicates that both with the sandwich estimator and the inverse of the Hessian matrix on the fixed parameter, the inference worked. We will pursue other methods to obtain parameter precision in later tutorials, such as bootstrap and SIR.
3 Concluding Remarks
This tutorial showcased a typical Pumas workflow for PK model fitting using a Warfarin dataset:
- Model Definition and Data Preparation.
- Checking Compatibility via
loglikelihoodandfindinfluential. - Initial Parameter Estimation using Naive Pooled.
- Fitting with FOCE, demonstrating how to fix parameters or change solver tolerances.
- Interpreting Fit Results, exploring the components of a
fittedpumasmodel. - Computing Precision of estimates with
inferusing different methods.
Readers are encouraged to refine their understanding by:
- Performing thorough Exploratory Data Analysis prior to modeling.
- Exploring Bootstrap or SIR methods for deeper uncertainty quantification (covered in later tutorials).
- Validating the final model through visual diagnostics (e.g., VPCs, GOF plots).
More advanced topics can be found throughout the Pumas Documentation and Pumas Tutorials.
Real-world workflows are iterative. Analysts often revisit model assumptions, re-check data, and explore alternative parameterizations before finalizing any one model as “best.”