using Pumas
using PumasUtilities
using NCA
using NCAUtilities

A Comprehensive Introduction to Pumas
This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead provide a narrative on the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.
As part of this workflow, you will be introduced to various aspects such as:
- Data wrangling in Julia
- Exploratory analysis in Julia
- Continuous data non-linear mixed effects modeling in Pumas
- Model comparison routines, post-processing, validation etc.
1 The Study and Design
CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5mg, 20mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0
, 0.5
, 1
, 1.5
, 2
, 2.5
, 3
-8
hours.
Pain score (0
=no pain, 1
=mild, 2
=moderate, 3
=severe) were obtained at time points when plasma concentration was collected. A pain score of 2
or more is considered as no pain relief.
The subjects can request for remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status is available for subjects.
The pharmacokinetic dataset can be accessed using PharmaDatasets.jl
.
2 Setup
2.1 Load libraries
These libraries provide the workhorse functionality in the Pumas ecosystem:
In addition, libraries below are good add-on’s that provide ancillary functionality:
using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets
2.2 Data Wrangling
We start by reading in the dataset and making some quick summaries.
If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!
= dataset("pk_painrelief")
pkpain_df first(pkpain_df, 5)
Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
Let’s filter out the placebo data as we don’t need that for the PK analysis.
= @rsubset pkpain_df :Dose != "Placebo";
pkpain_noplb_df first(pkpain_noplb_df, 5)
Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
3 Analysis
3.1 Non-compartmental analysis
Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route
column and an :amt
column that specifies the dose. So, let’s add that in:
@rtransform! pkpain_noplb_df begin
:route = "ev"
:Dose = parse(Int, chop(:Dose; tail = 3))
end
We also need to create an :amt
column:
@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missing
Now, we map the data variables to the read_nca
function that prepares the data for NCA analysis.
= read_nca(
pkpain_nca
pkpain_noplb_df;= :Subject,
id = :Time,
time = :amt,
amt = :Conc,
observations = [:Dose],
group = :route,
route )
NCAPopulation (120 subjects):
Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
Number of missing observations: 0
Number of blq observations: 0
Now that we mapped the data in, let’s visualize the concentration vs time plots for a few individuals. When paginate
is set to true
, a vector of plots are returned and below we display the first element with 9 individuals.
= observations_vs_time(
f
pkpain_nca;= true,
paginate = (; xlabel = "Time (hr)", ylabel = "CTMNoPain Concentration (ng/mL)"),
axis
)1] f[
or you can view the summary curves by dose group as passed in to the group
argument in read_nca
summary_observations_vs_time(
pkpain_nca,= (; fontsize = 22, size = (800, 1000)),
figure = "black",
color = 3,
linewidth = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
axis )
A full NCA Report is now obtained for completeness purposes using the run_nca
function, but later we will only extract a couple of key metrics of interest.
= run_nca(pkpain_nca; sigdigits = 3) pk_nca
We can look at the NCA fits for some subjects. Here f
is a vector or figures. We’ll showcase the first image by indexing f
:
= subject_fits(
f
pk_nca,= true,
paginate = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
axis
# Legend options
= (; position = :bottom),
legend
)1] f[
As CTMNopain’s effect maybe mainly related to maximum concentration (cmax
) or area under the curve (auc
), we present some summary statistics using the summarize
function from NCA
.
= [:Dose] strata
1-element Vector{Symbol}:
:Dose
= [:cmax, :aucinf_obs] params
2-element Vector{Symbol}:
:cmax
:aucinf_obs
= summarize(pk_nca; stratify_by = strata, parameters = params) output
Row | Dose | parameters | numsamples | minimum | maximum | mean | std | geomean | geostd | geomeanCV |
---|---|---|---|---|---|---|---|---|---|---|
Int64 | String | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 5 | cmax | 40 | 0.19 | 0.539 | 0.356075 | 0.0884129 | 0.345104 | 1.2932 | 26.1425 |
2 | 5 | aucinf_obs | 40 | 0.914 | 3.4 | 1.5979 | 0.490197 | 1.53373 | 1.32974 | 29.0868 |
3 | 20 | cmax | 40 | 0.933 | 2.7 | 1.4737 | 0.361871 | 1.43408 | 1.2633 | 23.6954 |
4 | 20 | aucinf_obs | 40 | 2.77 | 14.1 | 6.377 | 2.22239 | 6.02031 | 1.41363 | 35.6797 |
5 | 80 | cmax | 40 | 3.3 | 8.47 | 5.787 | 1.31957 | 5.64164 | 1.25757 | 23.2228 |
6 | 80 | aucinf_obs | 40 | 13.7 | 49.1 | 29.5 | 8.68984 | 28.2954 | 1.34152 | 30.0258 |
The statistics printed above are the default, but you can pass in your own statistics using the stats = []
argument to the summarize
function.
We can look at a few parameter distribution plots.
parameters_vs_group(
pk_nca,= :cmax,
parameter = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (ng/mL)"),
axis = (; fontsize = 18),
figure )
Dose normalized PK parameters, cmax
and aucinf
were essentially dose proportional between for 5 mg, 20 mg and 80 mg doses. You can perform a simple regression to check the impact of dose on cmax
:
= NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9) dp
Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
Estimate low CI 90% high CI 90%
────────────────────────────────────
β 1.00775 0.97571 1.0398
────────────────────────────────────
Here’s a visualization for the dose linearity using a power model for cmax
:
power_model(dp; legend = (; position = :bottom))
We can also visualize a dose proportionality results with respect to a specific endpoint in a NCA Report; for example cmax
and aucinf_obs
:
dose_vs_dose_normalized(pk_nca, :cmax)
dose_vs_dose_normalized(pk_nca, :aucinf_obs)
Based on visual inspection of the concentration time profiles as seen earlier, CTMNopain exhibited monophasic decline, and perhaps a one compartment model best fits the PK data.
3.2 Pharmacokinetic modeling
As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one and two compartment structural models to assess best fit. Further, different residual error models will also be tested.
We will use the results from NCA to provide us good initial estimates.
3.2.1 Data preparation for modeling
PumasNDF requires the presence of :evid
and :cmt
columns in the dataset.
@rtransform! pkpain_noplb_df begin
:evid = :Time == 0 ? 1 : 0
:cmt = :Time == 0 ? 1 : 2
:cmt2 = 1 # for zero order absorption
end
Further, observations at time of dosing, i.e., when evid = 1
have to be missing
@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc
The dataframe will now be converted to a Population
using read_pumas
. Note that both observations
and covariates
are required to be an array even if it is one element.
= read_pumas(
pkpain_noplb
pkpain_noplb_df;= :Subject,
id = :Time,
time = :amt,
amt = [:Conc],
observations = [:Dose],
covariates = :evid,
evid = :cmt,
cmt )
Population
Subjects: 120
Covariates: Dose
Observations: Conc
Now that the data is transformed to a Population
of subjects, we can explore different models.
3.2.2 One-compartment model
If you are not familiar yet with the @model
blocks and syntax, please check our documentation.
= @model begin
pk_1cmp
@metadata begin
= "One Compartment Model"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 3.2)
tvcl """
Volume (L)
"""
∈ RealDomain(; lower = 0, init = 16.4)
tvv """
Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0, init = 3.8)
tvka """
- ΩCL
- ΩVc
- ΩKa
"""
∈ PDiagDomain(init = [0.04, 0.04, 0.04])
Ω """
Proportional RUV
"""
∈ RealDomain(; lower = 0.0001, init = 0.2)
σ_p end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Dose (mg)
"""
Doseend
@pre begin
= tvcl * exp(η[1])
CL = tvv * exp(η[2])
Vc = tvka * exp(η[3])
Ka end
@dynamics Depots1Central1
@derived begin
:= @. Central / Vc
cp """
CTMx Concentration (ng/mL)
"""
~ @. Normal(cp, abs(cp) * σ_p)
Conc end
end
┌ Warning: Covariate Dose is not used in the model. └ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/LN24a/src/dsl/model_macro.jl:3153
PumasModel
Parameters: tvcl, tvv, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
Note that the local assignment :=
can be used to define intermediate statements that will not be carried outside of the block. This means that all the resulting data workflows from this model will not contain the intermediate variables defined with :=
. We use this when we want to suppress the variable from any further output.
The idea behind :=
is for performance reasons. If you are not carrying the variable defined with :=
outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.
Before going to fit the model, let’s evaluate some helpful steps via simulation to check appropriateness of data and model
# zero out the random effects
= zero_randeffs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp)) etas
Above, we are generating a vector of η
’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs
and init_randeffs
. Please refer to the documentation.
= simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas) simpk_iparams
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_iparams;= [:Conc],
observations = (; fontsize = 18),
figure = (;
axis = "Time (hr)",
xlabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
ylabel
), )
Our NCA based initial guess on the parameters seem to work well.
Lets change the initial estimate of a couple of the parameters to evaluate the sensitivity.
= (; init_params(pk_1cmp)..., tvka = 2, tvv = 10) pkparam
(tvcl = 3.2,
tvv = 10,
tvka = 2,
Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04],
σ_p = 0.2,)
= simobs(pk_1cmp, pkpain_noplb, pkparam, etas) simpk_changedpars
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_changedpars;= [:Conc],
observations = (; fontsize = 18),
figure = (
axis = "Time (hr)",
xlabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
ylabel
), )
Changing the tvka
and decreasing the tvv
seemed to make an impact and observations go through the simulated lines.
To get a quick ballpark estimate of your PK parameters, we can do a NaivePooled
analysis.
3.2.2.1 NaivePooled
= fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,)) pkfit_np
[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 7.744356e+02 3.715711e+03 * time: 0.07605504989624023 1 2.343899e+02 1.747348e+03 * time: 2.315423011779785 2 9.696232e+01 1.198088e+03 * time: 2.320127010345459 3 -7.818699e+01 5.538151e+02 * time: 2.323575973510742 4 -1.234803e+02 2.462514e+02 * time: 2.326970100402832 5 -1.372888e+02 2.067458e+02 * time: 2.3318960666656494 6 -1.410579e+02 1.162950e+02 * time: 2.335287094116211 7 -1.434754e+02 5.632816e+01 * time: 2.338449001312256 8 -1.453401e+02 7.859270e+01 * time: 2.3415050506591797 9 -1.498185e+02 1.455606e+02 * time: 2.3447279930114746 10 -1.534371e+02 1.303682e+02 * time: 2.3479771614074707 11 -1.563557e+02 5.975474e+01 * time: 2.351196050643921 12 -1.575052e+02 9.308611e+00 * time: 2.3542771339416504 13 -1.579357e+02 1.234484e+01 * time: 2.3575069904327393 14 -1.581874e+02 7.478196e+00 * time: 2.3606221675872803 15 -1.582981e+02 2.027162e+00 * time: 2.3637871742248535 16 -1.583375e+02 5.578262e+00 * time: 2.366858959197998 17 -1.583556e+02 4.727050e+00 * time: 2.370090961456299 18 -1.583644e+02 2.340173e+00 * time: 2.373183012008667 19 -1.583680e+02 7.738100e-01 * time: 2.376235008239746 20 -1.583696e+02 3.300689e-01 * time: 2.379223108291626 21 -1.583704e+02 3.641985e-01 * time: 2.3823740482330322 22 -1.583707e+02 4.365901e-01 * time: 2.3854329586029053 23 -1.583709e+02 3.887800e-01 * time: 2.388512134552002 24 -1.583710e+02 2.766977e-01 * time: 2.3915271759033203 25 -1.583710e+02 1.758029e-01 * time: 2.394792079925537 26 -1.583710e+02 1.133947e-01 * time: 2.397952079772949 27 -1.583710e+02 7.922544e-02 * time: 2.401085138320923 28 -1.583710e+02 5.954998e-02 * time: 2.404100179672241 29 -1.583710e+02 4.157080e-02 * time: 2.4073121547698975 30 -1.583710e+02 4.295446e-02 * time: 2.410396099090576 31 -1.583710e+02 5.170752e-02 * time: 2.4134249687194824 32 -1.583710e+02 2.644382e-02 * time: 2.4174771308898926 33 -1.583710e+02 4.548987e-03 * time: 2.4213991165161133 34 -1.583710e+02 2.501805e-02 * time: 2.4255411624908447 35 -1.583710e+02 3.763439e-02 * time: 2.4285590648651123 36 -1.583710e+02 3.206027e-02 * time: 2.4316530227661133 37 -1.583710e+02 1.003700e-02 * time: 2.434627056121826 38 -1.583710e+02 2.209084e-02 * time: 2.437901020050049 39 -1.583710e+02 4.954136e-03 * time: 2.4409351348876953 40 -1.583710e+02 1.609366e-02 * time: 2.445019006729126 41 -1.583710e+02 1.579810e-02 * time: 2.4482710361480713 42 -1.583710e+02 1.014156e-03 * time: 2.4513840675354004 43 -1.583710e+02 6.050792e-03 * time: 2.455404043197632 44 -1.583710e+02 1.354381e-02 * time: 2.4585909843444824 45 -1.583710e+02 4.473216e-03 * time: 2.461724042892456 46 -1.583710e+02 4.645458e-03 * time: 2.4648079872131348 47 -1.583710e+02 9.828063e-03 * time: 2.4678330421447754 48 -1.583710e+02 1.047215e-03 * time: 2.4711029529571533 49 -1.583710e+02 8.374104e-03 * time: 2.474297046661377 50 -1.583710e+02 7.841995e-04 * time: 2.4775030612945557
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 158.37103
------------------
Estimate
------------------
tvcl 3.0054
tvv 14.089
tvka 44.227
† Ω₁,₁ 0.0
† Ω₂,₂ 0.0
† Ω₃,₃ 0.0
σ_p 0.32999
------------------
† indicates constant parameters
coefficients_table(pkfit_np)
Row | Parameter | Description | Constant | Estimate |
---|---|---|---|---|
String | SubStrin… | Bool | Float64 | |
1 | tvcl | Clearance (L/hr) | false | 3.005 |
2 | tvv | Volume (L) | false | 14.089 |
3 | tvka | Absorption rate constant (h-1) | false | 44.227 |
4 | Ω₁,₁ | ΩCL | true | 0.0 |
5 | Ω₂,₂ | ΩVc | true | 0.0 |
6 | Ω₃,₃ | ΩKa | true | 0.0 |
7 | σ_p | Proportional RUV | false | 0.33 |
The final estimates from the NaivePooled
approach seem reasonably close to our initial guess from NCA, except for the tvka
parameter. We will stick with our initial guess.
One way to be cautious before going into a complete fit
ting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject(s) pops out as unreasonable. There are a few ways of doing this:
- check the
loglikelihood
subject wise - check if there any influential subjects
Below, we are basically checking if the initial estimates for any subject are way off that we are unable to compute the initial loglikelihood
.
= [loglikelihood(pk_1cmp, subj, pkparam, FOCE()) for subj in pkpain_noplb]
lls # the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5))
The distribution of the loglikelihood’s suggest no extreme outliers.
A more convenient way is to use the findinfluential
function that provides a list of k
top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.
= findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE()) influential_subjects
120-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "148", nll = 16.65965885684477)
(id = "135", nll = 16.648985190076335)
(id = "156", nll = 15.959069556607496)
(id = "159", nll = 15.441218240496484)
(id = "149", nll = 14.71513464411951)
(id = "88", nll = 13.09709837464614)
(id = "16", nll = 12.98228052193144)
(id = "61", nll = 12.65218290230368)
(id = "71", nll = 12.500330088085505)
(id = "59", nll = 12.241510254805235)
⋮
(id = "57", nll = -22.79767423253431)
(id = "93", nll = -22.836900711478208)
(id = "12", nll = -23.007742339519247)
(id = "123", nll = -23.292751843079234)
(id = "41", nll = -23.425412534960515)
(id = "99", nll = -23.535214841901112)
(id = "29", nll = -24.025959868383083)
(id = "52", nll = -24.164757842493685)
(id = "24", nll = -25.57209232565845)
3.2.2.2 FOCE
Now that we have a good handle on our data, lets go ahead and fit
a population model with FOCE
:
= fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2)) pkfit_1cmp
[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -5.935351e+02 5.597318e+02 * time: 2.8848648071289062e-5 1 -7.022088e+02 1.707063e+02 * time: 0.6749019622802734 2 -7.314067e+02 2.903269e+02 * time: 0.7896158695220947 3 -8.520591e+02 2.285888e+02 * time: 0.9046328067779541 4 -1.120191e+03 3.795410e+02 * time: 1.2159268856048584 5 -1.178784e+03 2.323978e+02 * time: 1.3268909454345703 6 -1.218320e+03 9.699907e+01 * time: 1.4310028553009033 7 -1.223641e+03 5.862105e+01 * time: 2.0299289226531982 8 -1.227620e+03 1.831402e+01 * time: 2.1307928562164307 9 -1.228381e+03 2.132323e+01 * time: 2.2298879623413086 10 -1.230098e+03 2.921228e+01 * time: 4.391286849975586 11 -1.230854e+03 2.029662e+01 * time: 4.490128993988037 12 -1.231116e+03 5.229099e+00 * time: 4.583923816680908 13 -1.231179e+03 1.689232e+00 * time: 4.681553840637207 14 -1.231187e+03 1.215379e+00 * time: 4.782559871673584 15 -1.231188e+03 2.770381e-01 * time: 4.878432989120483 16 -1.231188e+03 1.636652e-01 * time: 4.964958906173706 17 -1.231188e+03 2.701149e-01 * time: 5.0537238121032715 18 -1.231188e+03 3.163341e-01 * time: 5.144328832626343 19 -1.231188e+03 1.505153e-01 * time: 5.231457948684692 20 -1.231188e+03 2.485002e-02 * time: 5.392660856246948 21 -1.231188e+03 8.435209e-04 * time: 5.448892831802368
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
-------------------
Estimate
-------------------
tvcl 3.1642
tvv 13.288
† tvka 2.0
Ω₁,₁ 0.08494
Ω₂,₂ 0.048568
Ω₃,₃ 5.5811
σ_p 0.10093
-------------------
† indicates constant parameters
infer(pkfit_1cmp)
[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using sandwich estimator
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
---------------------------------------------------------
Estimate SE 95.0% C.I.
---------------------------------------------------------
tvcl 3.1642 0.08662 [ 2.9944 ; 3.334 ]
tvv 13.288 0.27481 [ 12.749 ; 13.827 ]
† tvka 2.0 NaN [ NaN ; NaN ]
Ω₁,₁ 0.08494 0.011022 [ 0.063338; 0.10654 ]
Ω₂,₂ 0.048568 0.0063502 [ 0.036122; 0.061014]
Ω₃,₃ 5.5811 1.2188 [ 3.1922 ; 7.97 ]
σ_p 0.10093 0.0057196 [ 0.089718; 0.11214 ]
---------------------------------------------------------
† indicates constant parameters
Notice that tvka
is fixed to 2 as we don’t have a lot of information before tmax
. From the results above, we see that the parameter precision for this model is reasonable.
3.2.3 Two-compartment model
Just to be sure, let’s fit a 2-compartment model and evaluate:
= @model begin
pk_2cmp
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 3.2)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0, init = 16.4)
tvv """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0, init = 10)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 2)
tvq """
Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0, init = 1.3)
tvka """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
Ω """
Proportional RUV
"""
∈ RealDomain(; lower = 0.0001, init = 0.2)
σ_p end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Dose (mg)
"""
Doseend
@pre begin
= tvcl * exp(η[1])
CL = tvv * exp(η[2])
Vc = tvka * exp(η[3])
Ka = tvvp * exp(η[4])
Vp = tvq * exp(η[5])
Q end
@dynamics Depots1Central1Periph1
@derived begin
:= @. Central / Vc
cp """
CTMx Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σ_p)
Conc end
end
┌ Warning: Covariate Dose is not used in the model. └ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/LN24a/src/dsl/model_macro.jl:3153
PumasModel
Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
3.2.3.1 FOCE
=
pkfit_2cmp fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -6.302369e+02 1.021050e+03 * time: 1.5974044799804688e-5 1 -9.197817e+02 9.927951e+02 * time: 0.7824268341064453 2 -1.372640e+03 2.054986e+02 * time: 1.0309619903564453 3 -1.446326e+03 1.543987e+02 * time: 1.271009922027588 4 -1.545570e+03 1.855028e+02 * time: 1.5110459327697754 5 -1.581449e+03 1.713157e+02 * time: 1.8391668796539307 6 -1.639433e+03 1.257382e+02 * time: 2.087116003036499 7 -1.695964e+03 7.450539e+01 * time: 2.315091848373413 8 -1.722243e+03 5.961044e+01 * time: 2.542961835861206 9 -1.736883e+03 7.320921e+01 * time: 2.7721588611602783 10 -1.753547e+03 7.501938e+01 * time: 3.0058889389038086 11 -1.764053e+03 6.185661e+01 * time: 3.2697129249572754 12 -1.778991e+03 4.831033e+01 * time: 3.5174968242645264 13 -1.791492e+03 4.943278e+01 * time: 3.77160382270813 14 -1.799847e+03 2.871410e+01 * time: 4.05883002281189 15 -1.805374e+03 7.520790e+01 * time: 4.327724933624268 16 -1.816260e+03 2.990621e+01 * time: 4.586523056030273 17 -1.818252e+03 2.401915e+01 * time: 4.81342887878418 18 -1.822988e+03 2.587225e+01 * time: 5.065968036651611 19 -1.824653e+03 1.550517e+01 * time: 5.287530899047852 20 -1.826074e+03 1.788927e+01 * time: 5.516044855117798 21 -1.826821e+03 1.888389e+01 * time: 5.741992950439453 22 -1.827900e+03 1.432840e+01 * time: 5.988553047180176 23 -1.828511e+03 9.422040e+00 * time: 6.229126930236816 24 -1.828754e+03 5.363445e+00 * time: 6.475062847137451 25 -1.828862e+03 4.916168e+00 * time: 6.711834907531738 26 -1.829007e+03 4.695750e+00 * time: 6.967119932174683 27 -1.829358e+03 1.090244e+01 * time: 7.210114002227783 28 -1.829830e+03 1.451320e+01 * time: 7.451592922210693 29 -1.830201e+03 1.108695e+01 * time: 7.698290824890137 30 -1.830360e+03 2.892317e+00 * time: 7.9604949951171875 31 -1.830390e+03 1.699265e+00 * time: 8.195772886276245 32 -1.830404e+03 1.602222e+00 * time: 8.424339056015015 33 -1.830432e+03 2.823676e+00 * time: 8.655365943908691 34 -1.830475e+03 4.121601e+00 * time: 8.8936128616333 35 -1.830527e+03 5.080494e+00 * time: 9.159162044525146 36 -1.830591e+03 2.668323e+00 * time: 9.406501054763794 37 -1.830615e+03 3.522601e+00 * time: 9.656627893447876 38 -1.830623e+03 2.203940e+00 * time: 9.893674850463867 39 -1.830625e+03 1.642394e+00 * time: 10.15145993232727 40 -1.830627e+03 9.396311e-01 * time: 10.402334928512573 41 -1.830628e+03 8.588414e-01 * time: 10.65546989440918 42 -1.830628e+03 3.457037e-01 * time: 10.879348039627075 43 -1.830629e+03 4.556038e-01 * time: 11.103724002838135 44 -1.830630e+03 6.366787e-01 * time: 11.32477593421936 45 -1.830630e+03 4.104090e-01 * time: 11.564074993133545 46 -1.830630e+03 7.434196e-02 * time: 11.773545980453491 47 -1.830630e+03 7.316846e-02 * time: 12.032345056533813 48 -1.830630e+03 7.320992e-02 * time: 12.325500011444092 49 -1.830630e+03 7.471716e-02 * time: 12.550195932388306 50 -1.830630e+03 7.471716e-02 * time: 12.828661918640137 51 -1.830630e+03 7.471716e-02 * time: 12.982517957687378
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 10
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoXChange
Log-likelihood value: 1830.6304
-------------------
Estimate
-------------------
tvcl 2.8137
tvv 11.005
tvvp 5.5401
tvq 1.5159
† tvka 2.0
Ω₁,₁ 0.10266
Ω₂,₂ 0.060778
Ω₃,₃ 1.2012
Ω₄,₄ 0.42347
Ω₅,₅ 0.24471
σ_p 0.048404
-------------------
† indicates constant parameters
3.3 Comparing One- versus Two-compartment models
The 2-compartment model has a much lower objective function compared to the 1-compartment. Let’s compare the estimates from the 2 models using the compare_estimates
function.
compare_estimates(; pkfit_1cmp, pkfit_2cmp)
Row | parameter | pkfit_1cmp | pkfit_2cmp |
---|---|---|---|
String | Float64? | Float64? | |
1 | tvcl | 3.1642 | 2.81373 |
2 | tvv | 13.288 | 11.0047 |
3 | tvka | 2.0 | 2.0 |
4 | Ω₁,₁ | 0.0849405 | 0.102656 |
5 | Ω₂,₂ | 0.0485682 | 0.060778 |
6 | Ω₃,₃ | 5.58107 | 1.20118 |
7 | σ_p | 0.100928 | 0.0484045 |
8 | tvvp | missing | 5.54005 |
9 | tvq | missing | 1.51591 |
10 | Ω₄,₄ | missing | 0.423466 |
11 | Ω₅,₅ | missing | 0.244713 |
We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.
lrtest(pkfit_1cmp, pkfit_2cmp)
Statistic: 1200.0
Degrees of freedom: 4
P-value: 0.0
We should also compare the other metrics and statistics, such ηshrinkage
, ϵshrinkage
, aic
, and bic
using the metrics_table
function.
@chain metrics_table(pkfit_2cmp) begin
leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
end
WARNING: using deprecated binding Distributions.MatrixReshaped in Pumas.
, use Distributions.ReshapedDistribution{2, S, D} where D<:Distributions.Distribution{Distributions.ArrayLikeVariate{1}, S} where S<:Distributions.ValueSupport instead.
Row | Metric | pk2cmp | pk1cmp |
---|---|---|---|
Any | Any | Any | |
1 | Successful | true | true |
2 | Estimation Time | 12.985 | 5.449 |
3 | Subjects | 120 | 120 |
4 | Fixed Parameters | 1 | 1 |
5 | Optimized Parameters | 10 | 6 |
6 | Conc Active Observations | 1320 | 1320 |
7 | Conc Missing Observations | 0 | 0 |
8 | Total Active Observations | 1320 | 1320 |
9 | Total Missing Observations | 0 | 0 |
10 | Likelihood Approximation | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} |
11 | RawTypst("LogLikelihood (LL)") | 1830.63 | 1231.19 |
12 | RawTypst("-2LL") | -3661.26 | -2462.38 |
13 | AIC | -3641.26 | -2450.38 |
14 | BIC | -3589.41 | -2419.26 |
15 | RawTypst("(η-shrinkage) η₁") | 0.037 | 0.016 |
16 | RawTypst("(η-shrinkage) η₂") | 0.047 | 0.04 |
17 | RawTypst("(η-shrinkage) η₃") | 0.516 | 0.733 |
18 | RawTypst("(ϵ-shrinkage) Conc") | 0.185 | 0.105 |
19 | RawTypst("(η-shrinkage) η₄") | 0.287 | missing |
20 | RawTypst("(η-shrinkage) η₅") | 0.154 | missing |
We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect
the diagnostics of our model fit.
= inspect(pkfit_1cmp) res_inspect_1cmp
[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
= inspect(pkfit_2cmp) res_inspect_2cmp
[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
= goodness_of_fit(
gof_1cmp
res_inspect_1cmp;= (; fontsize = 12),
figure = (; position = :bottom),
legend )
= goodness_of_fit(
gof_2cmp
res_inspect_2cmp;= (; fontsize = 12),
figure = (; position = :bottom),
legend )
These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.
We can look at selected sample of individual plots.
= subject_fits(
fig_subject_fits
res_inspect_2cmp;= true,
separate = true,
paginate = (; fontsize = 18),
figure = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
axis
)1] fig_subject_fits[
There a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we are checking the distribution of the empirical Bayes estimates.
empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)
empirical_bayes_vs_covariates(
res_inspect_2cmp;= [:Dose],
categorical = (; size = (600, 800)),
figure )
Clearly, our guess at tvka
seems off-target. Let’s try and estimate tvka
instead of fixing it to 2
:
= fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE()) pkfit_2cmp_unfix_ka
[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -3.200734e+02 1.272671e+03 * time: 2.8848648071289062e-5 1 -8.682982e+02 1.000199e+03 * time: 3.590601921081543 2 -1.381870e+03 5.008081e+02 * time: 3.8490729331970215 3 -1.551053e+03 6.833490e+02 * time: 4.141964912414551 4 -1.680887e+03 1.834586e+02 * time: 4.557896852493286 5 -1.726118e+03 8.870274e+01 * time: 4.797769784927368 6 -1.761023e+03 1.162036e+02 * time: 5.143439769744873 7 -1.786619e+03 1.114552e+02 * time: 5.388704776763916 8 -1.863556e+03 9.914305e+01 * time: 5.70662784576416 9 -1.882942e+03 5.342676e+01 * time: 5.965419769287109 10 -1.888020e+03 2.010181e+01 * time: 6.261753797531128 11 -1.889832e+03 1.867262e+01 * time: 6.580699920654297 12 -1.891649e+03 1.668510e+01 * time: 6.886764764785767 13 -1.892615e+03 1.820707e+01 * time: 7.185096979141235 14 -1.893453e+03 1.745193e+01 * time: 7.459592819213867 15 -1.894760e+03 1.850174e+01 * time: 7.76335597038269 16 -1.895647e+03 1.773921e+01 * time: 8.038359880447388 17 -1.896597e+03 1.143421e+01 * time: 8.314603805541992 18 -1.897114e+03 9.720034e+00 * time: 8.607655763626099 19 -1.897373e+03 6.054160e+00 * time: 8.878468990325928 20 -1.897498e+03 3.985923e+00 * time: 9.149141788482666 21 -1.897571e+03 4.262502e+00 * time: 9.436415910720825 22 -1.897633e+03 4.010316e+00 * time: 9.697462797164917 23 -1.897714e+03 4.805389e+00 * time: 9.959414958953857 24 -1.897802e+03 3.508614e+00 * time: 10.246753931045532 25 -1.897865e+03 3.691472e+00 * time: 10.509668827056885 26 -1.897900e+03 2.982676e+00 * time: 10.770537853240967 27 -1.897928e+03 2.563863e+00 * time: 11.070770978927612 28 -1.897968e+03 3.261530e+00 * time: 11.331854820251465 29 -1.898013e+03 3.064695e+00 * time: 11.598069906234741 30 -1.898040e+03 1.636456e+00 * time: 11.88095498085022 31 -1.898051e+03 1.439998e+00 * time: 12.139494895935059 32 -1.898057e+03 1.436505e+00 * time: 12.401907920837402 33 -1.898069e+03 1.881592e+00 * time: 12.663623809814453 34 -1.898095e+03 3.253228e+00 * time: 12.946459770202637 35 -1.898142e+03 4.257954e+00 * time: 13.228419780731201 36 -1.898199e+03 3.685153e+00 * time: 13.492685794830322 37 -1.898245e+03 2.567377e+00 * time: 13.789976835250854 38 -1.898246e+03 2.561577e+00 * time: 14.166137933731079 39 -1.898251e+03 2.530928e+00 * time: 14.519227981567383 40 -1.898298e+03 2.673773e+00 * time: 14.792173862457275 41 -1.898300e+03 2.795859e+00 * time: 15.141010999679565 42 -1.898337e+03 3.666102e+00 * time: 15.483920812606812 43 -1.898342e+03 3.753077e+00 * time: 15.927880764007568 44 -1.898429e+03 4.461850e+00 * time: 16.275630950927734 45 -1.898461e+03 3.584769e+00 * time: 16.588422775268555 46 -1.898477e+03 2.357431e+00 * time: 16.86016583442688 47 -1.898479e+03 7.373685e-01 * time: 17.13435387611389 48 -1.898479e+03 6.197624e-01 * time: 17.49106478691101 49 -1.898480e+03 1.716594e+00 * time: 17.830471992492676 50 -1.898480e+03 1.128276e+00 * time: 18.112990856170654 51 -1.898480e+03 9.157160e-01 * time: 18.42747187614441 52 -1.898480e+03 3.051694e-01 * time: 18.7495379447937 53 -1.898480e+03 2.316257e-01 * time: 19.03283977508545 54 -1.898480e+03 2.316260e-01 * time: 19.29244589805603 55 -1.898480e+03 2.344996e-01 * time: 19.769754886627197 56 -1.898480e+03 2.372822e-01 * time: 20.249743938446045 57 -1.898480e+03 2.372809e-01 * time: 20.7313449382782 58 -1.898480e+03 2.372808e-01 * time: 21.175105810165405 59 -1.898480e+03 2.372808e-01 * time: 21.65797996520996 60 -1.898480e+03 2.372808e-01 * time: 22.13892698287964 61 -1.898480e+03 2.372808e-01 * time: 22.614396810531616 62 -1.898480e+03 2.372808e-01 * time: 23.0898699760437
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
0 11
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: 1898.4797
-----------------
Estimate
-----------------
tvcl 2.6191
tvv 11.378
tvvp 8.4529
tvq 1.3164
tvka 4.8925
Ω₁,₁ 0.13243
Ω₂,₂ 0.059669
Ω₃,₃ 0.41581
Ω₄,₄ 0.080679
Ω₅,₅ 0.24996
σ_p 0.049098
-----------------
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)
Row | parameter | pkfit_2cmp | pkfit_2cmp_unfix_ka |
---|---|---|---|
String | Float64? | Float64? | |
1 | tvcl | 2.81373 | 2.61912 |
2 | tvv | 11.0047 | 11.3783 |
3 | tvvp | 5.54005 | 8.45292 |
4 | tvq | 1.51591 | 1.31636 |
5 | tvka | 2.0 | 4.89251 |
6 | Ω₁,₁ | 0.102656 | 0.132433 |
7 | Ω₂,₂ | 0.060778 | 0.0596692 |
8 | Ω₃,₃ | 1.20118 | 0.415807 |
9 | Ω₄,₄ | 0.423466 | 0.080679 |
10 | Ω₅,₅ | 0.244713 | 0.249963 |
11 | σ_p | 0.0484045 | 0.0490975 |
Let’s revaluate the goodness of fits and η distribution plots.
Not much change in the general gof
plots
= inspect(pkfit_2cmp_unfix_ka) res_inspect_2cmp_unfix_ka
[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(
res_inspect_2cmp_unfix_ka;= (; fontsize = 12),
figure = (; position = :bottom),
legend )
But you can see a huge improvement in the ηka
, (η₃
) distribution which is now centered around zero
empirical_bayes_vs_covariates(
res_inspect_2cmp_unfix_ka;= [:Dose],
categorical = [:η₃],
ebes = (; size = (600, 800)),
figure )
Finally looking at some individual plots for the same subjects as earlier:
= subject_fits(
fig_subject_fits2
res_inspect_2cmp_unfix_ka;= true,
separate = true,
paginate = (; linkyaxes = false),
facet = (; fontsize = 18),
figure = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
axis
)6] fig_subject_fits2[
The randomly sampled individual fits don’t seem good in some individuals, but we can evaluate this via a vpc
to see how to go about.
3.4 Visual Predictive Checks (VPC)
We can now perform a vpc
to check. The default plots provide a 80% prediction interval and a 95% simulated CI (shaded area) around each of the quantiles
= vpc(pkfit_2cmp_unfix_ka, 200; observations = [:Conc], stratify_by = [:Dose]) pk_vpc
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 200
Subjects in data: 120
Stratification variable(s): [:Dose]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(
pk_2cmp,
pk_vpc;= 1,
rows = 3,
columns = (; size = (1400, 1000), fontsize = 22),
figure = (;
axis = "Time (hr)",
xlabel = "Observed/Predicted\n CTMx Concentration (ng/mL)",
ylabel
),= (; combinelabels = true),
facet )
The visual predictive check suggests that the model captures the data well across all dose levels.
4 Additional Help
If you have questions regarding this tutorial, please post them on our discourse site.