using Pumas
using PumasUtilities
using NCA
using NCAUtilities
A Comprehensive Introduction to Pumas
This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead provide a narrative on the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.
As part of this workflow, you will be introduced to various aspects such as:
- Data wrangling in Julia
- Exploratory analysis in Julia
- Continuous data non-linear mixed effects modeling in Pumas
- Model comparison routines, post-processing, validation etc.
1 The Study and Design
CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5mg, 20mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0
, 0.5
, 1
, 1.5
, 2
, 2.5
, 3
-8
hours.
Pain score (0
=no pain, 1
=mild, 2
=moderate, 3
=severe) were obtained at time points when plasma concentration was collected. A pain score of 2
or more is considered as no pain relief.
The subjects can request for remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status is available for subjects.
The pharmacokinetic dataset can be accessed using PharmaDatasets.jl
.
2 Setup
2.1 Load libraries
These libraries provide the workhorse functionality in the Pumas ecosystem:
In addition, libraries below are good add-on’s that provide ancillary functionality:
using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets
2.2 Data Wrangling
We start by reading in the dataset and making some quick summaries.
If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!
= dataset("pk_painrelief")
pkpain_df first(pkpain_df, 5)
Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
Let’s filter out the placebo data as we don’t need that for the PK analysis.
= @rsubset pkpain_df :Dose != "Placebo";
pkpain_noplb_df first(pkpain_noplb_df, 5)
Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
3 Analysis
3.1 Non-compartmental analysis
Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route
column and an :amt
column that specifies the dose. So, let’s add that in:
@rtransform! pkpain_noplb_df begin
:route = "ev"
:Dose = parse(Int, chop(:Dose; tail = 3))
end
We also need to create an :amt
column:
@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missing
Now, we map the data variables to the read_nca
function that prepares the data for NCA analysis.
= read_nca(
pkpain_nca
pkpain_noplb_df;= :Subject,
id = :Time,
time = :amt,
amt = :Conc,
observations = [:Dose],
group = :route,
route )
NCAPopulation (120 subjects):
Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
Number of missing observations: 0
Number of blq observations: 0
Now that we mapped the data in, let’s visualize the concentration vs time plots for a few individuals. When paginate
is set to true
, a vector of plots are returned and below we display the first element with 9 individuals.
= observations_vs_time(
f
pkpain_nca;= true,
paginate = (; xlabel = "Time (hr)", ylabel = "CTMNoPain Concentration (ng/mL)"),
axis = (; combinelabels = true),
facet
)1] f[
or you can view the summary curves by dose group as passed in to the group
argument in read_nca
summary_observations_vs_time(
pkpain_nca,= (; fontsize = 22, resolution = (800, 1000)),
figure = "black",
color = 3,
linewidth = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
axis = (; combinelabels = true, linkaxes = true),
facet )
A full NCA Report is now obtained for completeness purposes using the run_nca
function, but later we will only extract a couple of key metrics of interest.
= run_nca(pkpain_nca; sigdigits = 3) pk_nca
We can look at the NCA fits for some subjects. Here f
is a vector or figures. We’ll showcase the first image by indexing f
:
= subject_fits(
f
pk_nca,= true,
paginate = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
axis = (; combinelabels = true, linkaxes = true),
facet
)1] f[
As CTMNopain’s effect maybe mainly related to maximum concentration (cmax
) or area under the curve (auc
), we present some summary statistics using the summarize
function from NCA
.
= [:Dose] strata
1-element Vector{Symbol}:
:Dose
= [:cmax, :aucinf_obs] params
2-element Vector{Symbol}:
:cmax
:aucinf_obs
= summarize(pk_nca; stratify_by = strata, parameters = params) output
Row | Dose | parameters | numsamples | minimum | maximum | mean | std | geomean | geostd | geomeanCV |
---|---|---|---|---|---|---|---|---|---|---|
Int64 | String | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 5 | cmax | 40 | 0.19 | 0.539 | 0.356075 | 0.0884129 | 0.345104 | 1.2932 | 26.1425 |
2 | 5 | aucinf_obs | 40 | 0.914 | 3.4 | 1.5979 | 0.490197 | 1.53373 | 1.32974 | 29.0868 |
3 | 20 | cmax | 40 | 0.933 | 2.7 | 1.4737 | 0.361871 | 1.43408 | 1.2633 | 23.6954 |
4 | 20 | aucinf_obs | 40 | 2.77 | 14.1 | 6.377 | 2.22239 | 6.02031 | 1.41363 | 35.6797 |
5 | 80 | cmax | 40 | 3.3 | 8.47 | 5.787 | 1.31957 | 5.64164 | 1.25757 | 23.2228 |
6 | 80 | aucinf_obs | 40 | 13.7 | 49.1 | 29.5 | 8.68984 | 28.2954 | 1.34152 | 30.0258 |
The statistics printed above are the default, but you can pass in your own statistics using the stats = []
argument to the summarize
function.
We can look at a few parameter distribution plots.
parameters_vs_group(
pk_nca,= :cmax,
parameter = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (ng/mL)"),
axis = (; fontsize = 18),
figure )
Dose normalized PK parameters, cmax
and aucinf
were essentially dose proportional between for 5 mg, 20 mg and 80 mg doses. You can perform a simple regression to check the impact of dose on cmax
:
= NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9) dp
Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
Estimate low CI 90% high CI 90%
────────────────────────────────────
β 1.00775 0.97571 1.0398
────────────────────────────────────
Here’s a visualization for the dose linearity using a power model for cmax
:
power_model(dp)
We can also visualize a dose proportionality results with respect to a specific endpoint in a NCA Report; for example cmax
and aucinf_obs
:
dose_vs_dose_normalized(pk_nca, :cmax)
dose_vs_dose_normalized(pk_nca, :aucinf_obs)
Based on visual inspection of the concentration time profiles as seen earlier, CTMNopain exhibited monophasic decline, and perhaps a one compartment model best fits the PK data.
3.2 Pharmacokinetic modeling
As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one and two compartment structural models to assess best fit. Further, different residual error models will also be tested.
We will use the results from NCA to provide us good initial estimates.
3.2.1 Data preparation for modeling
PumasNDF requires the presence of :evid
and :cmt
columns in the dataset.
@rtransform! pkpain_noplb_df begin
:evid = :Time == 0 ? 1 : 0
:cmt = :Time == 0 ? 1 : 2
:cmt2 = 1 # for zero order absorption
end
Further, observations at time of dosing, i.e., when evid = 1
have to be missing
@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc
The dataframe will now be converted to a Population
using read_pumas
. Note that both observations
and covariates
are required to be an array even if it is one element.
= read_pumas(
pkpain_noplb
pkpain_noplb_df;= :Subject,
id = :Time,
time = :amt,
amt = [:Conc],
observations = [:Dose],
covariates = :evid,
evid = :cmt,
cmt )
Population
Subjects: 120
Covariates: Dose
Observations: Conc
Now that the data is transformed to a Population
of subjects, we can explore different models.
3.2.2 One-compartment model
If you are not familiar yet with the @model
blocks and syntax, please check our documentation.
= @model begin
pk_1cmp
@metadata begin
= "One Compartment Model"
desc = u"hr"
timeu end
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 3.2)
tvcl """
Volume (L)
"""
∈ RealDomain(; lower = 0, init = 16.4)
tvv """
Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0, init = 3.8)
tvka """
- ΩCL
- ΩVc
- ΩKa
"""
∈ PDiagDomain(init = [0.04, 0.04, 0.04])
Ω """
Proportional RUV
"""
∈ RealDomain(; lower = 0.0001, init = 0.2)
σ_p end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Dose (mg)
"""
Doseend
@pre begin
= tvcl * exp(η[1])
CL = tvv * exp(η[2])
Vc = tvka * exp(η[3])
Ka end
@dynamics Depots1Central1
@derived begin
:= @. Central / Vc
cp """
CTMx Concentration (ng/mL)
"""
~ @. Normal(cp, abs(cp) * σ_p)
Conc end
end
PumasModel
Parameters: tvcl, tvv, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical variables: Depot, Central
Derived: Conc
Observed: Conc
Note that the local assignment :=
can be used to define intermediate statements that will not be carried outside of the block. This means that all the resulting data workflows from this model will not contain the intermediate variables defined with :=
. We use this when we want to suppress the variable from any further output.
The idea behind :=
is for performance reasons. If you are not carrying the variable defined with :=
outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.
Before going to fit the model, let’s evaluate some helpful steps via simulation to check appropriateness of data and model
# zero out the random effects
= zero_randeffs(pk_1cmp, init_params(pk_1cmp), pkpain_noplb) etas
Above, we are generating a vector of η
’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs
and init_randeffs
. Please refer to the documentation.
= simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas) simpk_iparams
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_iparams;= [:Conc],
observations = (; fontsize = 18),
figure = (;
axis = "Time (hr)",
xlabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
ylabel
), )
Our NCA based initial guess on the parameters seem to work well.
Lets change the initial estimate of a couple of the parameters to evaluate the sensitivity.
= (; init_params(pk_1cmp)..., tvka = 2, tvv = 10) pkparam
(tvcl = 3.2,
tvv = 10,
tvka = 2,
Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04],
σ_p = 0.2,)
= simobs(pk_1cmp, pkpain_noplb, pkparam, etas) simpk_changedpars
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_changedpars;= [:Conc],
observations = (; fontsize = 18),
figure = (
axis = "Time (hr)",
xlabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
ylabel
), )
Changing the tvka
and decreasing the tvv
seemed to make an impact and observations go through the simulated lines.
To get a quick ballpark estimate of your PK parameters, we can do a NaivePooled
analysis.
3.2.2.1 NaivePooled
= fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,)) pkfit_np
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 7.744356e+02 3.715711e+03
* time: 0.01824784278869629
1 2.343899e+02 1.747348e+03
* time: 0.4253568649291992
2 9.696232e+01 1.198088e+03
* time: 0.42664504051208496
3 -7.818699e+01 5.538151e+02
* time: 0.4276578426361084
4 -1.234803e+02 2.462514e+02
* time: 0.4289078712463379
5 -1.372888e+02 2.067458e+02
* time: 0.43021082878112793
6 -1.410579e+02 1.162950e+02
* time: 0.4315478801727295
7 -1.434754e+02 5.632816e+01
* time: 0.432844877243042
8 -1.453401e+02 7.859270e+01
* time: 0.43419694900512695
9 -1.498185e+02 1.455606e+02
* time: 0.4356398582458496
10 -1.534371e+02 1.303682e+02
* time: 0.43718886375427246
11 -1.563557e+02 5.975474e+01
* time: 0.4388608932495117
12 -1.575052e+02 9.308611e+00
* time: 0.44019103050231934
13 -1.579357e+02 1.234484e+01
* time: 0.4415709972381592
14 -1.581874e+02 7.478196e+00
* time: 0.44291090965270996
15 -1.582981e+02 2.027162e+00
* time: 0.44421887397766113
16 -1.583375e+02 5.578262e+00
* time: 0.4455139636993408
17 -1.583556e+02 4.727050e+00
* time: 0.491588830947876
18 -1.583644e+02 2.340173e+00
* time: 0.492656946182251
19 -1.583680e+02 7.738100e-01
* time: 0.49354004859924316
20 -1.583696e+02 3.300689e-01
* time: 0.4943828582763672
21 -1.583704e+02 3.641985e-01
* time: 0.49517297744750977
22 -1.583707e+02 4.365901e-01
* time: 0.49597597122192383
23 -1.583709e+02 3.887800e-01
* time: 0.4967968463897705
24 -1.583710e+02 2.766977e-01
* time: 0.49781298637390137
25 -1.583710e+02 1.758029e-01
* time: 0.49864792823791504
26 -1.583710e+02 1.133947e-01
* time: 0.4994790554046631
27 -1.583710e+02 7.922544e-02
* time: 0.5002830028533936
28 -1.583710e+02 5.954998e-02
* time: 0.5011289119720459
29 -1.583710e+02 4.157079e-02
* time: 0.502051830291748
30 -1.583710e+02 4.295447e-02
* time: 0.5029549598693848
31 -1.583710e+02 5.170753e-02
* time: 0.5038449764251709
32 -1.583710e+02 2.644383e-02
* time: 0.5050609111785889
33 -1.583710e+02 4.548993e-03
* time: 0.5061628818511963
34 -1.583710e+02 2.501804e-02
* time: 0.5072529315948486
35 -1.583710e+02 3.763440e-02
* time: 0.5083599090576172
36 -1.583710e+02 3.206026e-02
* time: 0.5095510482788086
37 -1.583710e+02 1.003698e-02
* time: 0.5107579231262207
38 -1.583710e+02 2.209089e-02
* time: 0.5119888782501221
39 -1.583710e+02 4.954172e-03
* time: 0.5132079124450684
40 -1.583710e+02 1.609373e-02
* time: 0.514868974685669
41 -1.583710e+02 1.579802e-02
* time: 0.5161399841308594
42 -1.583710e+02 1.014113e-03
* time: 0.5176019668579102
43 -1.583710e+02 6.050644e-03
* time: 0.519230842590332
44 -1.583710e+02 1.354412e-02
* time: 0.5205178260803223
45 -1.583710e+02 4.473248e-03
* time: 0.5217909812927246
46 -1.583710e+02 4.644735e-03
* time: 0.5230579376220703
47 -1.583710e+02 9.829910e-03
* time: 0.5243349075317383
48 -1.583710e+02 1.047561e-03
* time: 0.5256619453430176
49 -1.583710e+02 8.366895e-03
* time: 0.527026891708374
50 -1.583710e+02 7.879055e-04
* time: 0.5283708572387695
FittedPumasModel
Successful minimization: true
Likelihood approximation: NaivePooled
Log-likelihood value: 158.37103
Number of subjects: 120
Number of parameters: Fixed Optimized
1 6
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
------------------
Estimate
------------------
tvcl 3.0054
tvv 14.089
tvka 44.228
Ω₁,₁ 0.0
Ω₂,₂ 0.0
Ω₃,₃ 0.0
σ_p 0.32999
------------------
coefficients_table(pkfit_np)
Row | Parameter | Description | Estimate |
---|---|---|---|
String | Abstract… | Float64 | |
1 | tvcl | Clearance (L/hr)\n | 3.005 |
2 | tvv | Volume (L)\n | 14.089 |
3 | tvka | Absorption rate constant (h-1)\n | 44.228 |
4 | Ω₁,₁ | ΩCL | 0.0 |
5 | Ω₂,₂ | ΩVc | 0.0 |
6 | Ω₃,₃ | ΩKa | 0.0 |
7 | σ_p | Proportional RUV\n | 0.33 |
The final estimates from the NaivePooled
approach seem reasonably close to our initial guess from NCA, except for the tvka
parameter. We will stick with our initial guess.
One way to be cautious before going into a complete fit
ting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject(s) pops out as unreasonable. There are a few ways of doing this:
- check the
loglikelihood
subject wise - check if there any influential subjects
Below, we are basically checking if the initial estimates for any subject are way off that we are unable to compute the initial loglikelihood
.
= []
lls for subj in pkpain_noplb
push!(lls, loglikelihood(pk_1cmp, subj, pkparam, FOCE()))
end
# the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5), x_gap = 0)
The distribution of the loglikelihood’s suggest no extreme outliers.
A more convenient way is to use the findinfluential
function that provides a list of k
top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.
= findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE()) influential_subjects
5-element Vector{NamedTuple{(:id, :nll), Tuple{String, Float64}}}:
(id = "148", nll = 16.65965885684475)
(id = "135", nll = 16.64898519007633)
(id = "156", nll = 15.9590695566075)
(id = "159", nll = 15.441218240496482)
(id = "149", nll = 14.715134644119512)
3.2.2.2 FOCE
Now that we have a good handle on our data, lets go ahead and fit
a population model with FOCE
:
= fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2)) pkfit_1cmp
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -5.935351e+02 5.597318e+02
* time: 9.417533874511719e-5
1 -7.022088e+02 1.707063e+02
* time: 0.14495611190795898
2 -7.314067e+02 2.903269e+02
* time: 0.20394206047058105
3 -8.520591e+02 2.285888e+02
* time: 0.27533912658691406
4 -1.120191e+03 3.795410e+02
* time: 0.41217708587646484
5 -1.178784e+03 2.323978e+02
* time: 0.4711461067199707
6 -1.218320e+03 9.699907e+01
* time: 0.5254302024841309
7 -1.223641e+03 5.862105e+01
* time: 0.5784971714019775
8 -1.227620e+03 1.831402e+01
* time: 0.6319541931152344
9 -1.228381e+03 2.132323e+01
* time: 0.6819491386413574
10 -1.230098e+03 2.921228e+01
* time: 0.7464981079101562
11 -1.230854e+03 2.029662e+01
* time: 0.7981832027435303
12 -1.231116e+03 5.229099e+00
* time: 0.8475320339202881
13 -1.231179e+03 1.689231e+00
* time: 0.8959650993347168
14 -1.231187e+03 1.215379e+00
* time: 0.9422111511230469
15 -1.231188e+03 2.770381e-01
* time: 0.9724321365356445
16 -1.231188e+03 1.636651e-01
* time: 1.0098600387573242
17 -1.231188e+03 2.701087e-01
* time: 1.0505599975585938
18 -1.231188e+03 3.163373e-01
* time: 1.0893371105194092
19 -1.231188e+03 1.505097e-01
* time: 1.127120018005371
20 -1.231188e+03 2.485456e-02
* time: 1.150825023651123
21 -1.231188e+03 8.381370e-04
* time: 1.1832220554351807
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: 1231.188
Number of subjects: 120
Number of parameters: Fixed Optimized
1 6
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
-------------------
Estimate
-------------------
tvcl 3.1642
tvv 13.288
tvka 2.0
Ω₁,₁ 0.08494
Ω₂,₂ 0.048568
Ω₃,₃ 5.5811
σ_p 0.10093
-------------------
infer(pkfit_1cmp)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: 1231.188
Number of subjects: 120
Number of parameters: Fixed Optimized
1 6
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
-------------------------------------------------------------------
Estimate SE 95.0% C.I.
-------------------------------------------------------------------
tvcl 3.1642 0.086619 [ 2.9944 ; 3.334 ]
tvv 13.288 0.27481 [12.749 ; 13.827 ]
tvka 2.0 NaN [ NaN ; NaN ]
Ω₁,₁ 0.08494 0.011022 [ 0.063338; 0.10654 ]
Ω₂,₂ 0.048568 0.0063501 [ 0.036122; 0.061014]
Ω₃,₃ 5.5811 1.2194 [ 3.1911 ; 7.9711 ]
σ_p 0.10093 0.0057196 [ 0.089718; 0.11214 ]
-------------------------------------------------------------------
Notice that tvka
is fixed to 2 as we don’t have a lot of information before tmax
. From the results above, we see that the parameter precision for this model is reasonable.
3.2.3 Two-compartment model
Just to be sure, let’s fit a 2-compartment model and evaluate:
= @model begin
pk_2cmp
@param begin
"""
Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 3.2)
tvcl """
Central Volume (L)
"""
∈ RealDomain(; lower = 0, init = 16.4)
tvv """
Peripheral Volume (L)
"""
∈ RealDomain(; lower = 0, init = 10)
tvvp """
Distributional Clearance (L/hr)
"""
∈ RealDomain(; lower = 0, init = 2)
tvq """
Absorption rate constant (h-1)
"""
∈ RealDomain(; lower = 0, init = 1.3)
tvka """
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
∈ PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
Ω """
Proportional RUV
"""
∈ RealDomain(; lower = 0.0001, init = 0.2)
σ_p end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
"""
Dose (mg)
"""
Doseend
@pre begin
= tvcl * exp(η[1])
CL = tvv * exp(η[2])
Vc = tvka * exp(η[3])
Ka = tvvp * exp(η[4])
Vp = tvq * exp(η[5])
Q end
@dynamics Depots1Central1Periph1
@derived begin
:= @. Central / Vc
cp """
CTMx Concentration (ng/mL)
"""
~ @. Normal(cp, cp * σ_p)
Conc end
end
PumasModel
Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical variables: Depot, Central, Peripheral
Derived: Conc
Observed: Conc
3.2.3.1 FOCE
=
pkfit_2cmp fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -6.302369e+02 1.021050e+03
* time: 7.605552673339844e-5
1 -9.197817e+02 9.927951e+02
* time: 0.12303709983825684
2 -1.372640e+03 2.054986e+02
* time: 0.25167012214660645
3 -1.446326e+03 1.543987e+02
* time: 0.3589470386505127
4 -1.545570e+03 1.855028e+02
* time: 0.46608495712280273
5 -1.581449e+03 1.713157e+02
* time: 0.6394550800323486
6 -1.639433e+03 1.257382e+02
* time: 0.7408580780029297
7 -1.695964e+03 7.450539e+01
* time: 0.846156120300293
8 -1.722243e+03 5.961044e+01
* time: 0.9619541168212891
9 -1.736883e+03 7.320921e+01
* time: 1.0635600090026855
10 -1.753547e+03 7.501938e+01
* time: 1.1681349277496338
11 -1.764053e+03 6.185661e+01
* time: 1.2780790328979492
12 -1.778991e+03 4.831033e+01
* time: 1.4065330028533936
13 -1.791492e+03 4.943278e+01
* time: 1.5292370319366455
14 -1.799847e+03 2.871410e+01
* time: 1.6684391498565674
15 -1.805374e+03 7.520792e+01
* time: 1.7992949485778809
16 -1.816260e+03 2.990621e+01
* time: 1.9334359169006348
17 -1.818252e+03 2.401915e+01
* time: 2.0416269302368164
18 -1.822988e+03 2.587225e+01
* time: 2.153588056564331
19 -1.824653e+03 1.550517e+01
* time: 2.26131010055542
20 -1.826074e+03 1.788927e+01
* time: 2.3819000720977783
21 -1.826821e+03 1.888389e+01
* time: 2.486293077468872
22 -1.827900e+03 1.432840e+01
* time: 2.595020055770874
23 -1.828511e+03 9.422042e+00
* time: 2.717142105102539
24 -1.828754e+03 5.363448e+00
* time: 2.8298659324645996
25 -1.828862e+03 4.916153e+00
* time: 2.9363150596618652
26 -1.829007e+03 4.695757e+00
* time: 3.059744119644165
27 -1.829358e+03 1.090248e+01
* time: 3.1702170372009277
28 -1.829830e+03 1.451324e+01
* time: 3.2822561264038086
29 -1.830201e+03 1.108690e+01
* time: 3.396393060684204
30 -1.830360e+03 2.892290e+00
* time: 3.523123025894165
31 -1.830390e+03 1.698884e+00
* time: 3.629073143005371
32 -1.830404e+03 1.602129e+00
* time: 3.734908103942871
33 -1.830432e+03 2.823129e+00
* time: 3.853224039077759
34 -1.830475e+03 4.105407e+00
* time: 3.977522134780884
35 -1.830527e+03 5.093470e+00
* time: 4.09006404876709
36 -1.830592e+03 2.695875e+00
* time: 4.216608047485352
37 -1.830615e+03 3.470118e+00
* time: 4.325258016586304
38 -1.830623e+03 2.577088e+00
* time: 4.435423135757446
39 -1.830625e+03 1.765655e+00
* time: 4.541749954223633
40 -1.830627e+03 1.036472e+00
* time: 4.654070138931274
41 -1.830628e+03 1.124524e+00
* time: 4.756410121917725
42 -1.830628e+03 3.545350e-01
* time: 4.852785110473633
43 -1.830629e+03 4.125688e-01
* time: 4.948312997817993
44 -1.830630e+03 7.227016e-01
* time: 5.049795150756836
45 -1.830630e+03 5.071624e-01
* time: 5.1625590324401855
46 -1.830630e+03 1.064027e-01
* time: 5.2576210498809814
47 -1.830630e+03 5.753961e-03
* time: 5.343439102172852
48 -1.830630e+03 3.093135e-03
* time: 5.424744129180908
49 -1.830630e+03 2.462622e-03
* time: 5.506067991256714
50 -1.830630e+03 1.882491e-03
* time: 5.585752964019775
51 -1.830630e+03 1.882491e-03
* time: 5.700093984603882
52 -1.830630e+03 1.882491e-03
* time: 5.821819067001343
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: 1830.6304
Number of subjects: 120
Number of parameters: Fixed Optimized
1 10
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
-------------------
Estimate
-------------------
tvcl 2.8138
tvv 11.005
tvvp 5.54
tvq 1.5159
tvka 2.0
Ω₁,₁ 0.10267
Ω₂,₂ 0.060776
Ω₃,₃ 1.2012
Ω₄,₄ 0.42349
Ω₅,₅ 0.24473
σ_p 0.048405
-------------------
3.3 Comparing One- versus Two-compartment models
The 2-compartment model has a much lower objective function compared to the 1-compartment. Let’s compare the estimates from the 2 models using the compare_estimates
function.
compare_estimates(; pkfit_1cmp, pkfit_2cmp)
Row | parameter | pkfit_1cmp | pkfit_2cmp |
---|---|---|---|
String | Float64? | Float64? | |
1 | tvcl | 3.1642 | 2.81378 |
2 | tvv | 13.288 | 11.0046 |
3 | tvka | 2.0 | 2.0 |
4 | Ω₁,₁ | 0.0849405 | 0.102669 |
5 | Ω₂,₂ | 0.0485682 | 0.0607755 |
6 | Ω₃,₃ | 5.58107 | 1.20116 |
7 | σ_p | 0.100928 | 0.0484049 |
8 | tvvp | missing | 5.53998 |
9 | tvq | missing | 1.51591 |
10 | Ω₄,₄ | missing | 0.423493 |
11 | Ω₅,₅ | missing | 0.244732 |
We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.
lrtest(pkfit_1cmp, pkfit_2cmp)
Statistic: 1200.0
Degrees of freedom: 4
P-value: 0.0
We should also compare the other metrics and statistics, such ηshrinkage
, ϵshrinkage
, aic
, and bic
using the metrics_table
function.
@chain metrics_table(pkfit_2cmp) begin
leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
end
Row | Metric | pk2cmp | pk1cmp |
---|---|---|---|
String | Any | Any | |
1 | Successful | true | true |
2 | Estimation Time | 5.822 | 1.183 |
3 | Subjects | 120 | 120 |
4 | Fixed Parameters | 1 | 1 |
5 | Optimized Parameters | 10 | 6 |
6 | Conc Active Observations | 1320 | 1320 |
7 | Conc Missing Observations | 0 | 0 |
8 | Total Active Observations | 1320 | 1320 |
9 | Total Missing Observations | 0 | 0 |
10 | Likelihood Approximation | Pumas.FOCE | Pumas.FOCE |
11 | LogLikelihood (LL) | 1830.63 | 1231.19 |
12 | -2LL | -3661.26 | -2462.38 |
13 | AIC | -3641.26 | -2450.38 |
14 | BIC | -3589.41 | -2419.26 |
15 | (η-shrinkage) η₁ | 0.037 | 0.016 |
16 | (η-shrinkage) η₂ | 0.047 | 0.04 |
17 | (η-shrinkage) η₃ | 0.516 | 0.733 |
18 | (ϵ-shrinkage) Conc | 0.185 | 0.105 |
19 | (η-shrinkage) η₄ | 0.287 | missing |
20 | (η-shrinkage) η₅ | 0.154 | missing |
We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect
the diagnostics of our model fit.
= inspect(pkfit_1cmp) res_inspect_1cmp
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
FittedPumasModelInspection
Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE()
= inspect(pkfit_2cmp) res_inspect_2cmp
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
FittedPumasModelInspection
Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE()
= goodness_of_fit(res_inspect_1cmp; figure = (; fontsize = 12)) gof_1cmp
= goodness_of_fit(res_inspect_2cmp; figure = (; fontsize = 12)) gof_2cmp
These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.
We can look at selected sample of individual plots.
= subject_fits(
fig_subject_fits
res_inspect_2cmp;= true,
separate = true,
paginate = (; combinelabels = true),
facet = (; fontsize = 18),
figure = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
axis
)1] fig_subject_fits[
There a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we are checking the distribution of the empirical Bayes estimates.
empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)
empirical_bayes_vs_covariates(
res_inspect_2cmp;= [:Dose],
categorical = (; resolution = (600, 800)),
figure )
Clearly, our guess at tvka
seems off-target. Let’s try and estimate tvka
instead of fixing it to 2
:
= fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE()) pkfit_2cmp_unfix_ka
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -3.200734e+02 1.272671e+03
* time: 6.604194641113281e-5
1 -8.682982e+02 1.000199e+03
* time: 0.23767900466918945
2 -1.381870e+03 5.008081e+02
* time: 0.39089322090148926
3 -1.551053e+03 6.833490e+02
* time: 0.5818040370941162
4 -1.680887e+03 1.834586e+02
* time: 0.7350270748138428
5 -1.726118e+03 8.870274e+01
* time: 0.9136261940002441
6 -1.761023e+03 1.162036e+02
* time: 1.0708370208740234
7 -1.786619e+03 1.114552e+02
* time: 1.2493510246276855
8 -1.863556e+03 9.914305e+01
* time: 1.4120330810546875
9 -1.882942e+03 5.342676e+01
* time: 1.596785068511963
10 -1.888020e+03 2.010181e+01
* time: 1.7567360401153564
11 -1.889832e+03 1.867263e+01
* time: 1.9429152011871338
12 -1.891649e+03 1.668512e+01
* time: 2.1512842178344727
13 -1.892615e+03 1.820701e+01
* time: 2.2954649925231934
14 -1.893453e+03 1.745195e+01
* time: 2.4424350261688232
15 -1.894760e+03 1.850174e+01
* time: 2.5724031925201416
16 -1.895647e+03 1.773939e+01
* time: 2.718567132949829
17 -1.896597e+03 1.143462e+01
* time: 2.8360559940338135
18 -1.897114e+03 9.720097e+00
* time: 3.007000207901001
19 -1.897373e+03 6.054321e+00
* time: 3.125545024871826
20 -1.897498e+03 3.985954e+00
* time: 3.2919070720672607
21 -1.897571e+03 4.262464e+00
* time: 3.4348690509796143
22 -1.897633e+03 4.010234e+00
* time: 3.57292103767395
23 -1.897714e+03 4.805375e+00
* time: 3.7323391437530518
24 -1.897802e+03 3.508706e+00
* time: 3.8882009983062744
25 -1.897865e+03 3.691477e+00
* time: 4.029857158660889
26 -1.897900e+03 2.982720e+00
* time: 4.147497177124023
27 -1.897928e+03 2.563790e+00
* time: 4.294849157333374
28 -1.897968e+03 3.261484e+00
* time: 4.409171104431152
29 -1.898013e+03 3.064690e+00
* time: 4.547143220901489
30 -1.898040e+03 1.636525e+00
* time: 4.672741174697876
31 -1.898051e+03 1.439997e+00
* time: 4.812849998474121
32 -1.898057e+03 1.436504e+00
* time: 4.924200057983398
33 -1.898069e+03 1.881528e+00
* time: 5.044975996017456
34 -1.898095e+03 3.253165e+00
* time: 5.1781229972839355
35 -1.898142e+03 4.257942e+00
* time: 5.295518159866333
36 -1.898199e+03 3.685241e+00
* time: 5.431471109390259
37 -1.898245e+03 2.567364e+00
* time: 5.557011127471924
38 -1.898246e+03 2.561590e+00
* time: 5.74012017250061
39 -1.898251e+03 2.530885e+00
* time: 5.901984214782715
40 -1.898298e+03 2.673694e+00
* time: 6.050952196121216
41 -1.898300e+03 2.794535e+00
* time: 6.193596124649048
42 -1.898337e+03 3.768309e+00
* time: 6.371883153915405
43 -1.898415e+03 5.140597e+00
* time: 6.517460107803345
44 -1.898427e+03 4.646687e+00
* time: 6.669670104980469
45 -1.898432e+03 4.441103e+00
* time: 6.896221160888672
46 -1.898437e+03 4.297436e+00
* time: 7.1012561321258545
47 -1.898443e+03 4.170006e+00
* time: 7.318238019943237
48 -1.898448e+03 3.949707e+00
* time: 7.572393178939819
49 -1.898451e+03 3.720254e+00
* time: 7.794032096862793
50 -1.898451e+03 3.700859e+00
* time: 8.032361030578613
51 -1.898456e+03 3.967298e+00
* time: 8.225088119506836
52 -1.898479e+03 2.134121e+00
* time: 8.347379207611084
53 -1.898480e+03 7.166192e-01
* time: 8.498102188110352
54 -1.898480e+03 7.319824e-01
* time: 8.635270118713379
55 -1.898480e+03 1.358764e-01
* time: 8.782921075820923
56 -1.898480e+03 1.239822e-01
* time: 9.024269104003906
57 -1.898480e+03 1.239752e-01
* time: 9.149968147277832
58 -1.898480e+03 1.239711e-01
* time: 9.332168102264404
59 -1.898480e+03 1.239711e-01
* time: 9.505474090576172
60 -1.898480e+03 1.239711e-01
* time: 9.697363138198853
61 -1.898480e+03 1.248189e-01
* time: 9.922837018966675
62 -1.898480e+03 1.248189e-01
* time: 10.208707094192505
63 -1.898480e+03 1.248189e-01
* time: 10.450537204742432
64 -1.898480e+03 1.248189e-01
* time: 10.722050189971924
65 -1.898480e+03 1.248189e-01
* time: 11.060665130615234
66 -1.898480e+03 1.248189e-01
* time: 11.312530040740967
67 -1.898480e+03 1.248189e-01
* time: 11.606091022491455
68 -1.898480e+03 1.248189e-01
* time: 11.855343103408813
69 -1.898480e+03 1.248189e-01
* time: 12.12081003189087
70 -1.898480e+03 1.248189e-01
* time: 12.322315216064453
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Log-likelihood value: 1898.4797
Number of subjects: 120
Number of parameters: Fixed Optimized
0 11
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
-------------------
Estimate
-------------------
tvcl 2.6191
tvv 11.378
tvvp 8.4529
tvq 1.3164
tvka 4.8925
Ω₁,₁ 0.13243
Ω₂,₂ 0.05967
Ω₃,₃ 0.41581
Ω₄,₄ 0.080679
Ω₅,₅ 0.24996
σ_p 0.049097
-------------------
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)
Row | parameter | pkfit_2cmp | pkfit_2cmp_unfix_ka |
---|---|---|---|
String | Float64? | Float64? | |
1 | tvcl | 2.81378 | 2.61912 |
2 | tvv | 11.0046 | 11.3783 |
3 | tvvp | 5.53998 | 8.45294 |
4 | tvq | 1.51591 | 1.31637 |
5 | tvka | 2.0 | 4.89251 |
6 | Ω₁,₁ | 0.102669 | 0.132432 |
7 | Ω₂,₂ | 0.0607755 | 0.0596696 |
8 | Ω₃,₃ | 1.20116 | 0.415809 |
9 | Ω₄,₄ | 0.423493 | 0.0806791 |
10 | Ω₅,₅ | 0.244732 | 0.249956 |
11 | σ_p | 0.0484049 | 0.0490975 |
Let’s revaluate the goodness of fits and η distribution plots.
Not much change in the general gof
plots
= inspect(pkfit_2cmp_unfix_ka) res_inspect_2cmp_unfix_ka
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
FittedPumasModelInspection
Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE()
goodness_of_fit(res_inspect_2cmp_unfix_ka; figure = (; fontsize = 12))
But you can see a huge improvement in the ηka
, (η₃
) distribution which is now centered around zero
empirical_bayes_vs_covariates(
res_inspect_2cmp_unfix_ka;= [:Dose],
categorical = [:η₃],
ebes = (; resolution = (600, 800)),
figure )
Finally looking at some individual plots for the same subjects as earlier:
= subject_fits(
fig_subject_fits2
res_inspect_2cmp_unfix_ka;= true,
separate = true,
paginate = (; combinelabels = true, linkyaxes = false),
facet = (; fontsize = 18),
figure = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
axis
)6] fig_subject_fits2[
The randomly sampled individual fits don’t seem good in some individuals, but we can evaluate this via a vpc
to see how to go about.
3.4 Visual Predictive Checks (VPC)
We can now perform a vpc
to check. The default plots provide a 80% prediction interval and a 95% simulated CI (shaded area) around each of the quantiles
= vpc(
pk_vpc
pkfit_2cmp_unfix_ka,200;
= [:Conc],
observations = [:Dose],
stratify_by = EnsembleThreads(), # multi-threading
ensemblealg )
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 200
Subjects in data: 40
Stratification variable(s): [:Dose]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(
pk_2cmp,
pk_vpc;= 1,
rows = 3,
columns = (; resolution = (1400, 1000), fontsize = 22),
figure = (;
axis = "Time (hr)",
xlabel = "Observed/Predicted\n CTMx Concentration (ng/mL)",
ylabel
),= (; combinelabels = true),
facet )
The visual predictive check suggests that the model captures the data well across all dose levels.
4 Additional Help
If you have questions regarding this tutorial, please post them on our discourse site.