using Pumas
using PumasUtilities
using NCA
using NCAUtilities
A Comprehensive Introduction to Pumas
This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead provide a narrative on the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.
As part of this workflow, you will be introduced to various aspects such as:
- Data wrangling in Julia
- Exploratory analysis in Julia
- Continuous data non-linear mixed effects modeling in Pumas
- Model comparison routines, post-processing, validation etc.
1 The Study and Design
CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5mg, 20mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0, 0.5, 1, 1.5, 2, 2.5, 3-8 hours.
Pain score (0=no pain, 1=mild, 2=moderate, 3=severe) were obtained at time points when plasma concentration was collected. A pain score of 2 or more is considered as no pain relief.
The subjects can request for remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status is available for subjects.
The pharmacokinetic dataset can be accessed using PharmaDatasets.jl.
2 Setup
2.1 Load libraries
These libraries provide the workhorse functionality in the Pumas ecosystem:
In addition, libraries below are good add-on’s that provide ancillary functionality:
using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets2.2 Data Wrangling
We start by reading in the dataset and making some quick summaries.
If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!
pkpain_df = dataset("pk_painrelief")
first(pkpain_df, 5)| Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
|---|---|---|---|---|---|---|---|
| Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
| 1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
| 2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
| 3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
| 4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
| 5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
Let’s filter out the placebo data as we don’t need that for the PK analysis.
pkpain_noplb_df = @rsubset pkpain_df :Dose != "Placebo";
first(pkpain_noplb_df, 5)| Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
|---|---|---|---|---|---|---|---|
| Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |
| 1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
| 2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
| 3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
| 4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
| 5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
3 Analysis
3.1 Non-compartmental analysis
Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route column and an :amt column that specifies the dose. So, let’s add that in:
@rtransform! pkpain_noplb_df begin
:route = "ev"
:Dose = parse(Int, chop(:Dose; tail = 3))
endWe also need to create an :amt column:
@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missingNow, we map the data variables to the read_nca function that prepares the data for NCA analysis.
pkpain_nca = read_nca(
pkpain_noplb_df;
id = :Subject,
time = :Time,
amt = :amt,
observations = :Conc,
group = [:Dose],
route = :route,
)NCAPopulation (120 subjects):
Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
Number of missing observations: 0
Number of blq observations: 0
Now that we mapped the data in, let’s visualize the concentration vs time plots for a few individuals. When paginate is set to true, a vector of plots are returned and below we display the first element with 9 individuals.
f = observations_vs_time(
pkpain_nca;
paginate = true,
axis = (; xlabel = "Time (hr)", ylabel = "CTMNoPain Concentration (ng/mL)"),
)
f[1]or you can view the summary curves by dose group as passed in to the group argument in read_nca
summary_observations_vs_time(
pkpain_nca,
figure = (; fontsize = 22, size = (800, 1000)),
color = "black",
linewidth = 3,
axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
)A full NCA Report is now obtained for completeness purposes using the run_nca function, but later we will only extract a couple of key metrics of interest.
pk_nca = run_nca(pkpain_nca; sigdigits = 3)We can look at the NCA fits for some subjects. Here f is a vector or figures. We’ll showcase the first image by indexing f:
f = subject_fits(
pk_nca,
paginate = true,
axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
# Legend options
legend = (; position = :bottom),
)
f[1]As CTMNopain’s effect maybe mainly related to maximum concentration (cmax) or area under the curve (auc), we present some summary statistics using the summarize function from NCA.
strata = [:Dose]1-element Vector{Symbol}:
:Dose
params = [:cmax, :aucinf_obs]2-element Vector{Symbol}:
:cmax
:aucinf_obs
output = summarize(pk_nca; stratify_by = strata, parameters = params)| Row | Dose | parameters | numsamples | minimum | maximum | mean | std | geomean | geostd | geomeanCV |
|---|---|---|---|---|---|---|---|---|---|---|
| Int64 | String | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 5 | cmax | 40 | 0.19 | 0.539 | 0.356075 | 0.0884129 | 0.345104 | 1.2932 | 26.1425 |
| 2 | 5 | aucinf_obs | 40 | 0.914 | 3.4 | 1.5979 | 0.490197 | 1.53373 | 1.32974 | 29.0868 |
| 3 | 20 | cmax | 40 | 0.933 | 2.7 | 1.4737 | 0.361871 | 1.43408 | 1.2633 | 23.6954 |
| 4 | 20 | aucinf_obs | 40 | 2.77 | 14.1 | 6.377 | 2.22239 | 6.02031 | 1.41363 | 35.6797 |
| 5 | 80 | cmax | 40 | 3.3 | 8.47 | 5.787 | 1.31957 | 5.64164 | 1.25757 | 23.2228 |
| 6 | 80 | aucinf_obs | 40 | 13.7 | 49.1 | 29.5 | 8.68984 | 28.2954 | 1.34152 | 30.0258 |
The statistics printed above are the default, but you can pass in your own statistics using the stats = [] argument to the summarize function.
We can look at a few parameter distribution plots.
parameters_vs_group(
pk_nca,
parameter = :cmax,
axis = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (ng/mL)"),
figure = (; fontsize = 18),
)Dose normalized PK parameters, cmax and aucinf were essentially dose proportional between for 5 mg, 20 mg and 80 mg doses. You can perform a simple regression to check the impact of dose on cmax:
dp = NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9)Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
Estimate low CI 90% high CI 90%
────────────────────────────────────
β 1.00775 0.97571 1.0398
────────────────────────────────────
Here’s a visualization for the dose linearity using a power model for cmax:
power_model(dp; legend = (; position = :bottom))We can also visualize a dose proportionality results with respect to a specific endpoint in a NCA Report; for example cmax and aucinf_obs:
dose_vs_dose_normalized(pk_nca, :cmax)dose_vs_dose_normalized(pk_nca, :aucinf_obs)Based on visual inspection of the concentration time profiles as seen earlier, CTMNopain exhibited monophasic decline, and perhaps a one compartment model best fits the PK data.
3.2 Pharmacokinetic modeling
As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one and two compartment structural models to assess best fit. Further, different residual error models will also be tested.
We will use the results from NCA to provide us good initial estimates.
3.2.1 Data preparation for modeling
PumasNDF requires the presence of :evid and :cmt columns in the dataset.
@rtransform! pkpain_noplb_df begin
:evid = :Time == 0 ? 1 : 0
:cmt = :Time == 0 ? 1 : 2
:cmt2 = 1 # for zero order absorption
endFurther, observations at time of dosing, i.e., when evid = 1 have to be missing
@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :ConcThe dataframe will now be converted to a Population using read_pumas. Note that both observations and covariates are required to be an array even if it is one element.
pkpain_noplb = read_pumas(
pkpain_noplb_df;
id = :Subject,
time = :Time,
amt = :amt,
observations = [:Conc],
covariates = [:Dose],
evid = :evid,
cmt = :cmt,
)Population
Subjects: 120
Covariates: Dose
Observations: Conc
Now that the data is transformed to a Population of subjects, we can explore different models.
3.2.2 One-compartment model
If you are not familiar yet with the @model blocks and syntax, please check our documentation.
pk_1cmp = @model begin
@metadata begin
desc = "One Compartment Model"
timeu = u"hr"
end
@param begin
"""
Clearance (L/hr)
"""
tvcl ∈ RealDomain(; lower = 0, init = 3.2)
"""
Volume (L)
"""
tvv ∈ RealDomain(; lower = 0, init = 16.4)
"""
Absorption rate constant (h-1)
"""
tvka ∈ RealDomain(; lower = 0, init = 3.8)
"""
- ΩCL
- ΩVc
- ΩKa
"""
Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04])
"""
Proportional RUV
"""
σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
end
@random begin
η ~ MvNormal(Ω)
end
@covariates begin
"""
Dose (mg)
"""
Dose
end
@pre begin
CL = tvcl * exp(η[1])
Vc = tvv * exp(η[2])
Ka = tvka * exp(η[3])
end
@dynamics Depots1Central1
@derived begin
cp := @. Central / Vc
"""
CTMx Concentration (ng/mL)
"""
Conc ~ @. Normal(cp, abs(cp) * σ_p)
end
end┌ Warning: Covariate Dose is not used in the model. └ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/6G31F/src/dsl/model_macro.jl:3167
PumasModel
Parameters: tvcl, tvv, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
Note that the local assignment := can be used to define intermediate statements that will not be carried outside of the block. This means that all the resulting data workflows from this model will not contain the intermediate variables defined with :=. We use this when we want to suppress the variable from any further output.
The idea behind := is for performance reasons. If you are not carrying the variable defined with := outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.
Before going to fit the model, let’s evaluate some helpful steps via simulation to check appropriateness of data and model
# zero out the random effects
etas = zero_randeffs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp))Above, we are generating a vector of η’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs and init_randeffs. Please refer to the documentation.
simpk_iparams = simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas)Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_iparams;
observations = [:Conc],
figure = (; fontsize = 18),
axis = (;
xlabel = "Time (hr)",
ylabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
),
)Our NCA based initial guess on the parameters seem to work well.
Lets change the initial estimate of a couple of the parameters to evaluate the sensitivity.
pkparam = (; init_params(pk_1cmp)..., tvka = 2, tvv = 10)(tvcl = 3.2,
tvv = 10,
tvka = 2,
Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04],
σ_p = 0.2,)
simpk_changedpars = simobs(pk_1cmp, pkpain_noplb, pkparam, etas)Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
pk_1cmp,
simpk_changedpars;
observations = [:Conc],
figure = (; fontsize = 18),
axis = (
xlabel = "Time (hr)",
ylabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
),
)Changing the tvka and decreasing the tvv seemed to make an impact and observations go through the simulated lines.
To get a quick ballpark estimate of your PK parameters, we can do a NaivePooled analysis.
3.2.2.1 NaivePooled
pkfit_np = fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,))[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 7.744356e+02 3.715711e+03 * time: 0.042330026626586914 1 2.343899e+02 1.747348e+03 * time: 1.63771390914917 2 9.696232e+01 1.198088e+03 * time: 1.6413180828094482 3 -7.818699e+01 5.538151e+02 * time: 1.6439359188079834 4 -1.234803e+02 2.462514e+02 * time: 1.6465020179748535 5 -1.372888e+02 2.067458e+02 * time: 1.649204969406128 6 -1.410579e+02 1.162950e+02 * time: 1.651879072189331 7 -1.434754e+02 5.632816e+01 * time: 1.6545019149780273 8 -1.453401e+02 7.859270e+01 * time: 1.6570310592651367 9 -1.498185e+02 1.455606e+02 * time: 1.6597371101379395 10 -1.534371e+02 1.303682e+02 * time: 1.6623129844665527 11 -1.563557e+02 5.975474e+01 * time: 1.664820909500122 12 -1.575052e+02 9.308611e+00 * time: 1.667341947555542 13 -1.579357e+02 1.234484e+01 * time: 1.6700220108032227 14 -1.581874e+02 7.478196e+00 * time: 1.672605037689209 15 -1.582981e+02 2.027162e+00 * time: 1.6752350330352783 16 -1.583375e+02 5.578262e+00 * time: 1.6777660846710205 17 -1.583556e+02 4.727050e+00 * time: 1.6804981231689453 18 -1.583644e+02 2.340173e+00 * time: 1.6830580234527588 19 -1.583680e+02 7.738100e-01 * time: 1.6855559349060059 20 -1.583696e+02 3.300689e-01 * time: 1.6880531311035156 21 -1.583704e+02 3.641985e-01 * time: 1.6907470226287842 22 -1.583707e+02 4.365901e-01 * time: 1.6932449340820312 23 -1.583709e+02 3.887800e-01 * time: 1.695725917816162 24 -1.583710e+02 2.766977e-01 * time: 1.6981940269470215 25 -1.583710e+02 1.758029e-01 * time: 1.7007980346679688 26 -1.583710e+02 1.133947e-01 * time: 1.7033939361572266 27 -1.583710e+02 7.922544e-02 * time: 1.7059500217437744 28 -1.583710e+02 5.954998e-02 * time: 1.7084639072418213 29 -1.583710e+02 4.157080e-02 * time: 1.7111890316009521 30 -1.583710e+02 4.295446e-02 * time: 1.9475860595703125 31 -1.583710e+02 5.170752e-02 * time: 1.9513359069824219 32 -1.583710e+02 2.644382e-02 * time: 1.9541411399841309 33 -1.583710e+02 4.548987e-03 * time: 1.9569151401519775 34 -1.583710e+02 2.501805e-02 * time: 1.9597609043121338 35 -1.583710e+02 3.763439e-02 * time: 1.9618630409240723 36 -1.583710e+02 3.206027e-02 * time: 1.9640071392059326 37 -1.583710e+02 1.003700e-02 * time: 1.9661810398101807 38 -1.583710e+02 2.209084e-02 * time: 1.9683051109313965 39 -1.583710e+02 4.954136e-03 * time: 1.970552921295166 40 -1.583710e+02 1.609366e-02 * time: 1.9735710620880127 41 -1.583710e+02 1.579810e-02 * time: 1.975909948348999 42 -1.583710e+02 1.014156e-03 * time: 1.9782729148864746 43 -1.583710e+02 6.050792e-03 * time: 1.9814059734344482 44 -1.583710e+02 1.354381e-02 * time: 1.9838061332702637 45 -1.583710e+02 4.473216e-03 * time: 1.9860501289367676 46 -1.583710e+02 4.645458e-03 * time: 1.9882800579071045 47 -1.583710e+02 9.828063e-03 * time: 1.9904229640960693 48 -1.583710e+02 1.047215e-03 * time: 1.9925410747528076 49 -1.583710e+02 8.374104e-03 * time: 1.9946210384368896 50 -1.583710e+02 7.841995e-04 * time: 1.996819019317627
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 158.37103
------------------
Estimate
------------------
tvcl 3.0054
tvv 14.089
tvka 44.227
† Ω₁,₁ 0.0
† Ω₂,₂ 0.0
† Ω₃,₃ 0.0
σ_p 0.32999
------------------
† indicates constant parameters
coefficients_table(pkfit_np)| Row | Parameter | Description | Constant | Estimate |
|---|---|---|---|---|
| String | SubStrin… | Bool | Float64 | |
| 1 | tvcl | Clearance (L/hr) | false | 3.005 |
| 2 | tvv | Volume (L) | false | 14.089 |
| 3 | tvka | Absorption rate constant (h-1) | false | 44.227 |
| 4 | Ω₁,₁ | ΩCL | true | 0.0 |
| 5 | Ω₂,₂ | ΩVc | true | 0.0 |
| 6 | Ω₃,₃ | ΩKa | true | 0.0 |
| 7 | σ_p | Proportional RUV | false | 0.33 |
The final estimates from the NaivePooled approach seem reasonably close to our initial guess from NCA, except for the tvka parameter. We will stick with our initial guess.
One way to be cautious before going into a complete fitting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject(s) pops out as unreasonable. There are a few ways of doing this:
- check the
loglikelihoodsubject wise - check if there any influential subjects
Below, we are basically checking if the initial estimates for any subject are way off that we are unable to compute the initial loglikelihood.
lls = [loglikelihood(pk_1cmp, subj, pkparam, FOCE()) for subj in pkpain_noplb]
# the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5))The distribution of the loglikelihood’s suggest no extreme outliers.
A more convenient way is to use the findinfluential function that provides a list of k top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.
influential_subjects = findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE())120-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "148", nll = 16.65965885684477)
(id = "135", nll = 16.648985190076335)
(id = "156", nll = 15.959069556607496)
(id = "159", nll = 15.441218240496484)
(id = "149", nll = 14.71513464411951)
(id = "88", nll = 13.09709837464614)
(id = "16", nll = 12.98228052193144)
(id = "61", nll = 12.65218290230368)
(id = "71", nll = 12.500330088085505)
(id = "59", nll = 12.241510254805235)
⋮
(id = "57", nll = -22.79767423253431)
(id = "93", nll = -22.836900711478208)
(id = "12", nll = -23.007742339519247)
(id = "123", nll = -23.292751843079234)
(id = "41", nll = -23.425412534960515)
(id = "99", nll = -23.535214841901112)
(id = "29", nll = -24.025959868383083)
(id = "52", nll = -24.164757842493685)
(id = "24", nll = -25.57209232565845)
3.2.2.2 FOCE
Now that we have a good handle on our data, lets go ahead and fit a population model with FOCE:
pkfit_1cmp = fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2))[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -5.935351e+02 5.597318e+02 * time: 5.3882598876953125e-5 1 -7.022088e+02 1.707063e+02 * time: 0.7287938594818115 2 -7.314067e+02 2.903269e+02 * time: 1.1674559116363525 3 -8.520591e+02 2.285888e+02 * time: 1.2807550430297852 4 -1.120191e+03 3.795410e+02 * time: 1.5146968364715576 5 -1.178784e+03 2.323978e+02 * time: 1.641737937927246 6 -1.218320e+03 9.699907e+01 * time: 2.2205889225006104 7 -1.223641e+03 5.862105e+01 * time: 2.3235130310058594 8 -1.227620e+03 1.831402e+01 * time: 2.429313898086548 9 -1.228381e+03 2.132323e+01 * time: 4.44721794128418 10 -1.230098e+03 2.921228e+01 * time: 4.547890901565552 11 -1.230854e+03 2.029662e+01 * time: 4.646931886672974 12 -1.231116e+03 5.229099e+00 * time: 4.741170883178711 13 -1.231179e+03 1.689232e+00 * time: 4.834592819213867 14 -1.231187e+03 1.215379e+00 * time: 4.930938959121704 15 -1.231188e+03 2.770381e-01 * time: 5.017364978790283 16 -1.231188e+03 1.636652e-01 * time: 5.0932090282440186 17 -1.231188e+03 2.701149e-01 * time: 5.160506963729858 18 -1.231188e+03 3.163341e-01 * time: 5.22962498664856 19 -1.231188e+03 1.505153e-01 * time: 5.386631965637207 20 -1.231188e+03 2.485002e-02 * time: 5.452517032623291 21 -1.231188e+03 8.435209e-04 * time: 5.512495040893555
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
-------------------
Estimate
-------------------
tvcl 3.1642
tvv 13.288
† tvka 2.0
Ω₁,₁ 0.08494
Ω₂,₂ 0.048568
Ω₃,₃ 5.5811
σ_p 0.10093
-------------------
† indicates constant parameters
infer(pkfit_1cmp)[ Info: Calculating: variance-covariance matrix. [ Info: Done.
Asymptotic inference results using sandwich estimator
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
---------------------------------------------------------
Estimate SE 95.0% C.I.
---------------------------------------------------------
tvcl 3.1642 0.08662 [ 2.9944 ; 3.334 ]
tvv 13.288 0.27481 [ 12.749 ; 13.827 ]
† tvka 2.0 NaN [ NaN ; NaN ]
Ω₁,₁ 0.08494 0.011022 [ 0.063338; 0.10654 ]
Ω₂,₂ 0.048568 0.0063502 [ 0.036122; 0.061014]
Ω₃,₃ 5.5811 1.2188 [ 3.1922 ; 7.97 ]
σ_p 0.10093 0.0057196 [ 0.089718; 0.11214 ]
---------------------------------------------------------
† indicates constant parameters
Notice that tvka is fixed to 2 as we don’t have a lot of information before tmax. From the results above, we see that the parameter precision for this model is reasonable.
3.2.3 Two-compartment model
Just to be sure, let’s fit a 2-compartment model and evaluate:
pk_2cmp = @model begin
@param begin
"""
Clearance (L/hr)
"""
tvcl ∈ RealDomain(; lower = 0, init = 3.2)
"""
Central Volume (L)
"""
tvv ∈ RealDomain(; lower = 0, init = 16.4)
"""
Peripheral Volume (L)
"""
tvvp ∈ RealDomain(; lower = 0, init = 10)
"""
Distributional Clearance (L/hr)
"""
tvq ∈ RealDomain(; lower = 0, init = 2)
"""
Absorption rate constant (h-1)
"""
tvka ∈ RealDomain(; lower = 0, init = 1.3)
"""
- ΩCL
- ΩVc
- ΩKa
- ΩVp
- ΩQ
"""
Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
"""
Proportional RUV
"""
σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
end
@random begin
η ~ MvNormal(Ω)
end
@covariates begin
"""
Dose (mg)
"""
Dose
end
@pre begin
CL = tvcl * exp(η[1])
Vc = tvv * exp(η[2])
Ka = tvka * exp(η[3])
Vp = tvvp * exp(η[4])
Q = tvq * exp(η[5])
end
@dynamics Depots1Central1Periph1
@derived begin
cp := @. Central / Vc
"""
CTMx Concentration (ng/mL)
"""
Conc ~ @. Normal(cp, cp * σ_p)
end
end┌ Warning: Covariate Dose is not used in the model. └ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/6G31F/src/dsl/model_macro.jl:3167
PumasModel
Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
3.2.3.1 FOCE
pkfit_2cmp =
fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -6.302369e+02 1.021050e+03 * time: 3.0994415283203125e-5 1 -9.197817e+02 9.927951e+02 * time: 0.8132669925689697 2 -1.372640e+03 2.054986e+02 * time: 1.106928825378418 3 -1.446326e+03 1.543987e+02 * time: 1.3715379238128662 4 -1.545570e+03 1.855028e+02 * time: 1.6197879314422607 5 -1.581449e+03 1.713157e+02 * time: 1.982813835144043 6 -1.639433e+03 1.257382e+02 * time: 2.215089797973633 7 -1.695964e+03 7.450539e+01 * time: 2.4487428665161133 8 -1.722243e+03 5.961044e+01 * time: 2.687288999557495 9 -1.736883e+03 7.320921e+01 * time: 2.943798780441284 10 -1.753547e+03 7.501938e+01 * time: 3.191983938217163 11 -1.764053e+03 6.185661e+01 * time: 3.44085693359375 12 -1.778991e+03 4.831033e+01 * time: 3.707488775253296 13 -1.791492e+03 4.943278e+01 * time: 4.071454763412476 14 -1.799847e+03 2.871410e+01 * time: 4.5590739250183105 15 -1.805374e+03 7.520790e+01 * time: 4.9137938022613525 16 -1.816260e+03 2.990621e+01 * time: 5.247376918792725 17 -1.818252e+03 2.401915e+01 * time: 5.523160934448242 18 -1.822988e+03 2.587225e+01 * time: 5.807366847991943 19 -1.824653e+03 1.550517e+01 * time: 6.077073812484741 20 -1.826074e+03 1.788927e+01 * time: 6.342381954193115 21 -1.826821e+03 1.888389e+01 * time: 6.588862895965576 22 -1.827900e+03 1.432840e+01 * time: 6.864569902420044 23 -1.828511e+03 9.422040e+00 * time: 7.11158299446106 24 -1.828754e+03 5.363445e+00 * time: 7.379257917404175 25 -1.828862e+03 4.916168e+00 * time: 7.619025945663452 26 -1.829007e+03 4.695750e+00 * time: 7.859494924545288 27 -1.829358e+03 1.090244e+01 * time: 8.151124000549316 28 -1.829830e+03 1.451320e+01 * time: 8.609083890914917 29 -1.830201e+03 1.108695e+01 * time: 8.970133781433105 30 -1.830360e+03 2.892317e+00 * time: 9.262877941131592 31 -1.830390e+03 1.699265e+00 * time: 9.535704851150513 32 -1.830404e+03 1.602222e+00 * time: 9.821102857589722 33 -1.830432e+03 2.823676e+00 * time: 10.092804908752441 34 -1.830475e+03 4.121601e+00 * time: 10.34132695198059 35 -1.830527e+03 5.080494e+00 * time: 10.594703912734985 36 -1.830591e+03 2.668323e+00 * time: 10.861382961273193 37 -1.830615e+03 3.522601e+00 * time: 11.105571985244751 38 -1.830623e+03 2.203940e+00 * time: 11.349177837371826 39 -1.830625e+03 1.642394e+00 * time: 11.583603858947754 40 -1.830627e+03 9.396311e-01 * time: 11.805682897567749 41 -1.830628e+03 8.588414e-01 * time: 12.054863929748535 42 -1.830628e+03 3.457037e-01 * time: 12.273749828338623 43 -1.830629e+03 4.556038e-01 * time: 12.493207931518555 44 -1.830630e+03 6.366787e-01 * time: 12.717012882232666 45 -1.830630e+03 4.104090e-01 * time: 12.93876576423645 46 -1.830630e+03 7.434196e-02 * time: 13.171433925628662 47 -1.830630e+03 7.316846e-02 * time: 13.434865951538086 48 -1.830630e+03 7.320992e-02 * time: 13.706913948059082 49 -1.830630e+03 7.471716e-02 * time: 13.953266859054565 50 -1.830630e+03 7.471716e-02 * time: 14.229801893234253 51 -1.830630e+03 7.471716e-02 * time: 14.36996078491211
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 10
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoXChange
Log-likelihood value: 1830.6304
-------------------
Estimate
-------------------
tvcl 2.8137
tvv 11.005
tvvp 5.5401
tvq 1.5159
† tvka 2.0
Ω₁,₁ 0.10266
Ω₂,₂ 0.060778
Ω₃,₃ 1.2012
Ω₄,₄ 0.42347
Ω₅,₅ 0.24471
σ_p 0.048404
-------------------
† indicates constant parameters
3.3 Comparing One- versus Two-compartment models
The 2-compartment model has a much lower objective function compared to the 1-compartment. Let’s compare the estimates from the 2 models using the compare_estimates function.
compare_estimates(; pkfit_1cmp, pkfit_2cmp)| Row | parameter | pkfit_1cmp | pkfit_2cmp |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | tvcl | 3.1642 | 2.81373 |
| 2 | tvv | 13.288 | 11.0047 |
| 3 | tvka | 2.0 | 2.0 |
| 4 | Ω₁,₁ | 0.0849405 | 0.102656 |
| 5 | Ω₂,₂ | 0.0485682 | 0.060778 |
| 6 | Ω₃,₃ | 5.58107 | 1.20118 |
| 7 | σ_p | 0.100928 | 0.0484045 |
| 8 | tvvp | missing | 5.54005 |
| 9 | tvq | missing | 1.51591 |
| 10 | Ω₄,₄ | missing | 0.423466 |
| 11 | Ω₅,₅ | missing | 0.244713 |
We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.
lrtest(pkfit_1cmp, pkfit_2cmp)Statistic: 1200.0
Degrees of freedom: 4
P-value: 0.0
We should also compare the other metrics and statistics, such ηshrinkage, ϵshrinkage, aic, and bic using the metrics_table function.
@chain metrics_table(pkfit_2cmp) begin
leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
endWARNING: using deprecated binding Distributions.MatrixReshaped in Pumas.
, use Distributions.ReshapedDistribution{2, S, D} where D<:Distributions.Distribution{Distributions.ArrayLikeVariate{1}, S} where S<:Distributions.ValueSupport instead.
| Row | Metric | pk2cmp | pk1cmp |
|---|---|---|---|
| String | Any | Any | |
| 1 | Successful | true | true |
| 2 | Estimation Time | 14.372 | 5.513 |
| 3 | Subjects | 120 | 120 |
| 4 | Fixed Parameters | 1 | 1 |
| 5 | Optimized Parameters | 10 | 6 |
| 6 | Conc Active Observations | 1320 | 1320 |
| 7 | Conc Missing Observations | 0 | 0 |
| 8 | Total Active Observations | 1320 | 1320 |
| 9 | Total Missing Observations | 0 | 0 |
| 10 | Likelihood Approximation | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} |
| 11 | LogLikelihood (LL) | 1830.63 | 1231.19 |
| 12 | -2LL | -3661.26 | -2462.38 |
| 13 | AIC | -3641.26 | -2450.38 |
| 14 | BIC | -3589.41 | -2419.26 |
| 15 | (η-shrinkage) η₁ | 0.037 | 0.016 |
| 16 | (η-shrinkage) η₂ | 0.047 | 0.04 |
| 17 | (η-shrinkage) η₃ | 0.516 | 0.733 |
| 18 | (ϵ-shrinkage) Conc | 0.185 | 0.105 |
| 19 | (η-shrinkage) η₄ | 0.287 | missing |
| 20 | (η-shrinkage) η₅ | 0.154 | missing |
We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect the diagnostics of our model fit.
res_inspect_1cmp = inspect(pkfit_1cmp)[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
res_inspect_2cmp = inspect(pkfit_2cmp)[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
gof_1cmp = goodness_of_fit(
res_inspect_1cmp;
figure = (; fontsize = 12),
legend = (; position = :bottom),
)gof_2cmp = goodness_of_fit(
res_inspect_2cmp;
figure = (; fontsize = 12),
legend = (; position = :bottom),
)These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.
We can look at selected sample of individual plots.
fig_subject_fits = subject_fits(
res_inspect_2cmp;
separate = true,
paginate = true,
figure = (; fontsize = 18),
axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
)
fig_subject_fits[1]There a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we are checking the distribution of the empirical Bayes estimates.
empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)empirical_bayes_vs_covariates(
res_inspect_2cmp;
categorical = [:Dose],
figure = (; size = (600, 800)),
)Clearly, our guess at tvka seems off-target. Let’s try and estimate tvka instead of fixing it to 2:
pkfit_2cmp_unfix_ka = fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE())[ Info: Checking the initial parameter values. [ Info: The initial negative log likelihood and its gradient are finite. Check passed. Iter Function value Gradient norm 0 -3.200734e+02 1.272671e+03 * time: 2.09808349609375e-5 1 -8.682982e+02 1.000199e+03 * time: 1.3310010433197021 2 -1.381870e+03 5.008081e+02 * time: 3.7517240047454834 3 -1.551053e+03 6.833490e+02 * time: 4.063700914382935 4 -1.680887e+03 1.834586e+02 * time: 4.381908893585205 5 -1.726118e+03 8.870274e+01 * time: 4.754093885421753 6 -1.761023e+03 1.162036e+02 * time: 5.020036935806274 7 -1.786619e+03 1.114552e+02 * time: 5.329725980758667 8 -1.863556e+03 9.914305e+01 * time: 5.6320579051971436 9 -1.882942e+03 5.342676e+01 * time: 5.925926923751831 10 -1.888020e+03 2.010181e+01 * time: 6.210546016693115 11 -1.889832e+03 1.867262e+01 * time: 6.503090858459473 12 -1.891649e+03 1.668510e+01 * time: 6.792168855667114 13 -1.892615e+03 1.820707e+01 * time: 7.105893850326538 14 -1.893453e+03 1.745193e+01 * time: 7.395514965057373 15 -1.894760e+03 1.850174e+01 * time: 7.679613828659058 16 -1.895647e+03 1.773921e+01 * time: 7.9875569343566895 17 -1.896597e+03 1.143421e+01 * time: 8.301094055175781 18 -1.897114e+03 9.720034e+00 * time: 8.592867851257324 19 -1.897373e+03 6.054160e+00 * time: 8.906024932861328 20 -1.897498e+03 3.985923e+00 * time: 9.192986965179443 21 -1.897571e+03 4.262502e+00 * time: 9.492432832717896 22 -1.897633e+03 4.010316e+00 * time: 9.78183102607727 23 -1.897714e+03 4.805389e+00 * time: 10.073115825653076 24 -1.897802e+03 3.508614e+00 * time: 10.383512020111084 25 -1.897865e+03 3.691472e+00 * time: 10.668675899505615 26 -1.897900e+03 2.982676e+00 * time: 10.94167685508728 27 -1.897928e+03 2.563863e+00 * time: 11.232960939407349 28 -1.897968e+03 3.261530e+00 * time: 11.50447392463684 29 -1.898013e+03 3.064695e+00 * time: 11.775993824005127 30 -1.898040e+03 1.636456e+00 * time: 12.074547052383423 31 -1.898051e+03 1.439998e+00 * time: 12.390280961990356 32 -1.898057e+03 1.436505e+00 * time: 12.647377967834473 33 -1.898069e+03 1.881592e+00 * time: 12.92654299736023 34 -1.898095e+03 3.253228e+00 * time: 13.199355840682983 35 -1.898142e+03 4.257954e+00 * time: 13.476202964782715 36 -1.898199e+03 3.685153e+00 * time: 13.772341012954712 37 -1.898245e+03 2.567377e+00 * time: 14.048964023590088 38 -1.898246e+03 2.561577e+00 * time: 14.441490888595581 39 -1.898251e+03 2.530928e+00 * time: 14.778038024902344 40 -1.898298e+03 2.673773e+00 * time: 15.078016996383667 41 -1.898300e+03 2.795859e+00 * time: 15.404392004013062 42 -1.898337e+03 3.666102e+00 * time: 15.774720907211304 43 -1.898342e+03 3.753077e+00 * time: 16.181099891662598 44 -1.898429e+03 4.461850e+00 * time: 16.568289041519165 45 -1.898461e+03 3.584769e+00 * time: 16.843650817871094 46 -1.898477e+03 2.357431e+00 * time: 17.147682905197144 47 -1.898479e+03 7.373685e-01 * time: 17.42925190925598 48 -1.898479e+03 6.197624e-01 * time: 17.755481958389282 49 -1.898480e+03 1.716594e+00 * time: 18.1113338470459 50 -1.898480e+03 1.128276e+00 * time: 18.365583896636963 51 -1.898480e+03 9.157160e-01 * time: 18.679677963256836 52 -1.898480e+03 3.051694e-01 * time: 18.983877897262573 53 -1.898480e+03 2.316257e-01 * time: 19.24250888824463 54 -1.898480e+03 2.316260e-01 * time: 19.497082948684692 55 -1.898480e+03 2.344996e-01 * time: 19.954659938812256 56 -1.898480e+03 2.372822e-01 * time: 20.41219687461853 57 -1.898480e+03 2.372809e-01 * time: 20.883453845977783 58 -1.898480e+03 2.372808e-01 * time: 21.333900928497314 59 -1.898480e+03 2.372808e-01 * time: 21.81968593597412 60 -1.898480e+03 2.372808e-01 * time: 22.30205988883972 61 -1.898480e+03 2.372808e-01 * time: 22.787372827529907 62 -1.898480e+03 2.372808e-01 * time: 23.26146101951599
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
0 11
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: 1898.4797
-----------------
Estimate
-----------------
tvcl 2.6191
tvv 11.378
tvvp 8.4529
tvq 1.3164
tvka 4.8925
Ω₁,₁ 0.13243
Ω₂,₂ 0.059669
Ω₃,₃ 0.41581
Ω₄,₄ 0.080679
Ω₅,₅ 0.24996
σ_p 0.049098
-----------------
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)| Row | parameter | pkfit_2cmp | pkfit_2cmp_unfix_ka |
|---|---|---|---|
| String | Float64? | Float64? | |
| 1 | tvcl | 2.81373 | 2.61912 |
| 2 | tvv | 11.0047 | 11.3783 |
| 3 | tvvp | 5.54005 | 8.45292 |
| 4 | tvq | 1.51591 | 1.31636 |
| 5 | tvka | 2.0 | 4.89251 |
| 6 | Ω₁,₁ | 0.102656 | 0.132433 |
| 7 | Ω₂,₂ | 0.060778 | 0.0596692 |
| 8 | Ω₃,₃ | 1.20118 | 0.415807 |
| 9 | Ω₄,₄ | 0.423466 | 0.080679 |
| 10 | Ω₅,₅ | 0.244713 | 0.249963 |
| 11 | σ_p | 0.0484045 | 0.0490975 |
Let’s revaluate the goodness of fits and η distribution plots.
Not much change in the general gof plots
res_inspect_2cmp_unfix_ka = inspect(pkfit_2cmp_unfix_ka)[ Info: Calculating predictions. [ Info: Calculating weighted residuals. [ Info: Calculating empirical bayes. [ Info: Evaluating dose control parameters. [ Info: Evaluating individual parameters. [ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(
res_inspect_2cmp_unfix_ka;
figure = (; fontsize = 12),
legend = (; position = :bottom),
)But you can see a huge improvement in the ηka, (η₃) distribution which is now centered around zero
empirical_bayes_vs_covariates(
res_inspect_2cmp_unfix_ka;
categorical = [:Dose],
ebes = [:η₃],
figure = (; size = (600, 800)),
)Finally looking at some individual plots for the same subjects as earlier:
fig_subject_fits2 = subject_fits(
res_inspect_2cmp_unfix_ka;
separate = true,
paginate = true,
facet = (; linkyaxes = false),
figure = (; fontsize = 18),
axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
)
fig_subject_fits2[6]The randomly sampled individual fits don’t seem good in some individuals, but we can evaluate this via a vpc to see how to go about.
3.4 Visual Predictive Checks (VPC)
We can now perform a vpc to check. The default plots provide a 80% prediction interval and a 95% simulated CI (shaded area) around each of the quantiles
pk_vpc = vpc(pkfit_2cmp_unfix_ka, 200; observations = [:Conc], stratify_by = [:Dose])[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 200
Subjects in data: 120
Stratification variable(s): [:Dose]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(
pk_2cmp,
pk_vpc;
rows = 1,
columns = 3,
figure = (; size = (1400, 1000), fontsize = 22),
axis = (;
xlabel = "Time (hr)",
ylabel = "Observed/Predicted\n CTMx Concentration (ng/mL)",
),
facet = (; combinelabels = true),
)The visual predictive check suggests that the model captures the data well across all dose levels.
4 Additional Help
If you have questions regarding this tutorial, please post them on our discourse site.