A Comprehensive Introduction to Pumas
This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead to provide a narrative along the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.
As part of this workflow, you will be introduced to various aspects such as:
- Data wrangling in Julia
- Exploratory analysis in Julia
- Continuous data non-linear mixed effects modeling in Pumas
- Model comparison routines, post-processing, validation, etc.
1 The Study and Design
CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5 mg, 20 mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0, 0.5, 1, 1.5, 2, 2.5, and 3-8 hours.
Pain scores (0 = no pain, 1 = mild, 2 = moderate, 3 = severe) were obtained at the time points when plasma concentration was collected. A pain score of 2 or more is considered as no pain relief.
Subjects can request remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status are available for subjects.
The pharmacokinetic dataset can be accessed using PharmaDatasets.jl.
2 Setup
2.1 Load libraries
These libraries provide the workhorse functionality in the Pumas ecosystem:
using Pumas
using PumasUtilities
using NCA
using NCAUtilities
In addition, the libraries below are good add-ons that provide ancillary functionality:
using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets
2.2 Data Wrangling
We start by reading in the dataset and making some quick summaries.
If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!
pkpain_df = dataset("pk_painrelief")
first(pkpain_df, 5)
| Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 |
| 1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
| 2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
| 3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
| 4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
| 5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
Let’s filter out the placebo data as we don’t need that for the PK analysis.
pkpain_noplb_df = @rsubset pkpain_df :Dose != "Placebo";
first(pkpain_noplb_df, 5)
| Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 |
| 1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |
| 2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |
| 3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |
| 4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |
| 5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |
3 Analysis
3.1 Non-compartmental analysis
Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route column and an :amt column that specifies the dose. So, let’s add that in:
@rtransform! pkpain_noplb_df begin
    :route = "ev"
    :Dose = parse(Int, chop(:Dose; tail = 3))
end
We also need to create an :amt column:
@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missing
Now, we map the data variables to the read_nca function that prepares the data for NCA analysis.
pkpain_nca = read_nca(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = :Conc,
    group = [:Dose],
    route = :route,
)
NCAPopulation (120 subjects):
Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
Number of missing observations: 0
Number of blq observations: 0
Now that we have mapped the data in, let’s visualize the concentration vs. time plots for a few individuals. When paginate is set to true, a vector of plots is returned; below we display the first element with 9 individuals.
f = observations_vs_time(
    pkpain_nca;
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMNoPain Concentration (μg/mL)"),
)
f[1]
Or you can view the summary curves by dose group, as passed in to the group argument in read_nca:
summary_observations_vs_time(
    pkpain_nca,
    figure = (; fontsize = 22, size = (800, 1000)),
    color = "black",
    linewidth = 3,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
)
A full NCA report is now obtained for completeness using the run_nca function, but later we will only extract a couple of key metrics of interest.
pk_nca = run_nca(pkpain_nca; sigdigits = 3)
We can look at the NCA fits for some subjects. Here f is a vector of figures. We’ll showcase the first image by indexing f:
f = subject_fits(
    pk_nca,
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
    # Legend options
    legend = (; position = :bottom),
)
f[1]
As CTMNopain’s effect may be mainly related to maximum concentration (cmax) or area under the curve (auc), we present some summary statistics using the summarize function from NCA.
strata = [:Dose]
1-element Vector{Symbol}:
:Dose
params = [:cmax, :aucinf_obs]
2-element Vector{Symbol}:
:cmax
:aucinf_obs
output = summarize(pk_nca; stratify_by = strata, parameters = params)
| Row | Dose | parameters | numsamples | minimum | maximum | mean | std | geomean | geostd | geomeanCV |
|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
| 1 | 5 | cmax | 40 | 0.19 | 0.539 | 0.356075 | 0.0884129 | 0.345104 | 1.2932 | 26.1425 |
| 2 | 5 | aucinf_obs | 40 | 0.914 | 3.4 | 1.5979 | 0.490197 | 1.53373 | 1.32974 | 29.0868 |
| 3 | 20 | cmax | 40 | 0.933 | 2.7 | 1.4737 | 0.361871 | 1.43408 | 1.2633 | 23.6954 |
| 4 | 20 | aucinf_obs | 40 | 2.77 | 14.1 | 6.377 | 2.22239 | 6.02031 | 1.41363 | 35.6797 |
| 5 | 80 | cmax | 40 | 3.3 | 8.47 | 5.787 | 1.31957 | 5.64164 | 1.25757 | 23.2228 |
| 6 | 80 | aucinf_obs | 40 | 13.7 | 49.1 | 29.5 | 8.68984 | 28.2954 | 1.34152 | 30.0258 |
The statistics printed above are the default, but you can pass in your own statistics using the stats = [] argument to the summarize function.
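The geomean, geostd and geomeanCV columns can be understood in terms of log-transformed values. The following is a minimal sketch in plain Julia that does not call NCA.jl. It assumes the common definition geomeanCV = 100 × sqrt(exp(s²) − 1), where s is the standard deviation on the log scale; that definition is an assumption here, although it is consistent with the first table row (geostd 1.2932, geomeanCV about 26.14). The helper name geometric_summary and the example values are hypothetical.

```julia
using Statistics

# Hypothetical helper (not part of NCA.jl): geometric summary of positive values.
function geometric_summary(x::AbstractVector{<:Real})
    logx = log.(x)
    s = std(logx)                              # sample SD on the log scale
    return (
        geomean = exp(mean(logx)),             # geometric mean
        geostd = exp(s),                       # geometric standard deviation
        geomeanCV = 100 * sqrt(exp(s^2) - 1),  # geometric CV in percent (assumed definition)
    )
end

# Made-up cmax values, for illustration only (not taken from the dataset)
cmax_example = [0.30, 0.35, 0.41, 0.28, 0.39]
summary_example = geometric_summary(cmax_example)

# Cross-check of the assumed definition against the first table row:
# geostd = 1.2932 should map to a geomeanCV of roughly 26.14
cv_from_geostd = 100 * sqrt(exp(log(1.2932)^2) - 1)
```

A custom statistic passed to summarize would presumably be an ordinary function of a vector like the ones used inside this helper, but the exact form expected by the stats argument should be checked in the NCA documentation.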
We can look at a few parameter distribution plots.
parameters_vs_group(
    pk_nca,
    parameter = :cmax,
    axis = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (μg/mL)"),
    figure = (; fontsize = 18),
)
The PK parameters cmax and aucinf_obs were essentially dose proportional across the 5 mg, 20 mg and 80 mg doses. You can perform a simple regression to check the impact of dose on cmax:
dp = NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9)
Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
StatsBase.CoefTable([[1.0077528981236414], [0.9757097548045869], [1.039796041442696]], ["Estimate", "low CI 90%", "high CI 90%"], ["β"], 0, 0)
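To make the power model concrete, here is a minimal sketch in plain Julia that fits log(cmax) ~ log(α) + β × log(dose) by ordinary least squares to the three geometric mean cmax values from the summary table. It is only an illustration: it uses group-level values, so it gives a point estimate of the slope but no confidence interval, and it is not how NCA.DoseLinearityPowerModel is implemented. The helper name fit_power_model is hypothetical.

```julia
using Statistics

# Hypothetical helper: ordinary least squares for log(y) = log(α) + β * log(dose)
function fit_power_model(dose, y)
    x = log.(dose)
    ly = log.(y)
    β = cov(x, ly) / var(x)          # slope on the log-log scale
    logα = mean(ly) - β * mean(x)    # intercept on the log-log scale
    return (α = exp(logα), β = β)
end

doses = [5.0, 20.0, 80.0]
# Geometric mean cmax per dose group, copied from the summary table
cmax_geomean = [0.345104, 1.43408, 5.64164]

pm = fit_power_model(doses, cmax_geomean)
# A slope β close to 1 indicates that cmax scales proportionally with dose
```

The slope from this sketch is very close to 1, in line with the estimate and the 90% interval reported above, which contains 1.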
Here’s a visualization for the dose linearity using a power model for cmax:
power_model(dp; legend = (; position = :bottom))
We can also visualize dose proportionality results with respect to a specific endpoint in an NCA report, for example cmax and aucinf_obs:
dose_vs_dose_normalized(pk_nca, :cmax)
dose_vs_dose_normalized(pk_nca, :aucinf_obs)
Based on visual inspection of the concentration-time profiles seen earlier, CTMNopain exhibited monophasic decline, and perhaps a one-compartment model best fits the PK data.
3.2 Pharmacokinetic modeling
As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one and two compartment structural models to assess best fit. Further, different residual error models will also be tested.
We will use the results from NCA to provide us with good initial estimates.
3.2.1 Data preparation for modeling
PumasNDF requires the presence of :evid and :cmt columns in the dataset.
@rtransform! pkpain_noplb_df begin
    :evid = :Time == 0 ? 1 : 0
    :cmt = :Time == 0 ? 1 : 2
    :cmt2 = 1 # for zero order absorption
end
Further, observations at the time of dosing, i.e., when evid = 1, have to be missing:
@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc
The dataframe will now be converted to a Population using read_pumas. Note that both observations and covariates are required to be an array even if it is one element.
pkpain_noplb = read_pumas(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = [:Conc],
    covariates = [:Dose],
    evid = :evid,
    cmt = :cmt,
)
Population
Subjects: 120
Covariates: Dose
Observations: Conc
Now that the data is transformed to a Population of subjects, we can explore different models.
3.2.2 One-compartment model
If you are not familiar yet with the @model blocks and syntax, please check our documentation.
pk_1cmp = @model begin
    @metadata begin
        desc = "One Compartment Model"
        timeu = u"hr"
    end
    @param begin
        """
        Clearance (L/hr)
        """
        tvcl ∈ RealDomain(; lower = 0, init = 3.2)
        """
        Volume (L)
        """
        tvv ∈ RealDomain(; lower = 0, init = 16.4)
        """
        Absorption rate constant (h-1)
        """
        tvka ∈ RealDomain(; lower = 0, init = 3.8)
        """
        - ΩCL
        - ΩVc
        - ΩKa
        """
        Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end
    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
    end
    @dynamics Depots1Central1
    @derived begin
        cp := @. Central / Vc
        """
        CTMx Concentration (μg/mL)
        """
        Conc ~ @. ProportionalNormal(cp, σ_p)
    end
end
┌ Warning: Covariate Dose is not used in the model.
└ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/qxx5c/src/dsl/model_macro.jl:3399
PumasModel
Parameters: tvcl, tvv, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
Note that the local assignment := can be used to define intermediate statements that will not be carried outside of the block. This means that none of the outputs generated from this model will contain the intermediate variables defined with :=. We use this when we want to suppress the variable from any further output.
The motivation for := is performance. If you are not carrying the variable defined with := outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.
Before fitting the model, let’s run a few simulation-based checks on the appropriateness of the data and the model:
# zero out the random effects
etas = zero_randeffs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp))
Above, we are generating a vector of η’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs and init_randeffs. Please refer to the documentation.
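With all η set to zero, every subject's prediction reduces to the typical-value curve for their dose. As a rough illustration of what that curve looks like, the sketch below evaluates the standard closed-form solution of a one-compartment model with first-order absorption after a single dose. It is plain Julia and does not call Pumas; the function name onecmt_oral and the assumption of complete bioavailability are assumptions of this sketch, and the parameter values are the initial estimates from pk_1cmp.

```julia
# Hypothetical helper: concentration at time t (hr) after a single oral dose,
# assuming first-order absorption, one compartment, and complete bioavailability.
function onecmt_oral(t, dose; CL, Vc, Ka)
    k = CL / Vc    # elimination rate constant (1/hr)
    return dose * Ka / (Vc * (Ka - k)) * (exp(-k * t) - exp(-Ka * t))
end

times = 0.0:0.5:8.0
# Initial estimates from pk_1cmp (tvcl, tvv, tvka) and a 20 mg dose
cp_typical = [onecmt_oral(t, 20.0; CL = 3.2, Vc = 16.4, Ka = 3.8) for t in times]

# Time of the peak of this typical curve: tmax = log(Ka / k) / (Ka - k)
k = 3.2 / 16.4
tmax = log(3.8 / k) / (3.8 - k)
```

Comparing such a curve against the observed profiles is, in spirit, what the simobs and sim_plot calls that follow do for the whole population.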
simpk_iparams = simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas)
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_iparams;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (;
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMx Concentration (μg/mL)",
    ),
)
Our NCA-based initial guesses for the parameters seem to work well.
Let’s change the initial estimates of a couple of the parameters to evaluate the sensitivity.
pkparam = (; init_params(pk_1cmp)..., tvka = 2, tvv = 10)
(tvcl = 3.2, tvv = 10, tvka = 2, Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04], σ_p = 0.2)
simpk_changedpars = simobs(pk_1cmp, pkpain_noplb, pkparam, etas)
Simulated population (Vector{<:Subject})
Simulated subjects: 120
Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_changedpars;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMx Concentration (μg/mL)",
    ),
)
Changing tvka and decreasing tvv seemed to make an impact, and the observations go through the simulated lines.
To get a quick ballpark estimate of your PK parameters, we can do a NaivePooled analysis.
3.2.2.1 NaivePooled
pkfit_np = fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,))
┌ Warning: The `omegas` keyword argument is deprecated, use instead the `constantcoef` keyword argument to fix parameters to values for which the random effect distributions collapse to Dirac measures.
└ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/qxx5c/src/estimation/likelihoods.jl:5868
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 7.744356e+02 3.715711e+03 * time: 0.05847787857055664
1 2.343899e+02 1.747348e+03 * time: 2.548849105834961
2 9.696232e+01 1.198088e+03 * time: 2.5532519817352295
3 -7.818699e+01 5.538151e+02 * time: 2.556318998336792
4 -1.234803e+02 2.462514e+02 * time: 2.559623956680298
5 -1.372888e+02 2.067458e+02 * time: 2.5626699924468994
6 -1.410579e+02 1.162950e+02 * time: 2.565701961517334
7 -1.434754e+02 5.632816e+01 * time: 2.5687930583953857
8 -1.453401e+02 7.859270e+01 * time: 2.5719308853149414
9 -1.498185e+02 1.455606e+02 * time: 2.5747900009155273
10 -1.534371e+02 1.303682e+02 * time: 2.577686071395874
11 -1.563557e+02 5.975474e+01 * time: 2.580605983734131
12 -1.575052e+02 9.308611e+00 * time: 2.5836570262908936
13 -1.579357e+02 1.234484e+01 * time: 2.586484909057617
14 -1.581874e+02 7.478196e+00 * time: 2.5894479751586914
15 -1.582981e+02 2.027162e+00 * time: 2.5924899578094482
16 -1.583375e+02 5.578262e+00 * time: 2.595649003982544
17 -1.583556e+02 4.727050e+00 * time: 2.5988481044769287
18 -1.583644e+02 2.340173e+00 * time: 2.6017770767211914
19 -1.583680e+02 7.738100e-01 * time: 2.6047871112823486
20 -1.583696e+02 3.300689e-01 * time: 2.607938051223755
21 -1.583704e+02 3.641985e-01 * time: 2.6109659671783447
22 -1.583707e+02 4.365901e-01 * time: 2.613940954208374
23 -1.583709e+02 3.887800e-01 * time: 2.6168060302734375
24 -1.583710e+02 2.766977e-01 * time: 2.6198689937591553
25 -1.583710e+02 1.758029e-01 * time: 2.622728109359741
26 -1.583710e+02 1.133947e-01 * time: 2.625714063644409
27 -1.583710e+02 7.922544e-02 * time: 2.628567934036255
28 -1.583710e+02 5.954998e-02 * time: 2.631582021713257
29 -1.583710e+02 4.157080e-02 * time: 2.6344919204711914
30 -1.583710e+02 4.295446e-02 * time: 2.6374130249023438
31 -1.583710e+02 5.170752e-02 * time: 2.6402928829193115
32 -1.583710e+02 2.644382e-02 * time: 2.644141912460327
33 -1.583710e+02 4.548987e-03 * time: 2.6479361057281494
34 -1.583710e+02 2.501805e-02 * time: 2.6516740322113037
35 -1.583710e+02 3.763439e-02 * time: 2.6545040607452393
36 -1.583710e+02 3.206027e-02 * time: 2.6573400497436523
37 -1.583710e+02 1.003700e-02 * time: 2.6603760719299316
38 -1.583710e+02 2.209084e-02 * time: 2.663209915161133
39 -1.583710e+02 4.954136e-03 * time: 2.666029930114746
40 -1.583710e+02 1.609366e-02 * time: 2.669965982437134
41 -1.583710e+02 1.579810e-02 * time: 2.6728620529174805
42 -1.583710e+02 1.014156e-03 * time: 2.675676107406616
43 -1.583710e+02 6.050792e-03 * time: 2.6795730590820312
44 -1.583710e+02 1.354381e-02 * time: 2.682497024536133
45 -1.583710e+02 4.473216e-03 * time: 2.686980962753296
46 -1.583710e+02 4.645458e-03 * time: 2.689847946166992
47 -1.583710e+02 9.828063e-03 * time: 2.6929259300231934
48 -1.583710e+02 1.047215e-03 * time: 2.69602108001709
49 -1.583710e+02 8.374104e-03 * time: 2.6992640495300293
50 -1.583710e+02 7.841995e-04 * time: 2.7023138999938965
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: NaivePooled
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 158.37103
------------------
Estimate
------------------
tvcl 3.0054
tvv 14.089
tvka 44.227
† Ω₁,₁ 0.0
† Ω₂,₂ 0.0
† Ω₃,₃ 0.0
σ_p 0.32999
------------------
† indicates constant parameters
coefficients_table(pkfit_np)
| Row | Parameter | Description | Constant | Estimate |
|---|---|---|---|---|
| | String | SubStrin… | Bool | Float64 |
| 1 | tvcl | Clearance (L/hr) | false | 3.005 |
| 2 | tvv | Volume (L) | false | 14.089 |
| 3 | tvka | Absorption rate constant (h-1) | false | 44.227 |
| 4 | Ω₁,₁ | ΩCL | true | 0.0 |
| 5 | Ω₂,₂ | ΩVc | true | 0.0 |
| 6 | Ω₃,₃ | ΩKa | true | 0.0 |
| 7 | σ_p | Proportional RUV | false | 0.33 |
The final estimates from the NaivePooled approach seem reasonably close to our initial guess from NCA, except for the tvka parameter. We will stick with our initial guess.
One way to be cautious before going into a complete fitting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject(s) pops out as unreasonable. There are a few ways of doing this:
- check the loglikelihood subject-wise
- check if there are any influential subjects
Below, we are basically checking if the initial estimates for any subject are so far off that we are unable to compute the initial loglikelihood.
lls = [loglikelihood(pk_1cmp, subj, pkparam, FOCE()) for subj in pkpain_noplb]
# the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5))
The distribution of the loglikelihoods suggests no extreme outliers.
A more convenient way is to use the findinfluential function that provides a list of k top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.
influential_subjects = findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE())
120-element Vector{@NamedTuple{id::String, nll::Float64}}:
(id = "148", nll = 16.659658856844782)
(id = "135", nll = 16.648985190076324)
(id = "156", nll = 15.9590695566075)
(id = "159", nll = 15.441218240496484)
(id = "149", nll = 14.715134644119514)
(id = "88", nll = 13.09709837464614)
(id = "16", nll = 12.982280521931417)
(id = "61", nll = 12.652182902303675)
(id = "71", nll = 12.500330088085486)
(id = "59", nll = 12.241510254805224)
⋮
(id = "57", nll = -22.797674232534305)
(id = "93", nll = -22.836900711478222)
(id = "12", nll = -23.007742339519236)
(id = "123", nll = -23.292751843079227)
(id = "41", nll = -23.425412534960515)
(id = "99", nll = -23.53521484190102)
(id = "29", nll = -24.025959868383097)
(id = "52", nll = -24.164757842493685)
(id = "24", nll = -25.572092325658446)
3.2.2.2 FOCE
Now that we have a good handle on our data, let’s go ahead and fit a population model with FOCE:
pkfit_1cmp = fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -5.935351e+02 5.597318e+02 * time: 2.5033950805664062e-5
1 -7.022088e+02 1.707063e+02 * time: 0.7596111297607422
2 -7.314067e+02 2.903269e+02 * time: 1.8447060585021973
3 -8.520591e+02 2.285888e+02 * time: 2.1373469829559326
4 -1.120191e+03 3.795410e+02 * time: 2.474898099899292
5 -1.178784e+03 2.323978e+02 * time: 2.6116750240325928
6 -1.218320e+03 9.699907e+01 * time: 2.824363946914673
7 -1.223641e+03 5.862105e+01 * time: 2.949517011642456
8 -1.227620e+03 1.831402e+01 * time: 3.0958690643310547
9 -1.228381e+03 2.132323e+01 * time: 3.211419105529785
10 -1.230098e+03 2.921228e+01 * time: 3.3863439559936523
11 -1.230854e+03 2.029661e+01 * time: 3.5006630420684814
12 -1.231116e+03 5.229098e+00 * time: 3.6245639324188232
13 -1.231179e+03 1.689232e+00 * time: 3.816879987716675
14 -1.231187e+03 1.215379e+00 * time: 3.958785057067871
15 -1.231188e+03 2.770378e-01 * time: 4.129499912261963
16 -1.231188e+03 1.636651e-01 * time: 4.298337936401367
17 -1.231188e+03 2.701140e-01 * time: 4.5186240673065186
18 -1.231188e+03 3.163344e-01 * time: 4.62663197517395
19 -1.231188e+03 1.505255e-01 * time: 4.725805997848511
20 -1.231188e+03 2.483984e-02 * time: 4.851361036300659
21 -1.231188e+03 8.344378e-04 * time: 4.9513020515441895
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
-------------------
Estimate
-------------------
tvcl 3.1642
tvv 13.288
† tvka 2.0
Ω₁,₁ 0.08494
Ω₂,₂ 0.048568
Ω₃,₃ 5.5811
σ_p 0.10093
-------------------
† indicates constant parameters
infer(pkfit_1cmp)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 6
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1231.188
---------------------------------------------------------
Estimate SE 95.0% C.I.
---------------------------------------------------------
tvcl 3.1642 0.08662 [ 2.9944 ; 3.334 ]
tvv 13.288 0.27481 [ 12.749 ; 13.827 ]
† tvka 2.0 NaN [ NaN ; NaN ]
Ω₁,₁ 0.08494 0.011022 [ 0.063338; 0.10654 ]
Ω₂,₂ 0.048568 0.0063502 [ 0.036122; 0.061014]
Ω₃,₃ 5.5811 1.2189 [ 3.1922 ; 7.97 ]
σ_p 0.10093 0.0057196 [ 0.089718; 0.11214 ]
---------------------------------------------------------
† indicates constant parameters
Notice that tvka is fixed to 2 as we don’t have a lot of information before tmax. From the results above, we see that the parameter precision for this model is reasonable.
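One way to quantify "reasonable" precision is the relative standard error (%RSE), together with a Wald-type interval, estimate ± z × SE. The sketch below reproduces the tvcl row of the table in plain Julia. The value z = 1.96 for a 95% interval and the helper name wald_ci are assumptions of this sketch; it is not Pumas API, and the interval form is inferred from the printed numbers rather than from the Pumas documentation.

```julia
# Hypothetical helper: Wald-type confidence interval and relative standard error
function wald_ci(estimate, se; z = 1.96)
    return (
        lower = estimate - z * se,
        upper = estimate + z * se,
        rse_percent = 100 * se / abs(estimate),
    )
end

# Estimate and SE for tvcl, copied from the infer output
tvcl_ci = wald_ci(3.1642, 0.08662)
# Estimate and SE for Ω₃,₃ (variance of the random effect on Ka)
omega_ka_ci = wald_ci(5.5811, 1.2189)
```

In this sketch the %RSE of tvcl is below 3%, while that of Ω₃,₃ is above 20%, which fits the remark that there is little information on absorption before tmax.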
3.2.3 Two-compartment model
Just to be sure, let’s fit a 2-compartment model and evaluate:
pk_2cmp = @model begin
    @param begin
        """
        Clearance (L/hr)
        """
        tvcl ∈ RealDomain(; lower = 0, init = 3.2)
        """
        Central Volume (L)
        """
        tvv ∈ RealDomain(; lower = 0, init = 16.4)
        """
        Peripheral Volume (L)
        """
        tvvp ∈ RealDomain(; lower = 0, init = 10)
        """
        Distributional Clearance (L/hr)
        """
        tvq ∈ RealDomain(; lower = 0, init = 2)
        """
        Absorption rate constant (h-1)
        """
        tvka ∈ RealDomain(; lower = 0, init = 1.3)
        """
        - ΩCL
        - ΩVc
        - ΩKa
        - ΩVp
        - ΩQ
        """
        Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end
    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
        Vp = tvvp * exp(η[4])
        Q = tvq * exp(η[5])
    end
    @dynamics Depots1Central1Periph1
    @derived begin
        cp := @. Central / Vc
        """
        CTMx Concentration (μg/mL)
        """
        Conc ~ @. ProportionalNormal(cp, σ_p)
    end
end
┌ Warning: Covariate Dose is not used in the model.
└ @ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/qxx5c/src/dsl/model_macro.jl:3399
PumasModel
Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
Random effects: η
Covariates: Dose
Dynamical system variables: Depot, Central, Peripheral
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
3.2.3.1 FOCE
pkfit_2cmp =
    fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -6.302369e+02 1.021050e+03 * time: 3.0994415283203125e-5
1 -9.197817e+02 9.927951e+02 * time: 1.5425379276275635
2 -1.372640e+03 2.054986e+02 * time: 1.876384973526001
3 -1.446326e+03 1.543987e+02 * time: 2.196737051010132
4 -1.545570e+03 1.855028e+02 * time: 2.5119059085845947
5 -1.581449e+03 1.713157e+02 * time: 2.914141893386841
6 -1.639433e+03 1.257382e+02 * time: 3.197417974472046
7 -1.695964e+03 7.450539e+01 * time: 3.475191116333008
8 -1.722243e+03 5.961044e+01 * time: 3.7582290172576904
9 -1.736883e+03 7.320921e+01 * time: 4.037923097610474
10 -1.753547e+03 7.501938e+01 * time: 4.323484897613525
11 -1.764053e+03 6.185661e+01 * time: 4.615663051605225
12 -1.778991e+03 4.831033e+01 * time: 4.919928073883057
13 -1.791492e+03 4.943278e+01 * time: 5.23445200920105
14 -1.799847e+03 2.871410e+01 * time: 5.647458076477051
15 -1.805374e+03 7.520790e+01 * time: 6.034950017929077
16 -1.816260e+03 2.990621e+01 * time: 6.371522903442383
17 -1.818252e+03 2.401915e+01 * time: 6.670742034912109
18 -1.822988e+03 2.587225e+01 * time: 6.973051071166992
19 -1.824653e+03 1.550517e+01 * time: 7.263144016265869
20 -1.826074e+03 1.788927e+01 * time: 7.542484998703003
21 -1.826821e+03 1.888389e+01 * time: 7.787806987762451
22 -1.827900e+03 1.432840e+01 * time: 8.035577058792114
23 -1.828511e+03 9.422040e+00 * time: 8.291120052337646
24 -1.828754e+03 5.363445e+00 * time: 8.551244974136353
25 -1.828862e+03 4.916168e+00 * time: 8.802757024765015
26 -1.829007e+03 4.695750e+00 * time: 9.058438062667847
27 -1.829358e+03 1.090244e+01 * time: 9.322493076324463
28 -1.829830e+03 1.451320e+01 * time: 9.592745065689087
29 -1.830201e+03 1.108695e+01 * time: 9.950222969055176
30 -1.830360e+03 2.892317e+00 * time: 10.385570049285889
31 -1.830390e+03 1.699267e+00 * time: 10.761224031448364
32 -1.830404e+03 1.602222e+00 * time: 11.058763027191162
33 -1.830432e+03 2.823304e+00 * time: 11.35127305984497
34 -1.830475e+03 4.117188e+00 * time: 11.651657104492188
35 -1.830527e+03 5.083753e+00 * time: 11.959084033966064
36 -1.830591e+03 2.670227e+00 * time: 12.269340991973877
37 -1.830615e+03 3.508079e+00 * time: 12.577236890792847
38 -1.830623e+03 2.313741e+00 * time: 12.885963916778564
39 -1.830625e+03 1.681301e+00 * time: 13.179666996002197
40 -1.830627e+03 9.723876e-01 * time: 13.463012933731079
41 -1.830628e+03 9.410007e-01 * time: 13.755944967269897
42 -1.830628e+03 3.486773e-01 * time: 13.999456882476807
43 -1.830629e+03 4.526039e-01 * time: 14.214864015579224
44 -1.830630e+03 6.846533e-01 * time: 14.457966089248657
45 -1.830630e+03 4.526146e-01 * time: 14.70970106124878
46 -1.830630e+03 8.729710e-02 * time: 14.947016954421997
47 -1.830630e+03 5.368952e-03 * time: 15.164999961853027
48 -1.830630e+03 2.370727e-03 * time: 15.37197494506836
49 -1.830630e+03 9.048753e-04 * time: 15.585478067398071
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
1 10
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: GradientNorm
Log-likelihood value: 1830.6305
-------------------
Estimate
-------------------
tvcl 2.8138
tvv 11.005
tvvp 5.54
tvq 1.5159
† tvka 2.0
Ω₁,₁ 0.10267
Ω₂,₂ 0.060776
Ω₃,₃ 1.2012
Ω₄,₄ 0.42349
Ω₅,₅ 0.24473
σ_p 0.048405
-------------------
† indicates constant parameters
3.3 Comparing One- versus Two-compartment models
The 2-compartment model has a much lower objective function compared to the 1-compartment. Let’s compare the estimates from the 2 models using the compare_estimates function.
compare_estimates(; pkfit_1cmp, pkfit_2cmp)
| Row | parameter | pkfit_1cmp | pkfit_2cmp |
|---|---|---|---|
| | String | Float64? | Float64? |
| 1 | tvcl | 3.1642 | 2.81378 |
| 2 | tvv | 13.288 | 11.0046 |
| 3 | tvka | 2.0 | 2.0 |
| 4 | Ω₁,₁ | 0.0849405 | 0.102669 |
| 5 | Ω₂,₂ | 0.0485682 | 0.0607756 |
| 6 | Ω₃,₃ | 5.58107 | 1.20116 |
| 7 | σ_p | 0.100928 | 0.0484049 |
| 8 | tvvp | missing | 5.53998 |
| 9 | tvq | missing | 1.51591 |
| 10 | Ω₄,₄ | missing | 0.423494 |
| 11 | Ω₅,₅ | missing | 0.244731 |
We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.
lrtest(pkfit_1cmp, pkfit_2cmp)
Statistic: 1200.0
Degrees of freedom: 4
P-value: 0.0
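The test statistic can be checked by hand: it is twice the difference in log-likelihood between the larger and the smaller model, referred to a χ² distribution whose degrees of freedom equal the number of additional estimated parameters. The sketch below does this in plain Julia using the log-likelihood values printed by the two fits. To avoid an extra package, it uses the closed-form χ² upper tail that holds for an even number of degrees of freedom. The helper name chisq_sf_even is hypothetical, and the sketch is not how lrtest is implemented.

```julia
# Hypothetical helper: upper-tail probability of a χ² distribution with an even
# number of degrees of freedom, P(X > x) = exp(-x/2) * Σ_{j < df/2} (x/2)^j / j!
function chisq_sf_even(x, df::Integer)
    iseven(df) || throw(ArgumentError("df must be even for this closed form"))
    half = x / 2
    return exp(-half) * sum(half^j / factorial(j) for j in 0:(df ÷ 2 - 1))
end

ll_1cmp = 1231.188    # log-likelihood of pkfit_1cmp, from its fit summary
ll_2cmp = 1830.6305   # log-likelihood of pkfit_2cmp, from its fit summary

# About 1198.9; the printed 1200.0 appears to be a rounded display of this value
lr_statistic = 2 * (ll_2cmp - ll_1cmp)
df_extra = 10 - 6     # optimized parameters: 10 (two-compartment) vs 6 (one-compartment)
p_value = chisq_sf_even(lr_statistic, df_extra)
```

The resulting p-value is positive but vanishingly small, which is why it prints as 0.0.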
We should also compare the other metrics and statistics, such as η-shrinkage, ϵ-shrinkage, AIC, and BIC, using the metrics_table function.
@chain metrics_table(pkfit_2cmp) begin
    leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
    rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
end
WARNING: using deprecated binding Distributions.MatrixReshaped in Pumas.
, use Distributions.ReshapedDistribution{2, S, D} where D<:Distributions.Distribution{Distributions.ArrayLikeVariate{1}, S} where S<:Distributions.ValueSupport instead.
| Row | Metric | pk2cmp | pk1cmp |
|---|---|---|---|
| | String | Any | Any |
| 1 | Successful | true | true |
| 2 | Estimation Time | 15.586 | 4.952 |
| 3 | Subjects | 120 | 120 |
| 4 | Fixed Parameters | 1 | 1 |
| 5 | Optimized Parameters | 10 | 6 |
| 6 | Conc Active Observations | 1320 | 1320 |
| 7 | Conc Missing Observations | 0 | 0 |
| 8 | Total Active Observations | 1320 | 1320 |
| 9 | Total Missing Observations | 0 | 0 |
| 10 | Likelihood Approximation | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} | Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} |
| 11 | LogLikelihood (LL) | 1830.63 | 1231.19 |
| 12 | -2LL | -3661.26 | -2462.38 |
| 13 | AIC | -3641.26 | -2450.38 |
| 14 | BIC | -3589.41 | -2419.26 |
| 15 | (η-shrinkage) η₁ | 0.037 | 0.016 |
| 16 | (η-shrinkage) η₂ | 0.047 | 0.04 |
| 17 | (η-shrinkage) η₃ | 0.516 | 0.733 |
| 18 | (ϵ-shrinkage) Conc | 0.185 | 0.105 |
| 19 | (η-shrinkage) η₄ | 0.287 | missing |
| 20 | (η-shrinkage) η₅ | 0.154 | missing |
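The AIC and BIC rows follow from the log-likelihood, the number of optimized parameters k, and the number of active observations n through the usual definitions AIC = −2LL + 2k and BIC = −2LL + k × log(n). The plain-Julia sketch below reproduces the table entries from those inputs. The helper names are hypothetical, and the use of the observation count for n is inferred from the numbers in the table rather than taken from the Pumas documentation.

```julia
# Hypothetical helpers implementing the textbook definitions
aic_from_ll(ll, k) = -2 * ll + 2 * k
bic_from_ll(ll, k, n) = -2 * ll + k * log(n)

n_obs = 1320  # total active observations, from the metrics table

# Log-likelihood and number of optimized parameters, from the metrics table
aic_1cmp = aic_from_ll(1231.19, 6)
bic_1cmp = bic_from_ll(1231.19, 6, n_obs)
aic_2cmp = aic_from_ll(1830.63, 10)
bic_2cmp = bic_from_ll(1830.63, 10, n_obs)

# Lower values are better for both criteria; a negative difference favors
# the two-compartment model
delta_aic = aic_2cmp - aic_1cmp
delta_bic = bic_2cmp - bic_1cmp
```

Both differences are large and negative, so the penalty for the four additional parameters is small compared with the gain in likelihood.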
We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect the diagnostics of our model fit.
res_inspect_1cmp = inspect(pkfit_1cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
res_inspect_2cmp = inspect(pkfit_2cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
gof_1cmp = goodness_of_fit(
    res_inspect_1cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)
gof_2cmp = goodness_of_fit(
    res_inspect_2cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)
These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.
We can look at a selected sample of individual plots.
fig_subject_fits = subject_fits(
    res_inspect_2cmp;
    separate = true,
    paginate = true,
    figure = (; fontsize = 18),
    axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (μg/mL)"),
)
fig_subject_fits[1]
There are a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we are checking the distribution of the empirical Bayes estimates.
empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)
empirical_bayes_vs_covariates(
    res_inspect_2cmp;
    categorical = [:Dose],
    figure = (; size = (600, 800)),
)
Clearly, our guess at tvka seems off-target. Let’s try and estimate tvka instead of fixing it to 2:
pkfit_2cmp_unfix_ka = fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 -3.200734e+02 1.272671e+03 * time: 1.9788742065429688e-5
1 -8.682982e+02 1.000199e+03 * time: 1.3752689361572266
2 -1.381870e+03 5.008081e+02 * time: 4.710907936096191
3 -1.551053e+03 6.833490e+02 * time: 5.021253824234009
4 -1.680887e+03 1.834586e+02 * time: 5.356519937515259
5 -1.726118e+03 8.870274e+01 * time: 5.787344932556152
6 -1.761023e+03 1.162036e+02 * time: 6.062230825424194
7 -1.786619e+03 1.114552e+02 * time: 6.374704837799072
8 -1.863556e+03 9.914305e+01 * time: 6.759783983230591
9 -1.882942e+03 5.342676e+01 * time: 7.058395862579346
10 -1.888020e+03 2.010181e+01 * time: 7.38804292678833
11 -1.889832e+03 1.867263e+01 * time: 7.716746807098389
12 -1.891649e+03 1.668512e+01 * time: 8.052470922470093
13 -1.892615e+03 1.820701e+01 * time: 8.381013870239258
14 -1.893453e+03 1.745195e+01 * time: 8.677191972732544
15 -1.894760e+03 1.850174e+01 * time: 8.977573871612549
16 -1.895647e+03 1.773939e+01 * time: 9.269444942474365
17 -1.896597e+03 1.143462e+01 * time: 9.563838005065918
18 -1.897114e+03 9.720097e+00 * time: 9.861094951629639
19 -1.897373e+03 6.054321e+00 * time: 10.17992877960205
20 -1.897498e+03 3.985954e+00 * time: 10.474714994430542
21 -1.897571e+03 4.262464e+00 * time: 10.7740159034729
22 -1.897633e+03 4.010234e+00 * time: 11.080947875976562
23 -1.897714e+03 4.805375e+00 * time: 11.390622854232788
24 -1.897802e+03 3.508706e+00 * time: 11.70629596710205
25 -1.897865e+03 3.691475e+00 * time: 12.017661809921265
26 -1.897900e+03 2.982721e+00 * time: 12.312055826187134
27 -1.897928e+03 2.563790e+00 * time: 12.598448991775513
28 -1.897968e+03 3.261485e+00 * time: 12.880934000015259
29 -1.898013e+03 3.064689e+00 * time: 13.135291814804077
30 -1.898040e+03 1.636525e+00 * time: 13.423425912857056
31 -1.898051e+03 1.439997e+00 * time: 13.707693815231323
32 -1.898057e+03 1.436504e+00 * time: 13.983926773071289
33 -1.898069e+03 1.881528e+00 * time: 14.264985799789429
34 -1.898095e+03 3.253164e+00 * time: 14.549277782440186
35 -1.898142e+03 4.257941e+00 * time: 14.840016841888428
36 -1.898199e+03 3.685241e+00 * time: 15.14122200012207
37 -1.898245e+03 2.567364e+00 * time: 15.444238901138306
38 -1.898246e+03 2.561569e+00 * time: 15.864982843399048
39 -1.898251e+03 2.530909e+00 * time: 16.235780954360962
40 -1.898298e+03 2.673535e+00 * time: 16.549152851104736
41 -1.898300e+03 2.796030e+00 * time: 16.922921895980835
42 -1.898337e+03 3.655488e+00 * time: 17.33476686477661
43 -1.898342e+03 3.774385e+00 * time: 17.780325889587402
44 -1.898433e+03 4.521858e+00 * time: 18.187306880950928
45 -1.898463e+03 3.637306e+00 * time: 18.498716831207275
46 -1.898477e+03 2.417136e+00 * time: 18.797863960266113
47 -1.898479e+03 1.837133e+00 * time: 19.075655937194824
48 -1.898479e+03 5.285171e-01 * time: 19.407160997390747
49 -1.898479e+03 4.637580e-01 * time: 19.767156839370728
50 -1.898480e+03 1.403921e+00 * time: 20.051596879959106
51 -1.898480e+03 3.206388e+00 * time: 20.34808588027954
52 -1.898480e+03 8.490526e-03 * time: 20.668336868286133
53 -1.898480e+03 9.592087e-03 * time: 20.91326594352722
54 -1.898480e+03 1.163416e-02 * time: 21.163734912872314
55 -1.898480e+03 8.048338e-03 * time: 21.412750005722046
56 -1.898480e+03 6.842725e-03 * time: 21.68645477294922
57 -1.898480e+03 1.556896e-02 * time: 21.947558879852295
58 -1.898480e+03 1.556896e-02 * time: 22.23923087120056
59 -1.898480e+03 2.222981e-02 * time: 22.490635871887207
60 -1.898480e+03 2.226260e-02 * time: 22.76900887489319
61 -1.898480e+03 2.226073e-02 * time: 23.05666995048523
62 -1.898480e+03 2.225956e-02 * time: 23.381205797195435
63 -1.898480e+03 2.225954e-02 * time: 23.72714877128601
64 -1.898480e+03 2.225953e-02 * time: 24.103964805603027
65 -1.898480e+03 2.225952e-02 * time: 24.448258876800537
66 -1.898480e+03 2.225952e-02 * time: 24.808435916900635
67 -1.898480e+03 2.225952e-02 * time: 25.172363996505737
68 -1.898480e+03 2.225952e-02 * time: 25.531762838363647
69 -1.898480e+03 2.225952e-02 * time: 25.87802791595459
70 -1.898480e+03 2.225952e-02 * time: 26.248551845550537
71 -1.898480e+03 2.225951e-02 * time: 26.588859796524048
72 -1.898480e+03 2.225951e-02 * time: 26.944074869155884
73 -1.898480e+03 2.225951e-02 * time: 27.30164384841919
74 -1.898480e+03 2.225951e-02 * time: 27.66710591316223
75 -1.898480e+03 2.225951e-02 * time: 28.069926977157593
76 -1.898480e+03 2.225951e-02 * time: 28.442469835281372
77 -1.898480e+03 2.225951e-02 * time: 28.8099308013916
78 -1.898480e+03 2.225951e-02 * time: 29.153974771499634
79 -1.898480e+03 2.225951e-02 * time: 29.498966932296753
80 -1.898480e+03 2.225951e-02 * time: 29.87049889564514
81 -1.898480e+03 2.225951e-02 * time: 30.233165979385376
FittedPumasModel
Dynamical system type: Closed form
Number of subjects: 120
Observation records: Active Missing
Conc: 1320 0
Total: 1320 0
Number of parameters: Constant Optimized
0 11
Likelihood approximation: FOCE
Likelihood optimizer: BFGS
Termination Reason: NoObjectiveChange
Log-likelihood value: 1898.4797
-----------------
Estimate
-----------------
tvcl 2.6191
tvv 11.378
tvvp 8.4529
tvq 1.3164
tvka 4.8925
Ω₁,₁ 0.13243
Ω₂,₂ 0.05967
Ω₃,₃ 0.41581
Ω₄,₄ 0.080678
Ω₅,₅ 0.24996
σ_p 0.049098
-----------------
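The log-likelihood and parameter count reported in this summary are enough to compute an information criterion by hand, which is useful when comparing candidate models. The following is a minimal sketch in plain Julia that uses only the two numbers printed above; the variable names are illustrative and not part of Pumas. Pumas also exposes `aic` and `bic` for fitted models, which should be preferred in a real workflow.

```julia
# Values copied from the fit summary above.
loglik = 1898.4797   # reported log-likelihood value
k = 11               # number of optimized parameters

# Akaike information criterion: AIC = -2 * loglik + 2 * k.
# Lower values indicate a better trade-off between fit and complexity.
aic_value = -2 * loglik + 2 * k

println("AIC ≈ ", round(aic_value; digits = 2))
```

The absolute value carries no meaning on its own; only differences between models fitted to the same data are interpretable.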
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)
| Row | parameter | pkfit_2cmp | pkfit_2cmp_unfix_ka |
|---|---|---|---|
| | String | Float64? | Float64? |
| 1 | tvcl | 2.81378 | 2.61912 |
| 2 | tvv | 11.0046 | 11.3783 |
| 3 | tvvp | 5.53998 | 8.45295 |
| 4 | tvq | 1.51591 | 1.31636 |
| 5 | tvka | 2.0 | 4.89252 |
| 6 | Ω₁,₁ | 0.102669 | 0.132433 |
| 7 | Ω₂,₂ | 0.0607756 | 0.05967 |
| 8 | Ω₃,₃ | 1.20116 | 0.415807 |
| 9 | Ω₄,₄ | 0.423494 | 0.0806779 |
| 10 | Ω₅,₅ | 0.244731 | 0.249961 |
| 11 | σ_p | 0.0484049 | 0.0490976 |
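To see which parameters moved the most once tvka was estimated, it can help to express the differences in relative terms. The sketch below is plain Julia using a subset of the values copied from the table above; `pct_change` and `t_half_abs` are illustrative names, not Pumas functions. The absorption half-life calculation assumes tvka is a first-order rate constant in 1/hr, consistent with time being recorded in hours.

```julia
# Estimates copied from the comparison table above:
# (parameter, fixed-ka fit, estimated-ka fit)
estimates = [
    ("tvcl", 2.81378, 2.61912),
    ("tvvp", 5.53998, 8.45295),
    ("tvka", 2.0, 4.89252),
    ("Ω₃,₃", 1.20116, 0.415807),
    ("Ω₄,₄", 0.423494, 0.0806779),
]

# Percent change relative to the fixed-ka fit.
pct_change(old, new) = 100 * (new - old) / old

for (name, fixed, unfixed) in estimates
    println(rpad(name, 6), round(pct_change(fixed, unfixed); digits = 1), " %")
end

# Absorption half-life implied by the estimated rate constant
# (assumption: tvka has units of 1/hr).
t_half_abs = log(2) / 4.89252
println("absorption half-life ≈ ", round(60 * t_half_abs; digits = 1), " min")
```

The largest shifts are in tvka itself and in the variance terms Ω₃,₃ and Ω₄,₄, which is consistent with the fixed value of 2 having been absorbed into the random effects.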
Let’s re-evaluate the goodness-of-fit and η distribution plots.
There is not much change in the general goodness-of-fit plots:
res_inspect_2cmp_unfix_ka = inspect(pkfit_2cmp_unfix_ka)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection
Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(
res_inspect_2cmp_unfix_ka;
figure = (; fontsize = 12),
legend = (; position = :bottom),
)
However, there is a large improvement in the ηka (η₃) distribution, which is now centered around zero:
empirical_bayes_vs_covariates(
res_inspect_2cmp_unfix_ka;
categorical = [:Dose],
ebes = [:η₃],
figure = (; size = (600, 800)),
)
Finally, let’s look at some individual plots for the same subjects as earlier:
fig_subject_fits2 = subject_fits(
res_inspect_2cmp_unfix_ka;
separate = true,
paginate = true,
facet = (; linkyaxes = false),
figure = (; fontsize = 18),
axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
)
fig_subject_fits2[6]
The fits for some of the randomly sampled individuals still do not look good, so we use a visual predictive check to evaluate how well the model performs overall before deciding how to proceed.
3.4 Visual Predictive Checks (VPC)
We can now perform a VPC to check the model. The default plots provide an 80% prediction interval (the 10th and 90th percentiles, shown together with the median) and a 95% simulated confidence interval (shaded area) around each of the quantiles.
pk_vpc = vpc(pkfit_2cmp_unfix_ka, 200; observations = [:Conc], stratify_by = [:Dose])
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 200
Subjects in data: 120
Stratification variable(s): [:Dose]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
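The settings in this summary (200 simulated populations, quantiles 0.1, 0.5 and 0.9, confidence level 0.95) can be illustrated with a small conceptual sketch. It is plain Julia on synthetic log-normal data at a single time point and is not how `vpc` is implemented internally, which also has to handle the time axis and stratification. All names and the data-generating distribution are hypothetical.

```julia
using Random
using Statistics

Random.seed!(123)

# Hypothetical stand-in data: 200 simulated replicates of 120 subjects'
# concentrations at one time point, drawn from a log-normal distribution.
n_rep, n_subj = 200, 120
sims = [exp.(0.5 .* randn(n_subj)) for _ in 1:n_rep]

probs = [0.1, 0.5, 0.9]   # VPC lines; 0.1 to 0.9 spans the 80% prediction interval
level = 0.95              # confidence level for the shaded bands

# Quantiles within each replicate, stacked into an n_rep × 3 matrix.
q = reduce(vcat, [quantile(s, probs)' for s in sims])

# Across replicates, take the central 95% range of each simulated quantile.
α = (1 - level) / 2
bands = [quantile(q[:, j], [α, 1 - α]) for j in eachindex(probs)]

for (p, b) in zip(probs, bands)
    println("quantile ", p, ": [", round(b[1]; digits = 3), ", ", round(b[2]; digits = 3), "]")
end
```

In the actual plot, the observed quantiles are overlaid on these bands; a model is considered adequate when the observed lines fall mostly inside the corresponding shaded regions.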
vpc_plot(
pk_2cmp,
pk_vpc;
rows = 1,
columns = 3,
figure = (; size = (1400, 1000), fontsize = 22),
axis = (;
xlabel = "Time (hr)",
ylabel = "Observed/Predicted\n CTMx Concentration (ng/mL)",
),
facet = (; combinelabels = true),
)
The visual predictive check suggests that the model captures the data well across all dose levels.
4 Additional Help
If you have questions regarding this tutorial, please post them on our discourse site.