A Comprehensive Introduction to Pumas

Authors

Vijay Ivaturi

Jose Storopoli

This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead to provide a narrative along the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.

As part of this workflow, you will be introduced to various aspects such as:

  1. Data wrangling in Julia
  2. Exploratory analysis in Julia
  3. Continuous data non-linear mixed effects modeling in Pumas
  4. Model comparison routines, post-processing, validation etc.

1 The Study and Design

CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5 mg, 20 mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0, 0.5, 1, 1.5, 2, 2.5, 3-8 hours.

Pain scores (0=no pain, 1=mild, 2=moderate, 3=severe) were obtained at the time points when plasma concentrations were collected. A pain score of 2 or more is considered no pain relief.

Subjects can request remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status are available for each subject.

The pharmacokinetic dataset can be accessed using PharmaDatasets.jl.

2 Setup

2.1 Load libraries

These libraries provide the workhorse functionality in the Pumas ecosystem:

using Pumas
using PumasUtilities
using NCA
using NCAUtilities

In addition, the libraries below are good add-ons that provide ancillary functionality:

using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets

2.2 Data Wrangling

We start by reading in the dataset and making some quick summaries.

Tip

If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!

pkpain_df = dataset("pk_painrelief")
first(pkpain_df, 5)
5×7 DataFrame
 Row │ Subject  Time     Conc     PainRelief  PainScore  RemedStatus  Dose
     │ Int64    Float64  Float64  Int64       Int64      Int64        String7
─────┼───────────────────────────────────────────────────────────────────────
   1 │       1      0.0  0.0               0          3            1  20 mg
   2 │       1      0.5  1.15578           1          1            0  20 mg
   3 │       1      1.0  1.37211           1          0            0  20 mg
   4 │       1      1.5  1.30058           1          0            0  20 mg
   5 │       1      2.0  1.19195           1          1            0  20 mg

Let’s filter out the placebo data as we don’t need that for the PK analysis.

pkpain_noplb_df = @rsubset pkpain_df :Dose != "Placebo";
first(pkpain_noplb_df, 5)
5×7 DataFrame
 Row │ Subject  Time     Conc     PainRelief  PainScore  RemedStatus  Dose
     │ Int64    Float64  Float64  Int64       Int64      Int64        String7
─────┼───────────────────────────────────────────────────────────────────────
   1 │       1      0.0  0.0               0          3            1  20 mg
   2 │       1      0.5  1.15578           1          1            0  20 mg
   3 │       1      1.0  1.37211           1          0            0  20 mg
   4 │       1      1.5  1.30058           1          0            0  20 mg
   5 │       1      2.0  1.19195           1          1            0  20 mg

3 Analysis

3.1 Non-compartmental analysis

Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route column and an :amt column that specifies the dose. So, let’s add that in:

@rtransform! pkpain_noplb_df begin
    :route = "ev"
    :Dose = parse(Int, chop(:Dose; tail = 3))
end

We also need to create an :amt column:

@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missing
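
As a quick sanity check, the string parsing and the conditional dose assignment used above can be tried on plain Julia values. This is only a sketch: the vectors dose_labels and times are made-up stand-ins for the :Dose and :Time columns, not rows taken from the dataset.

```julia
# Hypothetical stand-ins for the :Dose and :Time columns (not real data rows).
dose_labels = ["5 mg", "20 mg", "80 mg"]
times = [0.0, 0.5, 1.0]

# chop(...; tail = 3) drops the trailing " mg"; parse turns the rest into an Int.
doses = [parse(Int, chop(label; tail = 3)) for label in dose_labels]

# The dose amount is recorded only on the dosing row (time 0); other rows are missing.
amts = [t == 0 ? 20 : missing for t in times]
```

Because the amount is missing everywhere except at time 0, each subject ends up with exactly one dosing record.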

Now, we map the data variables to the read_nca function that prepares the data for NCA analysis.

pkpain_nca = read_nca(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = :Conc,
    group = [:Dose],
    route = :route,
)
NCAPopulation (120 subjects):
  Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
  Number of missing observations: 0
  Number of blq observations: 0

Now that we have mapped the data in, let’s visualize the concentration vs time plots for a few individuals. When paginate is set to true, a vector of plots is returned, and below we display the first element with 9 individuals.

f = observations_vs_time(
    pkpain_nca;
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMNopain Concentration (mg/L)"),
)
f[1]

Figure: Observations versus Time. An observations versus time profile for all subjects.

Alternatively, you can view the summary curves by dose group, as passed to the group argument in read_nca:

summary_observations_vs_time(
    pkpain_nca,
    figure = (; fontsize = 22, size = (800, 1000)),
    color = "black",
    linewidth = 3,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMNopain Concentration (mg/L)"),
)

Figure: Summary Observations versus Time. An observations versus time profile for all subjects in a summarized manner.

A full NCA report is now obtained for completeness using the run_nca function, but later we will only extract a couple of key metrics of interest.

pk_nca = run_nca(pkpain_nca; sigdigits = 3)

We can look at the NCA fits for some subjects. Here f is a vector of figures. We’ll showcase the first image by indexing f:

f = subject_fits(
    pk_nca,
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMNopain Concentration (mg/L)"),

    # Legend options
    legend = (; position = :bottom),
)
f[1]

Figure: Subject Fits. Trend plot with observations for all individual subjects over time.

As CTMNopain’s effect may be mainly related to maximum concentration (cmax) or area under the curve (auc), we present some summary statistics using the summarize function from NCA.

strata = [:Dose]
1-element Vector{Symbol}:
 :Dose
params = [:cmax, :aucinf_obs]
2-element Vector{Symbol}:
 :cmax
 :aucinf_obs
output = summarize(pk_nca; stratify_by = strata, parameters = params)
6×10 DataFrame
 Row │ Dose   parameters  numsamples  minimum  maximum  mean      std        geomean   geostd   geomeanCV
     │ Int64  String      Int64       Float64  Float64  Float64   Float64    Float64   Float64  Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
   1 │     5  cmax                40     0.19    0.539  0.356075  0.0884129  0.345104   1.2932    26.1425
   2 │     5  aucinf_obs          40    0.914      3.4    1.5979   0.490197   1.53373  1.32974    29.0868
   3 │    20  cmax                40    0.933      2.7    1.4737   0.361871   1.43408   1.2633    23.6954
   4 │    20  aucinf_obs          40     2.77     14.1     6.377    2.22239   6.02031  1.41363    35.6797
   5 │    80  cmax                40      3.3     8.47     5.787    1.31957   5.64164  1.25757    23.2228
   6 │    80  aucinf_obs          40     13.7     49.1      29.5    8.68984   28.2954  1.34152    30.0258

The statistics printed above are the default, but you can pass in your own statistics using the stats = [] argument to the summarize function.
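
To make the geometric columns in the table above more concrete, here is a minimal sketch of how such statistics are conventionally defined, written in plain Julia with the Statistics standard library. The cmax vector is hypothetical, and the formulas are the textbook definitions, which we assume (rather than know) match what summarize computes internally. The last line checks that the conventional relationship between geostd and geomeanCV is consistent with the first row of the table.

```julia
using Statistics

# Hypothetical Cmax values (mg/L); illustrative only, not taken from the dataset.
cmax = [0.30, 0.42, 0.35, 0.28, 0.47]

logc = log.(cmax)
geomean = exp(mean(logc))                      # geometric mean
geostd = exp(std(logc))                        # geometric standard deviation
geomeanCV = 100 * sqrt(exp(std(logc)^2) - 1)   # geometric CV in percent

# Consistency check against row 1 of the table: geostd = 1.2932 vs geomeanCV = 26.1425.
cv_from_table_geostd = 100 * sqrt(exp(log(1.2932)^2) - 1)
```

A function with this shape (a vector in, a number out) is the kind of statistic one would expect to pass through the stats argument, although the exact calling convention should be confirmed in the NCA documentation.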

We can look at a few parameter distribution plots.

parameters_vs_group(
    pk_nca,
    parameter = :cmax,
    axis = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (mg/L)"),
    figure = (; fontsize = 18),
)

Figure: Cmax for each Dose Group. A violin plot for the Cmax distribution for each dose group.

The dose-normalized PK parameters cmax and aucinf were essentially dose proportional across the 5 mg, 20 mg and 80 mg doses. You can perform a simple regression (here, a power model) to check the impact of dose on cmax:

dp = NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9)
Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
   Estimate  low CI 90%  high CI 90%
────────────────────────────────────
β   1.00775     0.97571       1.0398
────────────────────────────────────
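
The power model above is a straight line on the log-log scale, so a slope β close to 1 indicates dose proportionality. As a rough, hedged illustration of that idea, the sketch below regresses the log of the geometric mean cmax per dose group (copied from the summary table) on the log dose using ordinary least squares. This is a group-level approximation in plain Julia, not the subject-level fit that NCA.DoseLinearityPowerModel performs, so the two slopes need not agree exactly.

```julia
using LinearAlgebra

# Geometric mean cmax per dose group, copied from the summary table above.
dose = [5.0, 20.0, 80.0]
cmax_geomean = [0.345104, 1.43408, 5.64164]

# Ordinary least squares for log(cmax) = log(α) + β * log(dose).
X = [ones(length(dose)) log.(dose)]
logα, β = X \ log.(cmax_geomean)
```

The slope from this simplified fit falls inside the 90% confidence interval reported above, which supports the same conclusion of approximate dose proportionality.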

Here’s a visualization for the dose linearity using a power model for cmax:

power_model(dp; legend = (; position = :bottom))

Figure: Dose Linearity Plot. A dose linearity power model plot for Cmax.

We can also visualize dose proportionality results with respect to a specific endpoint in an NCA report, for example cmax and aucinf_obs:

dose_vs_dose_normalized(pk_nca, :cmax)

Figure: Dose Proportionality Plot. A dose proportionality plot for Cmax.

dose_vs_dose_normalized(pk_nca, :aucinf_obs)

Figure: Dose Proportionality Plot. A dose proportionality plot for AUC.

Based on visual inspection of the concentration time profiles as seen earlier, CTMNopain exhibited monophasic decline, so a one-compartment model may best fit the PK data.

3.2 Pharmacokinetic modeling

As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one- and two-compartment structural models to assess best fit. Further, different residual error models will also be tested.

We will use the results from NCA to provide us with good initial estimates.
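
One plausible way to turn NCA output into a starting value, shown here only as a sketch, is the standard relationship CL/F = Dose / AUCinf. The values below are the geometric mean aucinf_obs per dose group from the summary table; whether the initial estimates in this tutorial were derived exactly this way is an assumption on our part.

```julia
# Dose (mg) and geometric mean aucinf_obs (mg·hr/L) per dose group,
# copied from the NCA summary table.
dose = [5.0, 20.0, 80.0]
aucinf_geomean = [1.53373, 6.02031, 28.2954]

# Apparent clearance (L/hr) per dose group.
cl_f = dose ./ aucinf_geomean
cl_f_mean = sum(cl_f) / length(cl_f)
```

All three dose groups give an apparent clearance of roughly 3 L/hr, which is in the same range as the initial value used for tvcl in the models below.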

3.2.1 Data preparation for modeling

PumasNDF requires the presence of :evid and :cmt columns in the dataset.

@rtransform! pkpain_noplb_df begin
    :evid = :Time == 0 ? 1 : 0
    :cmt = :Time == 0 ? 1 : 2
    :cmt2 = 1 # for zero order absorption
end

Further, observations at the time of dosing, i.e., when evid = 1, have to be missing:

@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc

The dataframe will now be converted to a Population using read_pumas. Note that both observations and covariates must be arrays, even if they contain only one element.

pkpain_noplb = read_pumas(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = [:Conc],
    covariates = [:Dose],
    evid = :evid,
    cmt = :cmt,
)
Population
  Subjects: 120
  Covariates: Dose
  Observations: Conc

Now that the data is transformed to a Population of subjects, we can explore different models.

3.2.2 One-compartment model

Note

If you are not familiar yet with the @model blocks and syntax, please check our documentation.

pk_1cmp = @model begin

    @metadata begin
        desc = "One Compartment Model"
        timeu = u"hr"
    end

    @param begin
        """
        Clearance (L/hr)
        """
        tvcl ∈ RealDomain(; lower = 0, init = 3.2)
        """
        Volume (L)
        """
        tvv ∈ RealDomain(; lower = 0, init = 16.4)
        """
        Absorption rate constant (h-1)
        """
        tvka ∈ RealDomain(; lower = 0, init = 3.8)
        """
          - ΩCL
          - ΩVc
          - ΩKa
        """
        Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
    end

    @random begin
        η ~ MvNormal(Ω)
    end

    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end

    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        """
        CTMNopain Concentration (mg/L)
        """
        Conc ~ @. Normal(cp, abs(cp) * σ_p)
    end

end
Warning: Covariate Dose is not used in the model.
@ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/6G31F/src/dsl/model_macro.jl:3167
PumasModel
  Parameters: tvcl, tvv, tvka, Ω, σ_p
  Random effects: η
  Covariates: Dose
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: Conc
  Observed: Conc
Tip

Note that the local assignment := can be used to define intermediate statements that will not be carried outside of the block. This means that none of the data workflows resulting from this model will contain the intermediate variables defined with :=. We use this when we want to suppress a variable from any further output.

The motivation for := is performance. If you are not carrying the variable defined with := outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.

Before fitting the model, let’s run a few simulation-based checks to assess the appropriateness of the data and the model.

# zero out the random effects
etas = zero_randeffs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp))

Above, we are generating a vector of η’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs and init_randeffs. Please refer to the documentation.
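
With all random effects set to zero, every subject in a dose group follows the same population-level curve. For intuition, that curve can be written down by hand for a one-compartment model with first-order absorption. The sketch below is plain Julia and independent of Pumas: the function conc_1cmp and the half-hourly time grid are our own illustrative choices, bioavailability is assumed to be 1, and the parameter values are the initial estimates from the model definition.

```julia
# Closed-form concentration after a single oral dose in a one-compartment model
# with first-order absorption (assumes Ka != CL / Vc and bioavailability of 1).
function conc_1cmp(t, dose; CL, Vc, Ka)
    k = CL / Vc  # elimination rate constant (1/hr)
    return dose * Ka / (Vc * (Ka - k)) * (exp(-k * t) - exp(-Ka * t))
end

# Illustrative time grid (hr) and the 20 mg dose at the initial parameter values.
times = 0.0:0.5:8.0
pred = conc_1cmp.(times, 20.0; CL = 3.2, Vc = 16.4, Ka = 3.8)
```

This is the same kind of typical-value profile that simobs produces when it is given zeroed random effects, which is why it is a useful check of the initial estimates before any fitting.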

simpk_iparams = simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas)
Simulated population (Vector{<:Subject})
  Simulated subjects: 120
  Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_iparams;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (;
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMNopain Concentration (mg/L)",
    ),
)

Figure: Simulated Observations Plot. A simulated observations versus time plot overlaid with the scatter plot of the observed observations.

Our NCA-based initial guesses for the parameters seem to work well.

Let’s change the initial estimates of a couple of the parameters to evaluate the sensitivity.

pkparam = (; init_params(pk_1cmp)..., tvka = 2, tvv = 10)
(tvcl = 3.2,
 tvv = 10,
 tvka = 2,
 Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04],
 σ_p = 0.2,)
simpk_changedpars = simobs(pk_1cmp, pkpain_noplb, pkparam, etas)
Simulated population (Vector{<:Subject})
  Simulated subjects: 120
  Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_changedpars;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMNopain Concentration (mg/L)",
    ),
)

Figure: Simulated Observations Plot. A simulated observations versus time plot overlaid with the scatter plot of the observed observations.

Changing the tvka and decreasing the tvv seemed to make an impact and observations go through the simulated lines.
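
The direction of this change can be anticipated from the closed-form peak of a one-compartment model with first-order absorption: the peak time is log(Ka / k) / (Ka - k) with k = CL / Vc, and the peak concentration is (dose / Vc) * exp(-k * tmax). The sketch below, in plain Julia with a helper peak_1cmp that we introduce purely for illustration, compares the initial and the changed typical values for a 20 mg dose, assuming a bioavailability of 1.

```julia
# Peak time (hr) and peak concentration (mg/L) for a one-compartment model with
# first-order absorption (assumes Ka != CL / Vc and bioavailability of 1).
function peak_1cmp(dose; CL, Vc, Ka)
    k = CL / Vc
    tmax = log(Ka / k) / (Ka - k)
    cmax = dose / Vc * exp(-k * tmax)
    return (; tmax, cmax)
end

initial = peak_1cmp(20.0; CL = 3.2, Vc = 16.4, Ka = 3.8)
changed = peak_1cmp(20.0; CL = 3.2, Vc = 10.0, Ka = 2.0)
```

Lowering tvka delays the peak and lowering tvv raises it, which moves the typical 20 mg peak closer to the geometric mean cmax of about 1.43 mg/L seen in the NCA summary.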

To get a quick ballpark estimate of the PK parameters, we can do a NaivePooled analysis.

3.2.2.1 NaivePooled

pkfit_np = fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     7.744356e+02     3.715711e+03
 * time: 0.042330026626586914
     1     2.343899e+02     1.747348e+03
 * time: 1.63771390914917
     2     9.696232e+01     1.198088e+03
 * time: 1.6413180828094482
     3    -7.818699e+01     5.538151e+02
 * time: 1.6439359188079834
     4    -1.234803e+02     2.462514e+02
 * time: 1.6465020179748535
     5    -1.372888e+02     2.067458e+02
 * time: 1.649204969406128
     6    -1.410579e+02     1.162950e+02
 * time: 1.651879072189331
     7    -1.434754e+02     5.632816e+01
 * time: 1.6545019149780273
     8    -1.453401e+02     7.859270e+01
 * time: 1.6570310592651367
     9    -1.498185e+02     1.455606e+02
 * time: 1.6597371101379395
    10    -1.534371e+02     1.303682e+02
 * time: 1.6623129844665527
    11    -1.563557e+02     5.975474e+01
 * time: 1.664820909500122
    12    -1.575052e+02     9.308611e+00
 * time: 1.667341947555542
    13    -1.579357e+02     1.234484e+01
 * time: 1.6700220108032227
    14    -1.581874e+02     7.478196e+00
 * time: 1.672605037689209
    15    -1.582981e+02     2.027162e+00
 * time: 1.6752350330352783
    16    -1.583375e+02     5.578262e+00
 * time: 1.6777660846710205
    17    -1.583556e+02     4.727050e+00
 * time: 1.6804981231689453
    18    -1.583644e+02     2.340173e+00
 * time: 1.6830580234527588
    19    -1.583680e+02     7.738100e-01
 * time: 1.6855559349060059
    20    -1.583696e+02     3.300689e-01
 * time: 1.6880531311035156
    21    -1.583704e+02     3.641985e-01
 * time: 1.6907470226287842
    22    -1.583707e+02     4.365901e-01
 * time: 1.6932449340820312
    23    -1.583709e+02     3.887800e-01
 * time: 1.695725917816162
    24    -1.583710e+02     2.766977e-01
 * time: 1.6981940269470215
    25    -1.583710e+02     1.758029e-01
 * time: 1.7007980346679688
    26    -1.583710e+02     1.133947e-01
 * time: 1.7033939361572266
    27    -1.583710e+02     7.922544e-02
 * time: 1.7059500217437744
    28    -1.583710e+02     5.954998e-02
 * time: 1.7084639072418213
    29    -1.583710e+02     4.157080e-02
 * time: 1.7111890316009521
    30    -1.583710e+02     4.295446e-02
 * time: 1.9475860595703125
    31    -1.583710e+02     5.170752e-02
 * time: 1.9513359069824219
    32    -1.583710e+02     2.644382e-02
 * time: 1.9541411399841309
    33    -1.583710e+02     4.548987e-03
 * time: 1.9569151401519775
    34    -1.583710e+02     2.501805e-02
 * time: 1.9597609043121338
    35    -1.583710e+02     3.763439e-02
 * time: 1.9618630409240723
    36    -1.583710e+02     3.206027e-02
 * time: 1.9640071392059326
    37    -1.583710e+02     1.003700e-02
 * time: 1.9661810398101807
    38    -1.583710e+02     2.209084e-02
 * time: 1.9683051109313965
    39    -1.583710e+02     4.954136e-03
 * time: 1.970552921295166
    40    -1.583710e+02     1.609366e-02
 * time: 1.9735710620880127
    41    -1.583710e+02     1.579810e-02
 * time: 1.975909948348999
    42    -1.583710e+02     1.014156e-03
 * time: 1.9782729148864746
    43    -1.583710e+02     6.050792e-03
 * time: 1.9814059734344482
    44    -1.583710e+02     1.354381e-02
 * time: 1.9838061332702637
    45    -1.583710e+02     4.473216e-03
 * time: 1.9860501289367676
    46    -1.583710e+02     4.645458e-03
 * time: 1.9882800579071045
    47    -1.583710e+02     9.828063e-03
 * time: 1.9904229640960693
    48    -1.583710e+02     1.047215e-03
 * time: 1.9925410747528076
    49    -1.583710e+02     8.374104e-03
 * time: 1.9946210384368896
    50    -1.583710e+02     7.841995e-04
 * time: 1.996819019317627
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                    158.37103

------------------
         Estimate
------------------
  tvcl    3.0054
  tvv    14.089
  tvka   44.227
† Ω₁,₁    0.0
† Ω₂,₂    0.0
† Ω₃,₃    0.0
  σ_p     0.32999
------------------
† indicates constant parameters
coefficients_table(pkfit_np)
7×4 DataFrame
 Row │ Parameter  Description                     Constant  Estimate
     │ String     SubStrin…                       Bool      Float64
─────┼───────────────────────────────────────────────────────────────
   1 │ tvcl       Clearance (L/hr)                   false     3.005
   2 │ tvv        Volume (L)                         false    14.089
   3 │ tvka       Absorption rate constant (h-1)     false    44.227
   4 │ Ω₁,₁       ΩCL                                 true       0.0
   5 │ Ω₂,₂       ΩVc                                 true       0.0
   6 │ Ω₃,₃       ΩKa                                 true       0.0
   7 │ σ_p        Proportional RUV                   false      0.33

The final estimates from the NaivePooled approach seem reasonably close to our initial guess from NCA, except for the tvka parameter. We will stick with our initial guess.

One way to be cautious before going into a complete fitting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject stands out as unreasonable. There are a few ways of doing this:

  • check the loglikelihood subject-wise
  • check if there are any influential subjects

Below, we are basically checking whether the initial estimates for any subject are so far off that we are unable to compute the initial loglikelihood.

lls = [loglikelihood(pk_1cmp, subj, pkparam, FOCE()) for subj in pkpain_noplb]
# the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5))

Figure: Histogram of Loglikelihoods. A histogram of the individual loglikelihoods.

The distribution of the loglikelihoods suggests no extreme outliers.

A more convenient way is to use the findinfluential function that provides a list of k top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.

influential_subjects = findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE())
120-element Vector{@NamedTuple{id::String, nll::Float64}}:
 (id = "148", nll = 16.65965885684477)
 (id = "135", nll = 16.648985190076335)
 (id = "156", nll = 15.959069556607496)
 (id = "159", nll = 15.441218240496484)
 (id = "149", nll = 14.71513464411951)
 (id = "88", nll = 13.09709837464614)
 (id = "16", nll = 12.98228052193144)
 (id = "61", nll = 12.65218290230368)
 (id = "71", nll = 12.500330088085505)
 (id = "59", nll = 12.241510254805235)
 ⋮
 (id = "57", nll = -22.79767423253431)
 (id = "93", nll = -22.836900711478208)
 (id = "12", nll = -23.007742339519247)
 (id = "123", nll = -23.292751843079234)
 (id = "41", nll = -23.425412534960515)
 (id = "99", nll = -23.535214841901112)
 (id = "29", nll = -24.025959868383083)
 (id = "52", nll = -24.164757842493685)
 (id = "24", nll = -25.57209232565845)

3.2.2.2 FOCE

Now that we have a good handle on our data, let’s go ahead and fit a population model with FOCE:

pkfit_1cmp = fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -5.935351e+02     5.597318e+02
 * time: 5.3882598876953125e-5
     1    -7.022088e+02     1.707063e+02
 * time: 0.7287938594818115
     2    -7.314067e+02     2.903269e+02
 * time: 1.1674559116363525
     3    -8.520591e+02     2.285888e+02
 * time: 1.2807550430297852
     4    -1.120191e+03     3.795410e+02
 * time: 1.5146968364715576
     5    -1.178784e+03     2.323978e+02
 * time: 1.641737937927246
     6    -1.218320e+03     9.699907e+01
 * time: 2.2205889225006104
     7    -1.223641e+03     5.862105e+01
 * time: 2.3235130310058594
     8    -1.227620e+03     1.831402e+01
 * time: 2.429313898086548
     9    -1.228381e+03     2.132323e+01
 * time: 4.44721794128418
    10    -1.230098e+03     2.921228e+01
 * time: 4.547890901565552
    11    -1.230854e+03     2.029662e+01
 * time: 4.646931886672974
    12    -1.231116e+03     5.229099e+00
 * time: 4.741170883178711
    13    -1.231179e+03     1.689232e+00
 * time: 4.834592819213867
    14    -1.231187e+03     1.215379e+00
 * time: 4.930938959121704
    15    -1.231188e+03     2.770381e-01
 * time: 5.017364978790283
    16    -1.231188e+03     1.636652e-01
 * time: 5.0932090282440186
    17    -1.231188e+03     2.701149e-01
 * time: 5.160506963729858
    18    -1.231188e+03     3.163341e-01
 * time: 5.22962498664856
    19    -1.231188e+03     1.505153e-01
 * time: 5.386631965637207
    20    -1.231188e+03     2.485002e-02
 * time: 5.452517032623291
    21    -1.231188e+03     8.435209e-04
 * time: 5.512495040893555
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                     1231.188

-------------------
         Estimate
-------------------
  tvcl    3.1642
  tvv    13.288
† tvka    2.0
  Ω₁,₁    0.08494
  Ω₂,₂    0.048568
  Ω₃,₃    5.5811
  σ_p     0.10093
-------------------
† indicates constant parameters
infer(pkfit_1cmp)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                     1231.188

---------------------------------------------------------
         Estimate    SE          95.0% C.I.
---------------------------------------------------------
  tvcl    3.1642     0.08662     [  2.9944  ;  3.334   ]
  tvv    13.288      0.27481     [ 12.749   ; 13.827   ]
† tvka    2.0        NaN         [  NaN     ;  NaN     ]
  Ω₁,₁    0.08494    0.011022    [  0.063338;  0.10654 ]
  Ω₂,₂    0.048568   0.0063502   [  0.036122;  0.061014]
  Ω₃,₃    5.5811     1.2188      [  3.1922  ;  7.97    ]
  σ_p     0.10093    0.0057196   [  0.089718;  0.11214 ]
---------------------------------------------------------
† indicates constant parameters

Notice that tvka is fixed to 2 as we don’t have a lot of information before tmax. From the results above, we see that the parameter precision for this model is reasonable.
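
A common way to quantify "reasonable precision" is the relative standard error, RSE = 100 × SE / estimate. The sketch below computes it in plain Julia from the numbers printed in the infer output above. The field names omega_cl, omega_vc and omega_ka are our own ASCII labels for Ω₁,₁, Ω₂,₂ and Ω₃,₃, and the fixed parameter tvka is left out because it has no standard error.

```julia
# Estimates and standard errors copied from the infer output above.
estimates = (tvcl = 3.1642, tvv = 13.288, omega_cl = 0.08494,
             omega_vc = 0.048568, omega_ka = 5.5811, sigma_p = 0.10093)
ses = (tvcl = 0.08662, tvv = 0.27481, omega_cl = 0.011022,
       omega_vc = 0.0063502, omega_ka = 1.2188, sigma_p = 0.0057196)

# Relative standard error in percent, parameter by parameter.
rse = map((est, se) -> 100 * se / est, estimates, ses)
```

The fixed effects come out with an RSE below 3%, and the least precise parameter is the variance of the random effect on Ka, which is consistent with the limited information before tmax.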

3.2.3 Two-compartment model

Just to be sure, let’s fit a 2-compartment model and evaluate:

pk_2cmp = @model begin

    @param begin
        """
        Clearance (L/hr)
        """
        tvcl ∈ RealDomain(; lower = 0, init = 3.2)
        """
        Central Volume (L)
        """
        tvv ∈ RealDomain(; lower = 0, init = 16.4)
        """
        Peripheral Volume (L)
        """
        tvvp ∈ RealDomain(; lower = 0, init = 10)
        """
        Distributional Clearance (L/hr)
        """
        tvq ∈ RealDomain(; lower = 0, init = 2)
        """
        Absorption rate constant (h-1)
        """
        tvka ∈ RealDomain(; lower = 0, init = 1.3)
        """
          - ΩCL
          - ΩVc
          - ΩKa
          - ΩVp
          - ΩQ
        """
        Ω ∈ PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p ∈ RealDomain(; lower = 0.0001, init = 0.2)
    end

    @random begin
        η ~ MvNormal(Ω)
    end

    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end

    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
        Vp = tvvp * exp(η[4])
        Q = tvq * exp(η[5])
    end

    @dynamics Depots1Central1Periph1

    @derived begin
        cp := @. Central / Vc
        """
        CTMNopain Concentration (mg/L)
        """
        Conc ~ @. Normal(cp, cp * σ_p)
    end
end
Warning: Covariate Dose is not used in the model.
@ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/6G31F/src/dsl/model_macro.jl:3167
PumasModel
  Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
  Random effects: η
  Covariates: Dose
  Dynamical system variables: Depot, Central, Peripheral
  Dynamical system type: Closed form
  Derived: Conc
  Observed: Conc

3.2.3.1 FOCE

pkfit_2cmp =
    fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -6.302369e+02     1.021050e+03
 * time: 3.0994415283203125e-5
     1    -9.197817e+02     9.927951e+02
 * time: 0.8132669925689697
     2    -1.372640e+03     2.054986e+02
 * time: 1.106928825378418
     3    -1.446326e+03     1.543987e+02
 * time: 1.3715379238128662
     4    -1.545570e+03     1.855028e+02
 * time: 1.6197879314422607
     5    -1.581449e+03     1.713157e+02
 * time: 1.982813835144043
     6    -1.639433e+03     1.257382e+02
 * time: 2.215089797973633
     7    -1.695964e+03     7.450539e+01
 * time: 2.4487428665161133
     8    -1.722243e+03     5.961044e+01
 * time: 2.687288999557495
     9    -1.736883e+03     7.320921e+01
 * time: 2.943798780441284
    10    -1.753547e+03     7.501938e+01
 * time: 3.191983938217163
    11    -1.764053e+03     6.185661e+01
 * time: 3.44085693359375
    12    -1.778991e+03     4.831033e+01
 * time: 3.707488775253296
    13    -1.791492e+03     4.943278e+01
 * time: 4.071454763412476
    14    -1.799847e+03     2.871410e+01
 * time: 4.5590739250183105
    15    -1.805374e+03     7.520790e+01
 * time: 4.9137938022613525
    16    -1.816260e+03     2.990621e+01
 * time: 5.247376918792725
    17    -1.818252e+03     2.401915e+01
 * time: 5.523160934448242
    18    -1.822988e+03     2.587225e+01
 * time: 5.807366847991943
    19    -1.824653e+03     1.550517e+01
 * time: 6.077073812484741
    20    -1.826074e+03     1.788927e+01
 * time: 6.342381954193115
    21    -1.826821e+03     1.888389e+01
 * time: 6.588862895965576
    22    -1.827900e+03     1.432840e+01
 * time: 6.864569902420044
    23    -1.828511e+03     9.422040e+00
 * time: 7.11158299446106
    24    -1.828754e+03     5.363445e+00
 * time: 7.379257917404175
    25    -1.828862e+03     4.916168e+00
 * time: 7.619025945663452
    26    -1.829007e+03     4.695750e+00
 * time: 7.859494924545288
    27    -1.829358e+03     1.090244e+01
 * time: 8.151124000549316
    28    -1.829830e+03     1.451320e+01
 * time: 8.609083890914917
    29    -1.830201e+03     1.108695e+01
 * time: 8.970133781433105
    30    -1.830360e+03     2.892317e+00
 * time: 9.262877941131592
    31    -1.830390e+03     1.699265e+00
 * time: 9.535704851150513
    32    -1.830404e+03     1.602222e+00
 * time: 9.821102857589722
    33    -1.830432e+03     2.823676e+00
 * time: 10.092804908752441
    34    -1.830475e+03     4.121601e+00
 * time: 10.34132695198059
    35    -1.830527e+03     5.080494e+00
 * time: 10.594703912734985
    36    -1.830591e+03     2.668323e+00
 * time: 10.861382961273193
    37    -1.830615e+03     3.522601e+00
 * time: 11.105571985244751
    38    -1.830623e+03     2.203940e+00
 * time: 11.349177837371826
    39    -1.830625e+03     1.642394e+00
 * time: 11.583603858947754
    40    -1.830627e+03     9.396311e-01
 * time: 11.805682897567749
    41    -1.830628e+03     8.588414e-01
 * time: 12.054863929748535
    42    -1.830628e+03     3.457037e-01
 * time: 12.273749828338623
    43    -1.830629e+03     4.556038e-01
 * time: 12.493207931518555
    44    -1.830630e+03     6.366787e-01
 * time: 12.717012882232666
    45    -1.830630e+03     4.104090e-01
 * time: 12.93876576423645
    46    -1.830630e+03     7.434196e-02
 * time: 13.171433925628662
    47    -1.830630e+03     7.316846e-02
 * time: 13.434865951538086
    48    -1.830630e+03     7.320992e-02
 * time: 13.706913948059082
    49    -1.830630e+03     7.471716e-02
 * time: 13.953266859054565
    50    -1.830630e+03     7.471716e-02
 * time: 14.229801893234253
    51    -1.830630e+03     7.471716e-02
 * time: 14.36996078491211
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1             10

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                      NoXChange
Log-likelihood value:                    1830.6304

-------------------
         Estimate
-------------------
  tvcl    2.8137
  tvv    11.005
  tvvp    5.5401
  tvq     1.5159
† tvka    2.0
  Ω₁,₁    0.10266
  Ω₂,₂    0.060778
  Ω₃,₃    1.2012
  Ω₄,₄    0.42347
  Ω₅,₅    0.24471
  σ_p     0.048404
-------------------
† indicates constant parameters

3.3 Comparing One- versus Two-compartment models

The 2-compartment model has a much lower objective function value than the 1-compartment model. Let’s compare the estimates from the 2 models using the compare_estimates function.

compare_estimates(; pkfit_1cmp, pkfit_2cmp)
11×3 DataFrame
 Row │ parameter  pkfit_1cmp  pkfit_2cmp
     │ String     Float64?    Float64?
─────┼───────────────────────────────────
   1 │ tvcl           3.1642     2.81373
   2 │ tvv            13.288     11.0047
   3 │ tvka              2.0         2.0
   4 │ Ω₁,₁        0.0849405    0.102656
   5 │ Ω₂,₂        0.0485682    0.060778
   6 │ Ω₃,₃          5.58107     1.20118
   7 │ σ_p          0.100928   0.0484045
   8 │ tvvp          missing     5.54005
   9 │ tvq           missing     1.51591
  10 │ Ω₄,₄          missing    0.423466
  11 │ Ω₅,₅          missing    0.244713

We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.

lrtest(pkfit_1cmp, pkfit_2cmp)
Statistic:          1200.0
Degrees of freedom:      4
P-value:               0.0
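
As a rough check on where these numbers come from: the statistic is twice the difference in log-likelihood, and the degrees of freedom equal the number of additional estimated parameters. The sketch below reproduces this by hand from the log-likelihood values and optimized-parameter counts reported for the two fits in this tutorial. The variable and function names are illustrative, and the closed-form tail probability is valid only for a chi-square distribution with 4 degrees of freedom.

```julia
# Log-likelihoods and optimized-parameter counts as reported for the two fits
ll_1cmp, k_1cmp = 1231.19, 6
ll_2cmp, k_2cmp = 1830.63, 10

# Likelihood ratio statistic: twice the gain in log-likelihood
lrt_stat = 2 * (ll_2cmp - ll_1cmp)
df = k_2cmp - k_1cmp

# Upper tail of a chi-square distribution with 4 degrees of freedom:
# P(X > x) = exp(-x/2) * (1 + x/2). This closed form is specific to df = 4.
chisq4_tail(x) = exp(-x / 2) * (1 + x / 2)
p_value = chisq4_tail(lrt_stat)

println("statistic = ", round(lrt_stat; digits = 2), ", df = ", df)
println("p-value below 1e-100: ", p_value < 1e-100)
```

The hand-computed statistic is close to the value printed by lrtest, which appears to be displayed with fewer significant digits, and the p-value is far below any conventional significance threshold.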

We should also compare other metrics and statistics, such as η-shrinkage, ϵ-shrinkage, AIC, and BIC, using the metrics_table function.

@chain metrics_table(pkfit_2cmp) begin
    leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
    rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
end
20×3 DataFrame
 Row │ Metric                      pk2cmp                                                                         pk1cmp
     │ String                      Any                                                                            Any
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Successful                  true                                                                           true
   2 │ Estimation Time             14.372                                                                         5.513
   3 │ Subjects                    120                                                                            120
   4 │ Fixed Parameters            1                                                                              1
   5 │ Optimized Parameters        10                                                                             6
   6 │ Conc Active Observations    1320                                                                           1320
   7 │ Conc Missing Observations   0                                                                              0
   8 │ Total Active Observations   1320                                                                           1320
   9 │ Total Missing Observations  0                                                                              0
  10 │ Likelihood Approximation    Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}  Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}
  11 │ LogLikelihood (LL)          1830.63                                                                        1231.19
  12 │ -2LL                        -3661.26                                                                       -2462.38
  13 │ AIC                         -3641.26                                                                       -2450.38
  14 │ BIC                         -3589.41                                                                       -2419.26
  15 │ (η-shrinkage) η₁            0.037                                                                          0.016
  16 │ (η-shrinkage) η₂            0.047                                                                          0.04
  17 │ (η-shrinkage) η₃            0.516                                                                          0.733
  18 │ (ϵ-shrinkage) Conc          0.185                                                                          0.105
  19 │ (η-shrinkage) η₄            0.287                                                                          missing
  20 │ (η-shrinkage) η₅            0.154                                                                          missing
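
To make the information criteria in this table less of a black box, the sketch below recomputes AIC and BIC from the reported log-likelihoods using the standard definitions AIC = -2LL + 2k and BIC = -2LL + k·ln(n). It assumes that k is the number of optimized parameters and n the number of active observations, which is consistent with the table values. The helper names aic_by_hand and bic_by_hand are illustrative and not part of Pumas.

```julia
# Values copied from the metrics table above
n_obs = 1320
fits = (
    pk2cmp = (ll = 1830.63, k = 10),
    pk1cmp = (ll = 1231.19, k = 6),
)

# Standard definitions; lower values indicate a better trade-off
# between goodness of fit and model complexity
aic_by_hand(ll, k) = -2 * ll + 2 * k
bic_by_hand(ll, k, n) = -2 * ll + k * log(n)

for (name, f) in pairs(fits)
    println(name, ": AIC = ", round(aic_by_hand(f.ll, f.k); digits = 2),
            ", BIC = ", round(bic_by_hand(f.ll, f.k, n_obs); digits = 2))
end
```

The results agree with the table up to the rounding of the reported log-likelihoods. Both criteria favor the 2-compartment model even after the penalty for its four extra parameters.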

We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect the diagnostics of our model fit.

res_inspect_1cmp = inspect(pkfit_1cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
res_inspect_2cmp = inspect(pkfit_2cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
gof_1cmp = goodness_of_fit(
    res_inspect_1cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

A 4-mosaic goodness of fit plot showing the 1-compartment model

Goodness of Fit Plots
gof_2cmp = goodness_of_fit(
    res_inspect_2cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

A 4-mosaic goodness of fit plot showing the 2-compartment model

Goodness of Fit Plots

These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.

We can look at a selected sample of individual plots.

fig_subject_fits = subject_fits(
    res_inspect_2cmp;
    separate = true,
    paginate = true,
    figure = (; fontsize = 18),
    axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (mg/L)"),
)
fig_subject_fits[1]

Trend plot with observations for 9 individual subjects over time

Subject Fits for 9 Individuals

There are a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we check the distribution of the empirical Bayes estimates.

empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)

A histogram for the empirical Bayes distribution of all subject-specific parameters

Empirical Bayes Distribution
empirical_bayes_vs_covariates(
    res_inspect_2cmp;
    categorical = [:Dose],
    figure = (; size = (600, 800)),
)

A histogram for the empirical Bayes distribution of all subject-specific parameters stratified by categorical covariates

Empirical Bayes Distribution Stratified by Covariates

The η₃ (ηka) distribution is clearly not centered around zero, which suggests that our guess at tvka is off-target. Let’s try to estimate tvka instead of fixing it to 2:

pkfit_2cmp_unfix_ka = fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -3.200734e+02     1.272671e+03
 * time: 2.09808349609375e-5
     1    -8.682982e+02     1.000199e+03
 * time: 1.3310010433197021
     2    -1.381870e+03     5.008081e+02
 * time: 3.7517240047454834
     3    -1.551053e+03     6.833490e+02
 * time: 4.063700914382935
     4    -1.680887e+03     1.834586e+02
 * time: 4.381908893585205
     5    -1.726118e+03     8.870274e+01
 * time: 4.754093885421753
     6    -1.761023e+03     1.162036e+02
 * time: 5.020036935806274
     7    -1.786619e+03     1.114552e+02
 * time: 5.329725980758667
     8    -1.863556e+03     9.914305e+01
 * time: 5.6320579051971436
     9    -1.882942e+03     5.342676e+01
 * time: 5.925926923751831
    10    -1.888020e+03     2.010181e+01
 * time: 6.210546016693115
    11    -1.889832e+03     1.867262e+01
 * time: 6.503090858459473
    12    -1.891649e+03     1.668510e+01
 * time: 6.792168855667114
    13    -1.892615e+03     1.820707e+01
 * time: 7.105893850326538
    14    -1.893453e+03     1.745193e+01
 * time: 7.395514965057373
    15    -1.894760e+03     1.850174e+01
 * time: 7.679613828659058
    16    -1.895647e+03     1.773921e+01
 * time: 7.9875569343566895
    17    -1.896597e+03     1.143421e+01
 * time: 8.301094055175781
    18    -1.897114e+03     9.720034e+00
 * time: 8.592867851257324
    19    -1.897373e+03     6.054160e+00
 * time: 8.906024932861328
    20    -1.897498e+03     3.985923e+00
 * time: 9.192986965179443
    21    -1.897571e+03     4.262502e+00
 * time: 9.492432832717896
    22    -1.897633e+03     4.010316e+00
 * time: 9.78183102607727
    23    -1.897714e+03     4.805389e+00
 * time: 10.073115825653076
    24    -1.897802e+03     3.508614e+00
 * time: 10.383512020111084
    25    -1.897865e+03     3.691472e+00
 * time: 10.668675899505615
    26    -1.897900e+03     2.982676e+00
 * time: 10.94167685508728
    27    -1.897928e+03     2.563863e+00
 * time: 11.232960939407349
    28    -1.897968e+03     3.261530e+00
 * time: 11.50447392463684
    29    -1.898013e+03     3.064695e+00
 * time: 11.775993824005127
    30    -1.898040e+03     1.636456e+00
 * time: 12.074547052383423
    31    -1.898051e+03     1.439998e+00
 * time: 12.390280961990356
    32    -1.898057e+03     1.436505e+00
 * time: 12.647377967834473
    33    -1.898069e+03     1.881592e+00
 * time: 12.92654299736023
    34    -1.898095e+03     3.253228e+00
 * time: 13.199355840682983
    35    -1.898142e+03     4.257954e+00
 * time: 13.476202964782715
    36    -1.898199e+03     3.685153e+00
 * time: 13.772341012954712
    37    -1.898245e+03     2.567377e+00
 * time: 14.048964023590088
    38    -1.898246e+03     2.561577e+00
 * time: 14.441490888595581
    39    -1.898251e+03     2.530928e+00
 * time: 14.778038024902344
    40    -1.898298e+03     2.673773e+00
 * time: 15.078016996383667
    41    -1.898300e+03     2.795859e+00
 * time: 15.404392004013062
    42    -1.898337e+03     3.666102e+00
 * time: 15.774720907211304
    43    -1.898342e+03     3.753077e+00
 * time: 16.181099891662598
    44    -1.898429e+03     4.461850e+00
 * time: 16.568289041519165
    45    -1.898461e+03     3.584769e+00
 * time: 16.843650817871094
    46    -1.898477e+03     2.357431e+00
 * time: 17.147682905197144
    47    -1.898479e+03     7.373685e-01
 * time: 17.42925190925598
    48    -1.898479e+03     6.197624e-01
 * time: 17.755481958389282
    49    -1.898480e+03     1.716594e+00
 * time: 18.1113338470459
    50    -1.898480e+03     1.128276e+00
 * time: 18.365583896636963
    51    -1.898480e+03     9.157160e-01
 * time: 18.679677963256836
    52    -1.898480e+03     3.051694e-01
 * time: 18.983877897262573
    53    -1.898480e+03     2.316257e-01
 * time: 19.24250888824463
    54    -1.898480e+03     2.316260e-01
 * time: 19.497082948684692
    55    -1.898480e+03     2.344996e-01
 * time: 19.954659938812256
    56    -1.898480e+03     2.372822e-01
 * time: 20.41219687461853
    57    -1.898480e+03     2.372809e-01
 * time: 20.883453845977783
    58    -1.898480e+03     2.372808e-01
 * time: 21.333900928497314
    59    -1.898480e+03     2.372808e-01
 * time: 21.81968593597412
    60    -1.898480e+03     2.372808e-01
 * time: 22.30205988883972
    61    -1.898480e+03     2.372808e-01
 * time: 22.787372827529907
    62    -1.898480e+03     2.372808e-01
 * time: 23.26146101951599
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  0             11

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:              NoObjectiveChange
Log-likelihood value:                    1898.4797

-----------------
       Estimate
-----------------
tvcl    2.6191
tvv    11.378
tvvp    8.4529
tvq     1.3164
tvka    4.8925
Ω₁,₁    0.13243
Ω₂,₂    0.059669
Ω₃,₃    0.41581
Ω₄,₄    0.080679
Ω₅,₅    0.24996
σ_p     0.049098
-----------------
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)
11×3 DataFrame
 Row │ parameter  pkfit_2cmp  pkfit_2cmp_unfix_ka
     │ String     Float64?    Float64?
─────┼───────────────────────────────────────────
   1 │ tvcl       2.81373     2.61912
   2 │ tvv        11.0047     11.3783
   3 │ tvvp       5.54005     8.45292
   4 │ tvq        1.51591     1.31636
   5 │ tvka       2.0         4.89251
   6 │ Ω₁,₁       0.102656    0.132433
   7 │ Ω₂,₂       0.060778    0.0596692
   8 │ Ω₃,₃       1.20118     0.415807
   9 │ Ω₄,₄       0.423466    0.080679
  10 │ Ω₅,₅       0.244713    0.249963
  11 │ σ_p        0.0484045   0.0490975
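
Because the model with tvka fixed is nested within the model that estimates it (one additional parameter), the gain in fit can be judged with the same likelihood ratio logic used for the compartment comparison. The following is a minimal sketch under that assumption, using the log-likelihood values from the two fit summaries and the standard chi-square critical values for 1 degree of freedom (3.841 at the 0.05 level, 10.828 at the 0.001 level). Variable names are illustrative.

```julia
# Log-likelihoods as reported in the two fit summaries
ll_fixed_ka = 1830.6304     # tvka fixed to 2 (10 optimized parameters)
ll_unfix_ka = 1898.4797     # tvka estimated (11 optimized parameters)

# Likelihood ratio statistic and degrees of freedom
lrt_stat = 2 * (ll_unfix_ka - ll_fixed_ka)
df = 11 - 10

# Standard chi-square critical values for 1 degree of freedom
crit_05 = 3.841
crit_001 = 10.828

println("statistic = ", round(lrt_stat; digits = 2), " on ", df, " degree of freedom")
println("exceeds 0.05 critical value: ", lrt_stat > crit_05)
println("exceeds 0.001 critical value: ", lrt_stat > crit_001)
```

In a Pumas session, the same comparison should be obtainable with lrtest(pkfit_2cmp, pkfit_2cmp_unfix_ka), the function used earlier for the 1- versus 2-compartment comparison.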

Let’s re-evaluate the goodness-of-fit and η distribution plots. There is not much change in the general goodness-of-fit plots:

res_inspect_2cmp_unfix_ka = inspect(pkfit_2cmp_unfix_ka)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(
    res_inspect_2cmp_unfix_ka;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

A 4-mosaic goodness of fit plot showing the 2-compartment model

Goodness of Fit Plots

However, the ηka (η₃) distribution shows a large improvement and is now centered around zero:

empirical_bayes_vs_covariates(
    res_inspect_2cmp_unfix_ka;
    categorical = [:Dose],
    ebes = [:η₃],
    figure = (; size = (600, 800)),
)

A histogram for the empirical Bayes distribution of all subject-specific parameters stratified by categorical covariates

Empirical Bayes Distribution Stratified by Covariates

Finally, let’s look at some individual plots. Note that page 6 of the paginated figure is shown here, whereas page 1 was shown earlier, so the subjects differ:

fig_subject_fits2 = subject_fits(
    res_inspect_2cmp_unfix_ka;
    separate = true,
    paginate = true,
    facet = (; linkyaxes = false),
    figure = (; fontsize = 18),
    axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (mg/L)"),
)
fig_subject_fits2[6]

Trend plot with observations for 9 individual subjects over time

Subject Fits for 9 Individuals

The fits do not look good for some of these individuals, but a visual predictive check (VPC) gives a better picture of overall model performance and helps us decide how to proceed.

3.4 Visual Predictive Checks (VPC)

We can now perform a VPC. The default plots show the 10th, 50th, and 90th percentiles (that is, the median and an 80% prediction interval), with a 95% simulated confidence interval (shaded area) around each of the quantiles.

pk_vpc = vpc(pkfit_2cmp_unfix_ka, 200; observations = [:Conc], stratify_by = [:Dose])
[ Info: Continuous VPC
Visual Predictive Check
  Type of VPC: Continuous VPC
  Simulated populations: 200
  Subjects in data: 120
  Stratification variable(s): [:Dose]
  Confidence level: 0.95
  VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(
    pk_2cmp,
    pk_vpc;
    rows = 1,
    columns = 3,
    figure = (; size = (1400, 1000), fontsize = 22),
    axis = (;
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted\n CTMx Concentration (mg/L)",
    ),
    facet = (; combinelabels = true),
)

A visual predictive plot stratified by dose group

Visual Predictive Plots

The visual predictive check suggests that the model captures the data well across all dose levels.
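
To illustrate what the shaded areas in a VPC represent, here is a conceptual sketch in plain Julia. It is not Pumas's internal implementation: the "simulations" are hypothetical log-normal draws standing in for model-simulated concentrations at a single time point, and all names are illustrative. The idea is to compute the 10th, 50th, and 90th percentiles within each of 200 simulated populations, then take a 95% interval of each percentile across the populations.

```julia
using Random, Statistics

rng = MersenneTwister(1234)

# Hypothetical stand-in for model simulations at one time point:
# 200 replicate populations of 120 subjects each
n_rep, n_subj = 200, 120
sims = [exp.(0.3 .* randn(rng, n_subj)) for _ in 1:n_rep]

# Quantiles computed within each simulated population
probs = [0.1, 0.5, 0.9]
rep_quantiles = [quantile(s, probs) for s in sims]

# 95% interval of each quantile across the simulated populations
bands = map(eachindex(probs)) do j
    qs = [q[j] for q in rep_quantiles]
    (prob = probs[j], lower = quantile(qs, 0.025), upper = quantile(qs, 0.975))
end

for b in bands
    println("quantile ", b.prob, ": 95% interval = [",
            round(b.lower; digits = 3), ", ", round(b.upper; digits = 3), "]")
end
```

In an actual VPC, this calculation is repeated over time (or time bins) and per stratum, and the observed quantiles are overlaid. A model is considered adequate when the observed quantile lines fall mostly inside the corresponding shaded intervals.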

4 Additional Help

If you have questions regarding this tutorial, please post them on our Discourse site.