A Comprehensive Introduction to Pumas

Authors

Vijay Ivaturi

Jose Storopoli

This tutorial provides a comprehensive introduction to a modeling and simulation workflow in Pumas. The idea is not to get into the details of Pumas specifics, but instead provide a narrative on the lines of a regular workflow in our day-to-day work, with brevity where required to allow a broad overview. Wherever possible, cross-references will be provided to documentation and detailed examples that provide deeper insight into a particular topic.

As part of this workflow, you will be introduced to various aspects such as:

  1. Data wrangling in Julia
  2. Exploratory analysis in Julia
  3. Continuous data non-linear mixed effects modeling in Pumas
  4. Model comparison routines, post-processing, validation etc.

1 The Study and Design

CTMNopain is a novel anti-inflammatory agent under preliminary investigation. A dose-ranging trial was conducted comparing placebo with 3 doses of CTMNopain (5mg, 20mg and 80 mg QD). The maximum tolerated dose is 160 mg per day. Plasma concentrations (mg/L) of the drug were measured at 0, 0.5, 1, 1.5, 2, 2.5, 3-8 hours.

Pain score (0=no pain, 1=mild, 2=moderate, 3=severe) were obtained at time points when plasma concentration was collected. A pain score of 2 or more is considered as no pain relief.

The subjects can request for remedication if pain relief is not achieved after 2 hours post dose. Some subjects had remedication before 2 hours if they were not able to bear the pain. The time to remedication and the remedication status is available for subjects.

The pharmacokinetic dataset can be accessed using PharmaDatasets.jl.

2 Setup

2.1 Load libraries

These libraries provide the workhorse functionality in the Pumas ecosystem:

using Pumas
using PumasUtilities
using NCA
using NCAUtilities

In addition, libraries below are good add-on’s that provide ancillary functionality:

using GLM: lm, @formula
using Random
using CSV
using DataFramesMeta
using CairoMakie
using PharmaDatasets

2.2 Data Wrangling

We start by reading in the dataset and making some quick summaries.

Tip

If you want to learn more about data wrangling, don’t forget to check our Data Wrangling in Julia tutorials!

pkpain_df = dataset("pk_painrelief")
first(pkpain_df, 5)
5×7 DataFrame
Row Subject Time Conc PainRelief PainScore RemedStatus Dose
Int64 Float64 Float64 Int64 Int64 Int64 String7
1 1 0.0 0.0 0 3 1 20 mg
2 1 0.5 1.15578 1 1 0 20 mg
3 1 1.0 1.37211 1 0 0 20 mg
4 1 1.5 1.30058 1 0 0 20 mg
5 1 2.0 1.19195 1 1 0 20 mg

Let’s filter out the placebo data as we don’t need that for the PK analysis.

pkpain_noplb_df = @rsubset pkpain_df :Dose != "Placebo";
first(pkpain_noplb_df, 5)
5×7 DataFrame
Row Subject Time Conc PainRelief PainScore RemedStatus Dose
Int64 Float64 Float64 Int64 Int64 Int64 String7
1 1 0.0 0.0 0 3 1 20 mg
2 1 0.5 1.15578 1 1 0 20 mg
3 1 1.0 1.37211 1 0 0 20 mg
4 1 1.5 1.30058 1 0 0 20 mg
5 1 2.0 1.19195 1 1 0 20 mg

3 Analysis

3.1 Non-compartmental analysis

Let’s begin by performing a quick NCA of the concentration time profiles and view the exposure changes across doses. The input data specification for NCA analysis requires the presence of a :route column and an :amt column that specifies the dose. So, let’s add that in:

@rtransform! pkpain_noplb_df begin
    :route = "ev"
    :Dose = parse(Int, chop(:Dose; tail = 3))
end

We also need to create an :amt column:

@rtransform! pkpain_noplb_df :amt = :Time == 0 ? :Dose : missing

Now, we map the data variables to the read_nca function that prepares the data for NCA analysis.

pkpain_nca = read_nca(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = :Conc,
    group = [:Dose],
    route = :route,
)
NCAPopulation (120 subjects):
  Group: [["Dose" => 5], ["Dose" => 20], ["Dose" => 80]]
  Number of missing observations: 0
  Number of blq observations: 0

Now that we mapped the data in, let’s visualize the concentration vs time plots for a few individuals. When paginate is set to true, a vector of plots are returned and below we display the first element with 9 individuals.

f = observations_vs_time(
    pkpain_nca;
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMNoPain Concentration (ng/mL)"),
)
f[1]

An observations versus time profile for all subjects

Observations versus Time

or you can view the summary curves by dose group as passed in to the group argument in read_nca

summary_observations_vs_time(
    pkpain_nca,
    figure = (; fontsize = 22, size = (800, 1000)),
    color = "black",
    linewidth = 3,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),
)

An observations versus time profile for all subjects in a summarized manner

Summary Observations versus Time

A full NCA Report is now obtained for completeness purposes using the run_nca function, but later we will only extract a couple of key metrics of interest.

pk_nca = run_nca(pkpain_nca; sigdigits = 3)

We can look at the NCA fits for some subjects. Here f is a vector or figures. We’ll showcase the first image by indexing f:

f = subject_fits(
    pk_nca,
    paginate = true,
    axis = (; xlabel = "Time (hr)", ylabel = "CTMX Concentration (μg/mL)"),

    # Legend options
    legend = (; position = :bottom),
)
f[1]

Trend plot with observations for all individual subjects over time

Subject Fits

As CTMNopain’s effect maybe mainly related to maximum concentration (cmax) or area under the curve (auc), we present some summary statistics using the summarize function from NCA.

strata = [:Dose]
1-element Vector{Symbol}:
 :Dose
params = [:cmax, :aucinf_obs]
2-element Vector{Symbol}:
 :cmax
 :aucinf_obs
output = summarize(pk_nca; stratify_by = strata, parameters = params)
6×10 DataFrame
Row Dose parameters numsamples minimum maximum mean std geomean geostd geomeanCV
Int64 String Int64 Float64 Float64 Float64 Float64 Float64 Float64 Float64
1 5 cmax 40 0.19 0.539 0.356075 0.0884129 0.345104 1.2932 26.1425
2 5 aucinf_obs 40 0.914 3.4 1.5979 0.490197 1.53373 1.32974 29.0868
3 20 cmax 40 0.933 2.7 1.4737 0.361871 1.43408 1.2633 23.6954
4 20 aucinf_obs 40 2.77 14.1 6.377 2.22239 6.02031 1.41363 35.6797
5 80 cmax 40 3.3 8.47 5.787 1.31957 5.64164 1.25757 23.2228
6 80 aucinf_obs 40 13.7 49.1 29.5 8.68984 28.2954 1.34152 30.0258

The statistics printed above are the default, but you can pass in your own statistics using the stats = [] argument to the summarize function.

We can look at a few parameter distribution plots.

parameters_vs_group(
    pk_nca,
    parameter = :cmax,
    axis = (; xlabel = "Dose (mg)", ylabel = "Cₘₐₓ (ng/mL)"),
    figure = (; fontsize = 18),
)

A violin plot for the Cmax distribution for each dose group

Cmax for each Dose Group

Dose normalized PK parameters, cmax and aucinf were essentially dose proportional between for 5 mg, 20 mg and 80 mg doses. You can perform a simple regression to check the impact of dose on cmax:

dp = NCA.DoseLinearityPowerModel(pk_nca, :cmax; level = 0.9)
Dose Linearity Power Model
Variable: cmax
Model: log(cmax) ~ log(α) + β × log(dose)
────────────────────────────────────
   Estimate  low CI 90%  high CI 90%
────────────────────────────────────
β   1.00775     0.97571       1.0398
────────────────────────────────────

Here’s a visualization for the dose linearity using a power model for cmax:

power_model(dp; legend = (; position = :bottom))

A dose linearity power model plot for Cmax

Dose Linearity Plot

We can also visualize a dose proportionality results with respect to a specific endpoint in a NCA Report; for example cmax and aucinf_obs:

dose_vs_dose_normalized(pk_nca, :cmax)

A dose proportionality plot for Cmax

Dose Proportionality Plot
dose_vs_dose_normalized(pk_nca, :aucinf_obs)

A dose proportionality plot for AUC

Dose Proportionality Plot

Based on visual inspection of the concentration time profiles as seen earlier, CTMNopain exhibited monophasic decline, and perhaps a one compartment model best fits the PK data.

3.2 Pharmacokinetic modeling

As seen from the plots above, the concentrations decline monoexponentially. We will evaluate both one and two compartment structural models to assess best fit. Further, different residual error models will also be tested.

We will use the results from NCA to provide us good initial estimates.

3.2.1 Data preparation for modeling

PumasNDF requires the presence of :evid and :cmt columns in the dataset.

@rtransform! pkpain_noplb_df begin
    :evid = :Time == 0 ? 1 : 0
    :cmt = :Time == 0 ? 1 : 2
    :cmt2 = 1 # for zero order absorption
end

Further, observations at time of dosing, i.e., when evid = 1 have to be missing

@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc

The dataframe will now be converted to a Population using read_pumas. Note that both observations and covariates are required to be an array even if it is one element.

pkpain_noplb = read_pumas(
    pkpain_noplb_df;
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = [:Conc],
    covariates = [:Dose],
    evid = :evid,
    cmt = :cmt,
)
Population
  Subjects: 120
  Covariates: Dose
  Observations: Conc

Now that the data is transformed to a Population of subjects, we can explore different models.

3.2.2 One-compartment model

Note

If you are not familiar yet with the @model blocks and syntax, please check our documentation.

pk_1cmp = @model begin

    @metadata begin
        desc = "One Compartment Model"
        timeu = u"hr"
    end

    @param begin
        """
        Clearance (L/hr)
        """
        tvcl  RealDomain(; lower = 0, init = 3.2)
        """
        Volume (L)
        """
        tvv  RealDomain(; lower = 0, init = 16.4)
        """
        Absorption rate constant (h-1)
        """
        tvka  RealDomain(; lower = 0, init = 3.8)
        """
          - ΩCL
          - ΩVc
          - ΩKa
        """
        Ω  PDiagDomain(init = [0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p  RealDomain(; lower = 0.0001, init = 0.2)
    end

    @random begin
        η ~ MvNormal(Ω)
    end

    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end

    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        """
        CTMx Concentration (ng/mL)
        """
        Conc ~ @. Normal(cp, abs(cp) * σ_p)
    end

end
Warning: Covariate Dose is not used in the model.
@ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/LN24a/src/dsl/model_macro.jl:3153
PumasModel
  Parameters: tvcl, tvv, tvka, Ω, σ_p
  Random effects: η
  Covariates: Dose
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: Conc
  Observed: Conc
Tip

Note that the local assignment := can be used to define intermediate statements that will not be carried outside of the block. This means that all the resulting data workflows from this model will not contain the intermediate variables defined with :=. We use this when we want to suppress the variable from any further output.

The idea behind := is for performance reasons. If you are not carrying the variable defined with := outside of the block, then it is not necessary to store it in the resulting data structures. Not only will your model run faster, but the resulting data structures will also be smaller.

Before going to fit the model, let’s evaluate some helpful steps via simulation to check appropriateness of data and model

# zero out the random effects
etas = zero_randeffs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp))

Above, we are generating a vector of η’s of the same length as the number of subjects to zero out the random effects. We do this as we are evaluating the trajectories of the concentrations at the initial set of parameters at a population level. Other helper functions here are sample_randeffs and init_randeffs. Please refer to the documentation.

simpk_iparams = simobs(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), etas)
Simulated population (Vector{<:Subject})
  Simulated subjects: 120
  Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_iparams;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (;
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
    ),
)

A simulated observations versus time plot overlaid with the scatter plot of the observed observations

Simulated Observations Plot

Our NCA based initial guess on the parameters seem to work well.

Lets change the initial estimate of a couple of the parameters to evaluate the sensitivity.

pkparam = (; init_params(pk_1cmp)..., tvka = 2, tvv = 10)
(tvcl = 3.2,
 tvv = 10,
 tvka = 2,
 Ω = [0.04 0.0 0.0; 0.0 0.04 0.0; 0.0 0.0 0.04],
 σ_p = 0.2,)
simpk_changedpars = simobs(pk_1cmp, pkpain_noplb, pkparam, etas)
Simulated population (Vector{<:Subject})
  Simulated subjects: 120
  Simulated variables: Conc
sim_plot(
    pk_1cmp,
    simpk_changedpars;
    observations = [:Conc],
    figure = (; fontsize = 18),
    axis = (
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted \n CTMx Concentration (ng/mL)",
    ),
)

A simulated observations versus time plot overlaid with the scatter plot of the observed observations

Simulated Observations Plot

Changing the tvka and decreasing the tvv seemed to make an impact and observations go through the simulated lines.

To get a quick ballpark estimate of your PK parameters, we can do a NaivePooled analysis.

3.2.2.1 NaivePooled
pkfit_np = fit(pk_1cmp, pkpain_noplb, init_params(pk_1cmp), NaivePooled(); omegas = (:Ω,))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     7.744356e+02     3.715711e+03
 * time: 0.07605504989624023
     1     2.343899e+02     1.747348e+03
 * time: 2.315423011779785
     2     9.696232e+01     1.198088e+03
 * time: 2.320127010345459
     3    -7.818699e+01     5.538151e+02
 * time: 2.323575973510742
     4    -1.234803e+02     2.462514e+02
 * time: 2.326970100402832
     5    -1.372888e+02     2.067458e+02
 * time: 2.3318960666656494
     6    -1.410579e+02     1.162950e+02
 * time: 2.335287094116211
     7    -1.434754e+02     5.632816e+01
 * time: 2.338449001312256
     8    -1.453401e+02     7.859270e+01
 * time: 2.3415050506591797
     9    -1.498185e+02     1.455606e+02
 * time: 2.3447279930114746
    10    -1.534371e+02     1.303682e+02
 * time: 2.3479771614074707
    11    -1.563557e+02     5.975474e+01
 * time: 2.351196050643921
    12    -1.575052e+02     9.308611e+00
 * time: 2.3542771339416504
    13    -1.579357e+02     1.234484e+01
 * time: 2.3575069904327393
    14    -1.581874e+02     7.478196e+00
 * time: 2.3606221675872803
    15    -1.582981e+02     2.027162e+00
 * time: 2.3637871742248535
    16    -1.583375e+02     5.578262e+00
 * time: 2.366858959197998
    17    -1.583556e+02     4.727050e+00
 * time: 2.370090961456299
    18    -1.583644e+02     2.340173e+00
 * time: 2.373183012008667
    19    -1.583680e+02     7.738100e-01
 * time: 2.376235008239746
    20    -1.583696e+02     3.300689e-01
 * time: 2.379223108291626
    21    -1.583704e+02     3.641985e-01
 * time: 2.3823740482330322
    22    -1.583707e+02     4.365901e-01
 * time: 2.3854329586029053
    23    -1.583709e+02     3.887800e-01
 * time: 2.388512134552002
    24    -1.583710e+02     2.766977e-01
 * time: 2.3915271759033203
    25    -1.583710e+02     1.758029e-01
 * time: 2.394792079925537
    26    -1.583710e+02     1.133947e-01
 * time: 2.397952079772949
    27    -1.583710e+02     7.922544e-02
 * time: 2.401085138320923
    28    -1.583710e+02     5.954998e-02
 * time: 2.404100179672241
    29    -1.583710e+02     4.157080e-02
 * time: 2.4073121547698975
    30    -1.583710e+02     4.295446e-02
 * time: 2.410396099090576
    31    -1.583710e+02     5.170752e-02
 * time: 2.4134249687194824
    32    -1.583710e+02     2.644382e-02
 * time: 2.4174771308898926
    33    -1.583710e+02     4.548987e-03
 * time: 2.4213991165161133
    34    -1.583710e+02     2.501805e-02
 * time: 2.4255411624908447
    35    -1.583710e+02     3.763439e-02
 * time: 2.4285590648651123
    36    -1.583710e+02     3.206027e-02
 * time: 2.4316530227661133
    37    -1.583710e+02     1.003700e-02
 * time: 2.434627056121826
    38    -1.583710e+02     2.209084e-02
 * time: 2.437901020050049
    39    -1.583710e+02     4.954136e-03
 * time: 2.4409351348876953
    40    -1.583710e+02     1.609366e-02
 * time: 2.445019006729126
    41    -1.583710e+02     1.579810e-02
 * time: 2.4482710361480713
    42    -1.583710e+02     1.014156e-03
 * time: 2.4513840675354004
    43    -1.583710e+02     6.050792e-03
 * time: 2.455404043197632
    44    -1.583710e+02     1.354381e-02
 * time: 2.4585909843444824
    45    -1.583710e+02     4.473216e-03
 * time: 2.461724042892456
    46    -1.583710e+02     4.645458e-03
 * time: 2.4648079872131348
    47    -1.583710e+02     9.828063e-03
 * time: 2.4678330421447754
    48    -1.583710e+02     1.047215e-03
 * time: 2.4711029529571533
    49    -1.583710e+02     8.374104e-03
 * time: 2.474297046661377
    50    -1.583710e+02     7.841995e-04
 * time: 2.4775030612945557
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                    158.37103

------------------
         Estimate
------------------
  tvcl    3.0054
  tvv    14.089
  tvka   44.227
† Ω₁,₁    0.0
† Ω₂,₂    0.0
† Ω₃,₃    0.0
  σ_p     0.32999
------------------
† indicates constant parameters
coefficients_table(pkfit_np)
7×4 DataFrame
Row Parameter Description Constant Estimate
String SubStrin… Bool Float64
1 tvcl Clearance (L/hr) false 3.005
2 tvv Volume (L) false 14.089
3 tvka Absorption rate constant (h-1) false 44.227
4 Ω₁,₁ ΩCL true 0.0
5 Ω₂,₂ ΩVc true 0.0
6 Ω₃,₃ ΩKa true 0.0
7 σ_p Proportional RUV false 0.33

The final estimates from the NaivePooled approach seem reasonably close to our initial guess from NCA, except for the tvka parameter. We will stick with our initial guess.

One way to be cautious before going into a complete fitting routine is to evaluate the likelihood of the individual subjects given the initial parameter values and see if any subject(s) pops out as unreasonable. There are a few ways of doing this:

  • check the loglikelihood subject wise
  • check if there any influential subjects

Below, we are basically checking if the initial estimates for any subject are way off that we are unable to compute the initial loglikelihood.

lls = [loglikelihood(pk_1cmp, subj, pkparam, FOCE()) for subj in pkpain_noplb]
# the plot below is using native CairoMakie `hist`
hist(lls; bins = 10, normalization = :none, color = (:black, 0.5))

A histogram of the individual loglikelihoods

Histogram of Loglikelihoods

The distribution of the loglikelihood’s suggest no extreme outliers.

A more convenient way is to use the findinfluential function that provides a list of k top influential subjects by showing the normalized (minus) loglikelihood for each subject. As you can see below, the minus loglikelihood in the range of 16 agrees with the histogram plotted above.

influential_subjects = findinfluential(pk_1cmp, pkpain_noplb, pkparam, FOCE())
120-element Vector{@NamedTuple{id::String, nll::Float64}}:
 (id = "148", nll = 16.65965885684477)
 (id = "135", nll = 16.648985190076335)
 (id = "156", nll = 15.959069556607496)
 (id = "159", nll = 15.441218240496484)
 (id = "149", nll = 14.71513464411951)
 (id = "88", nll = 13.09709837464614)
 (id = "16", nll = 12.98228052193144)
 (id = "61", nll = 12.65218290230368)
 (id = "71", nll = 12.500330088085505)
 (id = "59", nll = 12.241510254805235)
 ⋮
 (id = "57", nll = -22.79767423253431)
 (id = "93", nll = -22.836900711478208)
 (id = "12", nll = -23.007742339519247)
 (id = "123", nll = -23.292751843079234)
 (id = "41", nll = -23.425412534960515)
 (id = "99", nll = -23.535214841901112)
 (id = "29", nll = -24.025959868383083)
 (id = "52", nll = -24.164757842493685)
 (id = "24", nll = -25.57209232565845)
3.2.2.2 FOCE

Now that we have a good handle on our data, lets go ahead and fit a population model with FOCE:

pkfit_1cmp = fit(pk_1cmp, pkpain_noplb, pkparam, FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -5.935351e+02     5.597318e+02
 * time: 2.8848648071289062e-5
     1    -7.022088e+02     1.707063e+02
 * time: 0.6749019622802734
     2    -7.314067e+02     2.903269e+02
 * time: 0.7896158695220947
     3    -8.520591e+02     2.285888e+02
 * time: 0.9046328067779541
     4    -1.120191e+03     3.795410e+02
 * time: 1.2159268856048584
     5    -1.178784e+03     2.323978e+02
 * time: 1.3268909454345703
     6    -1.218320e+03     9.699907e+01
 * time: 1.4310028553009033
     7    -1.223641e+03     5.862105e+01
 * time: 2.0299289226531982
     8    -1.227620e+03     1.831402e+01
 * time: 2.1307928562164307
     9    -1.228381e+03     2.132323e+01
 * time: 2.2298879623413086
    10    -1.230098e+03     2.921228e+01
 * time: 4.391286849975586
    11    -1.230854e+03     2.029662e+01
 * time: 4.490128993988037
    12    -1.231116e+03     5.229099e+00
 * time: 4.583923816680908
    13    -1.231179e+03     1.689232e+00
 * time: 4.681553840637207
    14    -1.231187e+03     1.215379e+00
 * time: 4.782559871673584
    15    -1.231188e+03     2.770381e-01
 * time: 4.878432989120483
    16    -1.231188e+03     1.636652e-01
 * time: 4.964958906173706
    17    -1.231188e+03     2.701149e-01
 * time: 5.0537238121032715
    18    -1.231188e+03     3.163341e-01
 * time: 5.144328832626343
    19    -1.231188e+03     1.505153e-01
 * time: 5.231457948684692
    20    -1.231188e+03     2.485002e-02
 * time: 5.392660856246948
    21    -1.231188e+03     8.435209e-04
 * time: 5.448892831802368
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                     1231.188

-------------------
         Estimate
-------------------
  tvcl    3.1642
  tvv    13.288
† tvka    2.0
  Ω₁,₁    0.08494
  Ω₂,₂    0.048568
  Ω₃,₃    5.5811
  σ_p     0.10093
-------------------
† indicates constant parameters
infer(pkfit_1cmp)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1              6

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                     1231.188

---------------------------------------------------------
         Estimate    SE          95.0% C.I.
---------------------------------------------------------
  tvcl    3.1642     0.08662     [  2.9944  ;  3.334   ]
  tvv    13.288      0.27481     [ 12.749   ; 13.827   ]
† tvka    2.0        NaN         [  NaN     ;  NaN     ]
  Ω₁,₁    0.08494    0.011022    [  0.063338;  0.10654 ]
  Ω₂,₂    0.048568   0.0063502   [  0.036122;  0.061014]
  Ω₃,₃    5.5811     1.2188      [  3.1922  ;  7.97    ]
  σ_p     0.10093    0.0057196   [  0.089718;  0.11214 ]
---------------------------------------------------------
† indicates constant parameters

Notice that tvka is fixed to 2 as we don’t have a lot of information before tmax. From the results above, we see that the parameter precision for this model is reasonable.

3.2.3 Two-compartment model

Just to be sure, let’s fit a 2-compartment model and evaluate:

pk_2cmp = @model begin

    @param begin
        """
        Clearance (L/hr)
        """
        tvcl  RealDomain(; lower = 0, init = 3.2)
        """
        Central Volume (L)
        """
        tvv  RealDomain(; lower = 0, init = 16.4)
        """
        Peripheral Volume (L)
        """
        tvvp  RealDomain(; lower = 0, init = 10)
        """
        Distributional Clearance (L/hr)
        """
        tvq  RealDomain(; lower = 0, init = 2)
        """
        Absorption rate constant (h-1)
        """
        tvka  RealDomain(; lower = 0, init = 1.3)
        """
          - ΩCL
          - ΩVc
          - ΩKa
          - ΩVp
          - ΩQ
        """
        Ω  PDiagDomain(init = [0.04, 0.04, 0.04, 0.04, 0.04])
        """
        Proportional RUV
        """
        σ_p  RealDomain(; lower = 0.0001, init = 0.2)
    end

    @random begin
        η ~ MvNormal(Ω)
    end

    @covariates begin
        """
        Dose (mg)
        """
        Dose
    end

    @pre begin
        CL = tvcl * exp(η[1])
        Vc = tvv * exp(η[2])
        Ka = tvka * exp(η[3])
        Vp = tvvp * exp(η[4])
        Q = tvq * exp(η[5])
    end

    @dynamics Depots1Central1Periph1

    @derived begin
        cp := @. Central / Vc
        """
        CTMx Concentration (ng/mL)
        """
        Conc ~ @. Normal(cp, cp * σ_p)
    end
end
Warning: Covariate Dose is not used in the model.
@ Pumas ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/LN24a/src/dsl/model_macro.jl:3153
PumasModel
  Parameters: tvcl, tvv, tvvp, tvq, tvka, Ω, σ_p
  Random effects: η
  Covariates: Dose
  Dynamical system variables: Depot, Central, Peripheral
  Dynamical system type: Closed form
  Derived: Conc
  Observed: Conc
3.2.3.1 FOCE
pkfit_2cmp =
    fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE(); constantcoef = (; tvka = 2))
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -6.302369e+02     1.021050e+03
 * time: 1.5974044799804688e-5
     1    -9.197817e+02     9.927951e+02
 * time: 0.7824268341064453
     2    -1.372640e+03     2.054986e+02
 * time: 1.0309619903564453
     3    -1.446326e+03     1.543987e+02
 * time: 1.271009922027588
     4    -1.545570e+03     1.855028e+02
 * time: 1.5110459327697754
     5    -1.581449e+03     1.713157e+02
 * time: 1.8391668796539307
     6    -1.639433e+03     1.257382e+02
 * time: 2.087116003036499
     7    -1.695964e+03     7.450539e+01
 * time: 2.315091848373413
     8    -1.722243e+03     5.961044e+01
 * time: 2.542961835861206
     9    -1.736883e+03     7.320921e+01
 * time: 2.7721588611602783
    10    -1.753547e+03     7.501938e+01
 * time: 3.0058889389038086
    11    -1.764053e+03     6.185661e+01
 * time: 3.2697129249572754
    12    -1.778991e+03     4.831033e+01
 * time: 3.5174968242645264
    13    -1.791492e+03     4.943278e+01
 * time: 3.77160382270813
    14    -1.799847e+03     2.871410e+01
 * time: 4.05883002281189
    15    -1.805374e+03     7.520790e+01
 * time: 4.327724933624268
    16    -1.816260e+03     2.990621e+01
 * time: 4.586523056030273
    17    -1.818252e+03     2.401915e+01
 * time: 4.81342887878418
    18    -1.822988e+03     2.587225e+01
 * time: 5.065968036651611
    19    -1.824653e+03     1.550517e+01
 * time: 5.287530899047852
    20    -1.826074e+03     1.788927e+01
 * time: 5.516044855117798
    21    -1.826821e+03     1.888389e+01
 * time: 5.741992950439453
    22    -1.827900e+03     1.432840e+01
 * time: 5.988553047180176
    23    -1.828511e+03     9.422040e+00
 * time: 6.229126930236816
    24    -1.828754e+03     5.363445e+00
 * time: 6.475062847137451
    25    -1.828862e+03     4.916168e+00
 * time: 6.711834907531738
    26    -1.829007e+03     4.695750e+00
 * time: 6.967119932174683
    27    -1.829358e+03     1.090244e+01
 * time: 7.210114002227783
    28    -1.829830e+03     1.451320e+01
 * time: 7.451592922210693
    29    -1.830201e+03     1.108695e+01
 * time: 7.698290824890137
    30    -1.830360e+03     2.892317e+00
 * time: 7.9604949951171875
    31    -1.830390e+03     1.699265e+00
 * time: 8.195772886276245
    32    -1.830404e+03     1.602222e+00
 * time: 8.424339056015015
    33    -1.830432e+03     2.823676e+00
 * time: 8.655365943908691
    34    -1.830475e+03     4.121601e+00
 * time: 8.8936128616333
    35    -1.830527e+03     5.080494e+00
 * time: 9.159162044525146
    36    -1.830591e+03     2.668323e+00
 * time: 9.406501054763794
    37    -1.830615e+03     3.522601e+00
 * time: 9.656627893447876
    38    -1.830623e+03     2.203940e+00
 * time: 9.893674850463867
    39    -1.830625e+03     1.642394e+00
 * time: 10.15145993232727
    40    -1.830627e+03     9.396311e-01
 * time: 10.402334928512573
    41    -1.830628e+03     8.588414e-01
 * time: 10.65546989440918
    42    -1.830628e+03     3.457037e-01
 * time: 10.879348039627075
    43    -1.830629e+03     4.556038e-01
 * time: 11.103724002838135
    44    -1.830630e+03     6.366787e-01
 * time: 11.32477593421936
    45    -1.830630e+03     4.104090e-01
 * time: 11.564074993133545
    46    -1.830630e+03     7.434196e-02
 * time: 11.773545980453491
    47    -1.830630e+03     7.316846e-02
 * time: 12.032345056533813
    48    -1.830630e+03     7.320992e-02
 * time: 12.325500011444092
    49    -1.830630e+03     7.471716e-02
 * time: 12.550195932388306
    50    -1.830630e+03     7.471716e-02
 * time: 12.828661918640137
    51    -1.830630e+03     7.471716e-02
 * time: 12.982517957687378
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  1             10

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                      NoXChange
Log-likelihood value:                    1830.6304

-------------------
         Estimate
-------------------
  tvcl    2.8137
  tvv    11.005
  tvvp    5.5401
  tvq     1.5159
† tvka    2.0
  Ω₁,₁    0.10266
  Ω₂,₂    0.060778
  Ω₃,₃    1.2012
  Ω₄,₄    0.42347
  Ω₅,₅    0.24471
  σ_p     0.048404
-------------------
† indicates constant parameters

3.3 Comparing One- versus Two-compartment models

The 2-compartment model has a much lower objective function compared to the 1-compartment. Let’s compare the estimates from the 2 models using the compare_estimates function.

compare_estimates(; pkfit_1cmp, pkfit_2cmp)
11×3 DataFrame
Row parameter pkfit_1cmp pkfit_2cmp
String Float64? Float64?
1 tvcl 3.1642 2.81373
2 tvv 13.288 11.0047
3 tvka 2.0 2.0
4 Ω₁,₁ 0.0849405 0.102656
5 Ω₂,₂ 0.0485682 0.060778
6 Ω₃,₃ 5.58107 1.20118
7 σ_p 0.100928 0.0484045
8 tvvp missing 5.54005
9 tvq missing 1.51591
10 Ω₄,₄ missing 0.423466
11 Ω₅,₅ missing 0.244713

We perform a likelihood ratio test to compare the two nested models. The test statistic and the \(p\)-value clearly indicate that a 2-compartment model should be preferred.

lrtest(pkfit_1cmp, pkfit_2cmp)
Statistic:          1200.0
Degrees of freedom:      4
P-value:               0.0

We should also compare the other metrics and statistics, such ηshrinkage, ϵshrinkage, aic, and bic using the metrics_table function.

@chain metrics_table(pkfit_2cmp) begin
    leftjoin(metrics_table(pkfit_1cmp); on = :Metric, makeunique = true)
    rename!(:Value => :pk2cmp, :Value_1 => :pk1cmp)
end
WARNING: using deprecated binding Distributions.MatrixReshaped in Pumas.
, use Distributions.ReshapedDistribution{2, S, D} where D<:Distributions.Distribution{Distributions.ArrayLikeVariate{1}, S} where S<:Distributions.ValueSupport instead.
20×3 DataFrame
Row Metric pk2cmp pk1cmp
Any Any Any
1 Successful true true
2 Estimation Time 12.985 5.449
3 Subjects 120 120
4 Fixed Parameters 1 1
5 Optimized Parameters 10 6
6 Conc Active Observations 1320 1320
7 Conc Missing Observations 0 0
8 Total Active Observations 1320 1320
9 Total Missing Observations 0 0
10 Likelihood Approximation Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}} Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}
11 RawTypst("LogLikelihood (LL)") 1830.63 1231.19
12 RawTypst("-2LL") -3661.26 -2462.38
13 AIC -3641.26 -2450.38
14 BIC -3589.41 -2419.26
15 RawTypst("(η-shrinkage) η₁") 0.037 0.016
16 RawTypst("(η-shrinkage) η₂") 0.047 0.04
17 RawTypst("(η-shrinkage) η₃") 0.516 0.733
18 RawTypst("(ϵ-shrinkage) Conc") 0.185 0.105
19 RawTypst("(η-shrinkage) η₄") 0.287 missing
20 RawTypst("(η-shrinkage) η₅") 0.154 missing

We next generate some goodness of fit plots to compare which model is performing better. To do this, we first inspect the diagnostics of our model fit.

res_inspect_1cmp = inspect(pkfit_1cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
res_inspect_2cmp = inspect(pkfit_2cmp)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
gof_1cmp = goodness_of_fit(
    res_inspect_1cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

A 4-mosaic goodness of fit plot showing the 1-compartment model

Goodness of Fit Plots
gof_2cmp = goodness_of_fit(
    res_inspect_2cmp;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

Trend plot with observations for all individual subjects over time

Subject Fits

These plots clearly indicate that the 2-compartment model is a better fit compared to the 1-compartment model.

We can look at selected sample of individual plots.

fig_subject_fits = subject_fits(
    res_inspect_2cmp;
    separate = true,
    paginate = true,
    figure = (; fontsize = 18),
    axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
)
fig_subject_fits[1]

Trend plot with observations for 9 individual subjects over time

Subject Fits for 9 Individuals

There a lot of important plotting functions you can use for your standard model diagnostics. Please make sure to read the documentation for plotting. Below, we are checking the distribution of the empirical Bayes estimates.

empirical_bayes_dist(res_inspect_2cmp; zeroline_color = :red)

A histogram for the empirical Bayes distribution of all subject-specific parameters

Empirical Bayes Distribution
empirical_bayes_vs_covariates(
    res_inspect_2cmp;
    categorical = [:Dose],
    figure = (; size = (600, 800)),
)

A histogram for the empirical Bayes distribution of all subject-specific parameters stratified by categorical covariates

Empirical Bayes Distribution Stratified by Covariates

Clearly, our guess at tvka seems off-target. Let’s try and estimate tvka instead of fixing it to 2:

pkfit_2cmp_unfix_ka = fit(pk_2cmp, pkpain_noplb, init_params(pk_2cmp), FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0    -3.200734e+02     1.272671e+03
 * time: 2.8848648071289062e-5
     1    -8.682982e+02     1.000199e+03
 * time: 3.590601921081543
     2    -1.381870e+03     5.008081e+02
 * time: 3.8490729331970215
     3    -1.551053e+03     6.833490e+02
 * time: 4.141964912414551
     4    -1.680887e+03     1.834586e+02
 * time: 4.557896852493286
     5    -1.726118e+03     8.870274e+01
 * time: 4.797769784927368
     6    -1.761023e+03     1.162036e+02
 * time: 5.143439769744873
     7    -1.786619e+03     1.114552e+02
 * time: 5.388704776763916
     8    -1.863556e+03     9.914305e+01
 * time: 5.70662784576416
     9    -1.882942e+03     5.342676e+01
 * time: 5.965419769287109
    10    -1.888020e+03     2.010181e+01
 * time: 6.261753797531128
    11    -1.889832e+03     1.867262e+01
 * time: 6.580699920654297
    12    -1.891649e+03     1.668510e+01
 * time: 6.886764764785767
    13    -1.892615e+03     1.820707e+01
 * time: 7.185096979141235
    14    -1.893453e+03     1.745193e+01
 * time: 7.459592819213867
    15    -1.894760e+03     1.850174e+01
 * time: 7.76335597038269
    16    -1.895647e+03     1.773921e+01
 * time: 8.038359880447388
    17    -1.896597e+03     1.143421e+01
 * time: 8.314603805541992
    18    -1.897114e+03     9.720034e+00
 * time: 8.607655763626099
    19    -1.897373e+03     6.054160e+00
 * time: 8.878468990325928
    20    -1.897498e+03     3.985923e+00
 * time: 9.149141788482666
    21    -1.897571e+03     4.262502e+00
 * time: 9.436415910720825
    22    -1.897633e+03     4.010316e+00
 * time: 9.697462797164917
    23    -1.897714e+03     4.805389e+00
 * time: 9.959414958953857
    24    -1.897802e+03     3.508614e+00
 * time: 10.246753931045532
    25    -1.897865e+03     3.691472e+00
 * time: 10.509668827056885
    26    -1.897900e+03     2.982676e+00
 * time: 10.770537853240967
    27    -1.897928e+03     2.563863e+00
 * time: 11.070770978927612
    28    -1.897968e+03     3.261530e+00
 * time: 11.331854820251465
    29    -1.898013e+03     3.064695e+00
 * time: 11.598069906234741
    30    -1.898040e+03     1.636456e+00
 * time: 11.88095498085022
    31    -1.898051e+03     1.439998e+00
 * time: 12.139494895935059
    32    -1.898057e+03     1.436505e+00
 * time: 12.401907920837402
    33    -1.898069e+03     1.881592e+00
 * time: 12.663623809814453
    34    -1.898095e+03     3.253228e+00
 * time: 12.946459770202637
    35    -1.898142e+03     4.257954e+00
 * time: 13.228419780731201
    36    -1.898199e+03     3.685153e+00
 * time: 13.492685794830322
    37    -1.898245e+03     2.567377e+00
 * time: 13.789976835250854
    38    -1.898246e+03     2.561577e+00
 * time: 14.166137933731079
    39    -1.898251e+03     2.530928e+00
 * time: 14.519227981567383
    40    -1.898298e+03     2.673773e+00
 * time: 14.792173862457275
    41    -1.898300e+03     2.795859e+00
 * time: 15.141010999679565
    42    -1.898337e+03     3.666102e+00
 * time: 15.483920812606812
    43    -1.898342e+03     3.753077e+00
 * time: 15.927880764007568
    44    -1.898429e+03     4.461850e+00
 * time: 16.275630950927734
    45    -1.898461e+03     3.584769e+00
 * time: 16.588422775268555
    46    -1.898477e+03     2.357431e+00
 * time: 16.86016583442688
    47    -1.898479e+03     7.373685e-01
 * time: 17.13435387611389
    48    -1.898479e+03     6.197624e-01
 * time: 17.49106478691101
    49    -1.898480e+03     1.716594e+00
 * time: 17.830471992492676
    50    -1.898480e+03     1.128276e+00
 * time: 18.112990856170654
    51    -1.898480e+03     9.157160e-01
 * time: 18.42747187614441
    52    -1.898480e+03     3.051694e-01
 * time: 18.7495379447937
    53    -1.898480e+03     2.316257e-01
 * time: 19.03283977508545
    54    -1.898480e+03     2.316260e-01
 * time: 19.29244589805603
    55    -1.898480e+03     2.344996e-01
 * time: 19.769754886627197
    56    -1.898480e+03     2.372822e-01
 * time: 20.249743938446045
    57    -1.898480e+03     2.372809e-01
 * time: 20.7313449382782
    58    -1.898480e+03     2.372808e-01
 * time: 21.175105810165405
    59    -1.898480e+03     2.372808e-01
 * time: 21.65797996520996
    60    -1.898480e+03     2.372808e-01
 * time: 22.13892698287964
    61    -1.898480e+03     2.372808e-01
 * time: 22.614396810531616
    62    -1.898480e+03     2.372808e-01
 * time: 23.0898699760437
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                            120

Observation records:         Active        Missing
    Conc:                      1320              0
    Total:                     1320              0

Number of parameters:      Constant      Optimized
                                  0             11

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:              NoObjectiveChange
Log-likelihood value:                    1898.4797

-----------------
       Estimate
-----------------
tvcl    2.6191
tvv    11.378
tvvp    8.4529
tvq     1.3164
tvka    4.8925
Ω₁,₁    0.13243
Ω₂,₂    0.059669
Ω₃,₃    0.41581
Ω₄,₄    0.080679
Ω₅,₅    0.24996
σ_p     0.049098
-----------------
compare_estimates(; pkfit_2cmp, pkfit_2cmp_unfix_ka)
11×3 DataFrame
Row parameter pkfit_2cmp pkfit_2cmp_unfix_ka
String Float64? Float64?
1 tvcl 2.81373 2.61912
2 tvv 11.0047 11.3783
3 tvvp 5.54005 8.45292
4 tvq 1.51591 1.31636
5 tvka 2.0 4.89251
6 Ω₁,₁ 0.102656 0.132433
7 Ω₂,₂ 0.060778 0.0596692
8 Ω₃,₃ 1.20118 0.415807
9 Ω₄,₄ 0.423466 0.080679
10 Ω₅,₅ 0.244713 0.249963
11 σ_p 0.0484045 0.0490975

Let’s revaluate the goodness of fits and η distribution plots.

Not much change in the general gof plots

res_inspect_2cmp_unfix_ka = inspect(pkfit_2cmp_unfix_ka)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(
    res_inspect_2cmp_unfix_ka;
    figure = (; fontsize = 12),
    legend = (; position = :bottom),
)

A 4-mosaic goodness of fit plot showing the 2-compartment model

Goodness of Fit Plots

But you can see a huge improvement in the ηka, (η₃) distribution which is now centered around zero

empirical_bayes_vs_covariates(
    res_inspect_2cmp_unfix_ka;
    categorical = [:Dose],
    ebes = [:η₃],
    figure = (; size = (600, 800)),
)

A histogram for the empirical Bayes distribution of all subject-specific parameters stratified by categorical covariates

Empirical Bayes Distribution Stratified by Covariates

Finally looking at some individual plots for the same subjects as earlier:

fig_subject_fits2 = subject_fits(
    res_inspect_2cmp_unfix_ka;
    separate = true,
    paginate = true,
    facet = (; linkyaxes = false),
    figure = (; fontsize = 18),
    axis = (; xlabel = "Time (hr)", ylabel = "CTMx Concentration (ng/mL)"),
)
fig_subject_fits2[6]

Trend plot with observations for 9 individual subjects over time

Subject Fits for 9 Individuals

The randomly sampled individual fits don’t seem good in some individuals, but we can evaluate this via a vpc to see how to go about.

3.4 Visual Predictive Checks (VPC)

We can now perform a vpc to check. The default plots provide a 80% prediction interval and a 95% simulated CI (shaded area) around each of the quantiles

pk_vpc = vpc(pkfit_2cmp_unfix_ka, 200; observations = [:Conc], stratify_by = [:Dose])
[ Info: Continuous VPC
Visual Predictive Check
  Type of VPC: Continuous VPC
  Simulated populations: 200
  Subjects in data: 120
  Stratification variable(s): [:Dose]
  Confidence level: 0.95
  VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(
    pk_2cmp,
    pk_vpc;
    rows = 1,
    columns = 3,
    figure = (; size = (1400, 1000), fontsize = 22),
    axis = (;
        xlabel = "Time (hr)",
        ylabel = "Observed/Predicted\n CTMx Concentration (ng/mL)",
    ),
    facet = (; combinelabels = true),
)

A visual predictive plot stratified by dose group

Visual Predictive Plots

The visual predictive check suggests that the model captures the data well across all dose levels.

4 Additional Help

If you have questions regarding this tutorial, please post them on our discourse site.