Case Study II - Building a PopPK model with multiple doses

Authors

Patrick Kofod Mogensen

Vijay Ivaturi

using Pumas
using DataFramesMeta
using PumasUtilities
using CairoMakie

1 Second of three case studies

This second tutorial based on in a series of case studies found here . The second case study is not too different from the first. It has oral dosing so a slightly more advanced model is needed. For this reason, we add some bonus material to this case study not found in the original. We show how to perform visual predictive checks, bootstrap and SIR inference, and how to simulate alternative dosing regimens.

2 Getting the data

Please download the data file for this tutorial and place it in the same location as this tutorial. The data file consists of data obtained from 10 individuals who were treated with 500mg dose every eight hours for five days. The dataset is loaded it into memory as a DataFrame using CSV.read. We use the first(input, number_of_elements) function to show the first 10 rows of the DataFrame.

using CSV
pkdata = CSV.read("CS2_oralestaddl.csv", DataFrame; header = 5)
first(pkdata, 10)
10×7 DataFrame
Row CID TIME CONC AMT ADDL II MDV
Int64 Float64 Float64 Int64 Int64 Int64 Int64
1 1 0.0 0.0 500 14 8 1
2 1 0.5 4.5799 0 0 0 0
3 1 1.0 8.5717 0 0 0 0
4 1 3.0 10.979 0 0 0 0
5 1 4.0 11.347 0 0 0 0
6 1 6.0 8.7705 0 0 0 0
7 1 8.0 7.0155 0 0 0 0
8 1 8.5 10.889 0 0 0 0
9 1 16.0 9.7015 0 0 0 0
10 1 16.5 14.115 0 0 0 0

The file contains 11 columns:

  • CID: subject identification number.
  • TIME: Blood sampling time. Needs to be monotonically increasing.
  • CONC: The dependent variable. In this case, a plasma concentration.
  • AMT: Amount of drug administered at dosing time in the TIME column or zero for observation rows
  • DOSE: Dose group. Included for post-processing and asset generation (tables and plots).
  • ADDL: Number of additional doses. The value of the column is 14 in event rows for a total of 15 doses (3 a day for 5 days).
  • II: the inter-dose interval. The value is 8 (Hr) in event rows.
  • MDV: Missing dependent variable. Can be used to flag observations that should be ignored.

The names in the dataset reflects that it originates from a NONMEM case study. There are several conventions that are historical in nature and that relate to the functionality of NONMEM. For example, the columns names are all uppercase. We rename the ID column as follows:

rename!(pkdata, :CID => :ID)

The final step before our analysis data is ready is to add event columns. These indicate whether a row is an event or an observation. If the row specifies an event it needs to indicate which compartment the dose applies to. The EVID column indicates the type. If it is 0 the row is an observation and if it is 1 the row in a dosing event.

@rtransform!(pkdata, :EVID = :AMT == 0 ? 0 : 1)
@rtransform!(pkdata, :CMT = :AMT == 0 ? missing : Symbol("Depot"))

Instead of using the MDV functionality that would typically be used in NONMEM we set the CONC column to missing for event records.

@rtransform! pkdata :CONC = :EVID == 1 ? missing : :CONC

3 Mapping to a population

In Pumas, we take tabular datasets and convert them in to Populations. A Population is nothing but a collection of Subjects. Subjects can be constructed programmatically or they can be read from tabular data such as a DataFrame. To map from tabular data to a Population we use the read_pumas function. In this case, the population is read using the following code:

population = read_pumas(
    pkdata;
    id = :ID,
    time = :TIME,
    observations = [:CONC],
    evid = :EVID,
    amt = :AMT,
    addl = :ADDL,
    ii = :II,
    cmt = :CMT,
)
Population
  Subjects: 100
  Observations: CONC

Once created we see some summary information about how many subjects there are and what kind of information they have.

With a Population assigned to a variable, we can do a few things. For example, we can plot the observations against time using the following built-in plotting function.

observations_vs_time(population)

In the plot we see data and a loess fit to the data. It can be useful to get a general idea of what the data looks like.

4 Model code

We are now ready to define our model. For this study, we have oral administration data and we expect there to be a significant distribution phase. This means that we will need a Depot and a Peripheral compartment besides the Central compartment that will be used to predict concentrations that we can relate to data. Such a model has a closed form solution and in Pumas this specific model is called Depots1Central1Periph1 and has to be specified in the @dynamics part of the model code.

oral2cmt = @model begin
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @param begin
        θka  RealDomain(lower = 0.01)
        θcl  RealDomain(lower = 0.2)
        θvc  RealDomain(lower = 0.1)
        θvp  RealDomain(lower = 1.0)
        θq  RealDomain(lower = 0.1)
        Ω  PDiagDomain(4)
        σ_add  RealDomain(lower = 0.0)
        σ_prop  RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        Ka = θka
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Vp = θvp * exp(η[3])
        Q = θq * exp(η[4])
    end
    @dosecontrol begin
        bioav = (Depot = 0.5,)
    end
    @dynamics Depots1Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
    end
end
PumasModel
  Parameters: θka, θcl, θvc, θvp, θq, Ω, σ_add, σ_prop
  Random effects: η
  Covariates: 
  Dynamical system variables: Depot, Central, Peripheral
  Dynamical system type: Closed form
  Derived: CONC
  Observed: CONC

We note that a bioav value is specified in the @dosecontrol model-block. With data on orally administered drugs alone it is not possible to estimate the reduced bioavailability that so often characterizes these oral routes. If we cannot estimate it, we will only be able to estimate so-called apparent clearance (written CL/F), apparent volume of distribution (Vc/F), and so on. Alternatively, the bioavailability might be known from a previous study. Then we can simply include it in the model. This is done in the @dosecontrol model block. Here, we assume 50% bioavailability, but if we replace 0.5 with a parameter from the @param block (and restrict it to have a lower limit of 0 and an upper limit of 1) we can also estimate it from data under appropriate conditions and assumptions.

Besides the data and model we have to set initial parameters. This is done by specifying a NamedTuple using the syntax below.

initial_est_oral2cmt = (
    θcl = 1.0,
    θka = 0.25,
    θvc = 1.0,
    θvp = 10.0,
    θq = 1.5,
    Ω = Diagonal([0.09, 0.09, 0.09, 0.09]),
    σ_add = sqrt(10.0),
    σ_prop = sqrt(0.01),
)

5 Fit base model

To fit the model to the data we use the fit function. We pass the show_trace = false option on to the optimizer to suppress some of the information during the fitting.

oral2cmt_results = fit(
    oral2cmt,
    population,
    initial_est_oral2cmt,
    Pumas.FOCE();
    optim_options = (show_trace = false,),
)
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form

Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0

---------------------
           Estimate
---------------------
θka         0.31845
θcl         2.0376
θvc         5.0125
θvp        10.793
θq          1.9526
Ω₁,₁        0.086664
Ω₂,₂        0.17962
Ω₃,₃        0.089769
Ω₄,₄        0.18984
σ_add       0.032052
σ_prop      0.051134
---------------------

5.1 Understanding the output

From the final output we see several lines of information. We see the information about the likelihood approximation, dynamical model solution method and so on. An important line to know about is the Log-likelihood line.

    Log-likelihood value:                   -3988.0159

During fitting we maximize this value, and it can also be queried programmatically with the loglikelihood function.

loglikelihood(oral2cmt_results)
-3988.0158650003814
Comparison to NONMEM

In Case Study 1 we went into detail with how to compare log-likelihood values between Pumas and NONEMEM. We won’t go into too much detail here, but just remember that we can compare NONMEM output to the Pumas results by running

-2loglikelihood(oral2cmt_results)
7976.031730000763

and compare it to the OBJECTIVE FUNCTION VALUE WITH CONSTANT line in the lst file to verify if the fits match between the two pieces of software.

The main model parameters are the fixed effects that are shown in the beginning of the table.

θcl         2.0376
θka         0.31845
θvc         5.0125
θvp        10.793
θq          1.9526

We can also look at the estimated variances (often referred to as omegas or Ωs) of the random effects (often referred to as etas or ηs).

Ω₁,₁        0.086664
Ω₂,₂        0.17962
Ω₃,₃        0.089768
Ω₄,₄        0.18983

The subscripts refer to the position in the variance-covariance matrix Ω.

The last two parameters refer to the two components of the error model.

σ_add       2.9282
σ_prop      0.12113

Again, these are standard deviations which is why we had to square them in the error model specification.

CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model*σ_prop)^2))

This concludes looking at the results from the fit function call.

6 Create additional output and interpret it

To get access to diagnostic such as predictions and residuals we need to use the inspect function.

oral2cmt_inspect = inspect(oral2cmt_results)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
FittedPumasModelInspection

Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}(Optim.NewtonTrustRegion{Float64}(1.0, 100.0, 1.4901161193847656e-8, 0.1, 0.25, 0.75, false), Optim.Options(x_abstol = 0.0, x_reltol = 0.0, f_abstol = 0.0, f_reltol = 0.0, g_abstol = 1.0e-5, g_reltol = 1.0e-8, outer_x_abstol = 0.0, outer_x_reltol = 0.0, outer_f_abstol = 0.0, outer_f_reltol = 0.0, outer_g_abstol = 1.0e-8, outer_g_reltol = 1.0e-8, f_calls_limit = 0, g_calls_limit = 0, h_calls_limit = 0, allow_f_increases = false, allow_outer_f_increases = true, successive_f_tol = 1, iterations = 1000, outer_iterations = 1000, store_trace = false, trace_simplex = false, show_trace = false, extended_trace = false, show_every = 1, time_limit = NaN, )
)

With the variable oral2cmt_inspect in hand, we are now ready to use many of the built-in functions that can be used to diagnose the model fit. We typically start with the goodness_of_fit plot.

goodness_of_fit(oral2cmt_inspect, markercolor = (:gray, 0.3))

6.1 Subject fits

One useful plot is the subject_fits plot that combines data points with individual and population predictions.

subject_fits(oral2cmt_inspect)

We can also look at individual plots or groups of plots. By use of the paginate and separate keywords. If we activate the two by setting them to true we will get a vector of plots back, and we can use indexing with square brackets to get the one we want to see.

subfits = subject_fits(
    oral2cmt_inspect;
    separate = true,
    paginate = true,
    facets = (combinelabels = true,),
)
subfits[1]

As noticed in the plot above, there can be jagged motion of the predicted concentration during the days when only one or 2 concentrations are measured. In such cases, Pumas provide easy ways to generate rich predictions.

indpreds = [
    predict(
        oral2cmt_results.model,
        oral2cmt_results.data[i],
        oral2cmt_results.param;
        obstimes = minimum(oral2cmt_results.data[i].time):0.5:maximum(
            oral2cmt_results.data[i].time,
        ),
    ) for i = 1:length(oral2cmt_results.data)
]
Prediction
  Subjects: 100
  Predictions: CONC

The code above performs a rich prediction using the final parameter estimates from the model by predicting concentrations at every 0.5 hours from the minimum to maximum time for each subject. This code is a called as array comprehension or simply put, a for loop.

Understanding Array Comprehension

Purpose of the Code

The code aims to generate individual predictions for each subject in a dataset.

Understanding Array Comprehensions

Array comprehensions are a concise way to create arrays in Julia. They combine the power of loops and conditional statements into a single, elegant expression.

Think of it like a recipe:

  1. Ingredients: You have the data from each subject (oral2cmt_results.data[i]), the prediction model (oral2cmt_results.model), and the model parameters (oral2cmt_results.param).

  2. Instructions:

    • For each subject: Iterate through all subjects (indicated by i going from 1 to the number of subjects).
    • Get the data: Extract the data for the current subject (oral2cmt_results.data[i]).
    • Calculate observation times: Determine the range of time points for which to generate predictions (minimum(oral2cmt_results.data[i].time):0.5:maximum(oral2cmt_results.data[i].time)). This generates a sequence of time points starting from the earliest observation time for the subject, up to the latest observation time, with a step size of 0.5.
    • Predict: Use the model and parameters to generate predictions for these time points for the current subject (predict(...)).
    • Collect: Store these predictions in a new array (indpreds).

Code Breakdown

Let’s translate the code into plain English:

  1. indpreds = [ : This starts creating an empty array called indpreds, which will eventually hold the predictions for all subjects.

  2. [... for i = 1:length(oral2cmt_results.data)] : This is the array comprehension. It tells Julia to do the following for each value of i, where i ranges from 1 to the total number of subjects:

    • predict( ... ): Generate predictions using the model, data for subject i, and the parameters. The obstimes argument specifies the time points for which we want predictions.
  3. ] : This closes the array comprehension, signaling that the predictions for all subjects have been calculated and stored in the indpreds array.

Key Points

  • The oral2cmt_results object comes from fitting a pharmacokinetic model.
  • The predict function takes the model, data for a specific individual, and parameters, and returns model predictions at the requested time points (obstimes).
  • The array comprehension elegantly replaces a for loop to generate individual predictions efficiently.

Alternative Using a Loop

If you’re more comfortable with loops, here’s the equivalent code:

indpreds = []
for i = 1:length(oral2cmt_results.data)
    subject_data = oral2cmt_results.data[i]
    obstimes = minimum(subject_data.time):0.5:maximum(subject_data.time)
    pred = predict(oral2cmt_results.model, subject_data, oral2cmt_results.param; obstimes)
    push!(indpreds, pred)
end
indpreds_sf = subject_fits(
    indpreds,
    figurelegend = (position = :t, alignmode = Outside(), orientation = :horizontal),
    paginate = true,
    separate = true,
    limit = 9,
    rows = 3,
    columns = 3,
    ipred_linewidth = 6,
    pred_linewidth = 6,
    markersize = 16,
    facet = (combinelabels = true,),
    figure = (resolution = (1400, 1000), fontsize = 28),
    axis = (
        xticks = 0:12:1600,
        xticklabelrotation = π / 2,
        ylabel = "Concentration",
        xlabel = "Time (hours)",
    ),
)
indpreds_sf[1]

The plot generated here is smoother compared to using the original ipreds from the model that are at the observed time only and appear like a true pharmacokinetic muptiple dose profile.

7 Inference

To calculate standard errors we use the infer function.

infer_covar = infer(oral2cmt_results)
infer_table = coeftable(infer_covar)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
11×5 DataFrame
Row parameter estimate se ci_lower ci_upper
String Float64 Float64 Float64 Float64
1 θka 0.318446 0.0138923 0.291218 0.345674
2 θcl 2.03758 0.0600642 1.91985 2.1553
3 θvc 5.01246 0.261288 4.50034 5.52457
4 θvp 10.7933 0.328853 10.1488 11.4379
5 θq 1.9526 0.104232 1.74831 2.15689
6 Ω₁,₁ 0.086664 0.01206 0.0630269 0.110301
7 Ω₂,₂ 0.179617 0.0239031 0.132768 0.226467
8 Ω₃,₃ 0.0897686 0.0143274 0.0616874 0.11785
9 Ω₄,₄ 0.189837 0.0246662 0.141493 0.238182
10 σ_add 0.0320518 0.0144757 0.00367996 0.0604236
11 σ_prop 0.051134 0.000737659 0.0496882 0.0525798

To calculate RSEs we can simply create a new column in the DataFrame.

infer_table.RSE = infer_table.se ./ infer_table.estimate .* 100
infer_table
11×6 DataFrame
Row parameter estimate se ci_lower ci_upper RSE
String Float64 Float64 Float64 Float64 Float64
1 θka 0.318446 0.0138923 0.291218 0.345674 4.36252
2 θcl 2.03758 0.0600642 1.91985 2.1553 2.94783
3 θvc 5.01246 0.261288 4.50034 5.52457 5.21278
4 θvp 10.7933 0.328853 10.1488 11.4379 3.04682
5 θq 1.9526 0.104232 1.74831 2.15689 5.33812
6 Ω₁,₁ 0.086664 0.01206 0.0630269 0.110301 13.9158
7 Ω₂,₂ 0.179617 0.0239031 0.132768 0.226467 13.3078
8 Ω₃,₃ 0.0897686 0.0143274 0.0616874 0.11785 15.9604
9 Ω₄,₄ 0.189837 0.0246662 0.141493 0.238182 12.9933
10 σ_add 0.0320518 0.0144757 0.00367996 0.0604236 45.1634
11 σ_prop 0.051134 0.000737659 0.0496882 0.0525798 1.4426

This concludes Case Study II as presented in the original material.

8 Bonus material

Even if Case Study II has been covered, we would like to take this opportunity to show more features that could be useful after fitting the model to data.

8.1 Visual Predictive Checks

Recently, it has become quite popular to asses model quality using the so-called Visual Predictive Checks (VPCs). These are easily simulated and plotted in Pumas using the vpc and vpc_plot functions.

oral2cmt_vpc = vpc(oral2cmt_results; samples = 100)
vpc_plot(oral2cmt_vpc)
[ Info: Continuous VPC

From the plot, we see that the simulated profiles fall well within the intervals defined by the data profiles.

On the other hand, if we would like to view this more as a oral dose pharmacokinetic concentration time profile, then we use tad, time after dose as a x-axis variable. Here is corresponding plot.

oral2cmt_vpc_tad = vpc(oral2cmt_results; covariates = [:tad], samples = 100)
vpc_plot(oral2cmt_vpc_tad)
[ Info: Continuous VPC

9 Inference

Beyond the asymptotic variance-covariance based inference that we presented above, it is also possible to use the Bootstrap and SIR features available in the infer function of Pumas.

bts = infer(oral2cmt_results, Bootstrap(samples = 100))
┌ Info: Bootstrap inference finished.
│   Total resampled fits = 100
│   Success rate = 1.0
└   Unique resampled populations = 100
Bootstrap inference results

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form

Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0

---------------------------------------------------------------------
          Estimate           SE                       95.0% C.I.
---------------------------------------------------------------------
θka        0.31845         0.014065          [ 0.29369  ;  0.3416  ]
θcl        2.0376          0.059841          [ 1.9242   ;  2.1398  ]
θvc        5.0125          0.23491           [ 4.6328   ;  5.4722  ]
θvp       10.793           0.32798           [10.235    ; 11.41    ]
θq         1.9526          0.087475          [ 1.7662   ;  2.1032  ]
Ω₁,₁       0.086664        0.011905          [ 0.065655 ;  0.11276 ]
Ω₂,₂       0.17962         0.023925          [ 0.13531  ;  0.21942 ]
Ω₃,₃       0.089769        0.01316           [ 0.066336 ;  0.11496 ]
Ω₄,₄       0.18984         0.026307          [ 0.14528  ;  0.24592 ]
σ_add      0.032052        0.016788          [ 2.1802e-5;  0.064447]
σ_prop     0.051134        0.00070352        [ 0.04985  ;  0.052361]
---------------------------------------------------------------------
Successful fits: 100 out of 100
Unique resampled populations: 100
No stratification.
sir = infer(oral2cmt_results, SIR(samples = 100, resamples = 20))
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Running SIR.
[ Info: Resampling.
Simulated inference results

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form

Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0

---------------------------------------------------------------------
          Estimate           SE                       95.0% C.I.
---------------------------------------------------------------------
θka        0.31845         0.010716          [ 0.3028   ;  0.33982 ]
θcl        2.0376          0.087323          [ 1.9274   ;  2.214   ]
θvc        5.0125          0.27219           [ 4.6728   ;  5.5924  ]
θvp       10.793           0.348             [10.444    ; 11.636   ]
θq         1.9526          0.10503           [ 1.8308   ;  2.1898  ]
Ω₁,₁       0.086664        0.012274          [ 0.070114 ;  0.11401 ]
Ω₂,₂       0.17962         0.023977          [ 0.148    ;  0.22019 ]
Ω₃,₃       0.089769        0.0099616         [ 0.075186 ;  0.10911 ]
Ω₄,₄       0.18984         0.025483          [ 0.15419  ;  0.24804 ]
σ_add      0.032052        0.013152          [ 0.0075296;  0.052923]
σ_prop     0.051134        0.00077223        [ 0.049906 ;  0.052347]
---------------------------------------------------------------------

We go into more details about different inference methods in the documentation and in other tutorials.

10 Simulation

A very important part of the usual pharmacometrics workflow is to simulate trajectories. In the following, we can see how to simulate from our model, given the subjects we had in the dataset, the estimated parameters, and the empirical bayes estimates as the values for our random effects.

oral_2cmt_sims =
    simobs(oral2cmt, population, coef(oral2cmt_results), empirical_bayes(oral2cmt_results))

p1 = sim_plot(oral2cmt, oral_2cmt_sims[8:8]; observations = [:CONC])

We plot the simulated data for subject number 8.

10.1 Simulating alternative doses

Simulation is often used to compare some baseline treatment to a new, potentially better, treatment, dose or dosage regimen. There are many ways to carry out that exercise. Below, we will take the original dataset we had and replace the event information with a new proposed dosage regimen. Instead of three doses a day we propose two doses a day. We keep the drug amount the same by increasing the dose to 750 mg. Then the total dose amount per day stays at 1500 mg. To keep the treatment going for a total of five days, we need to change addl to 9 (for a total of 10 doses), set ii to 12 and amt to 750.

10.2 Using ordinary differential equation notation for dynamics

So far, we have used the closed for solution interface. Now, we want to characterize the AUC and Cmax of the old and new dosage regimen and we would like to add another compartment to integrate the area under the curve (AUC) for the Central compartment.

oral2cmt_ode = @model begin
    @param begin
        θka  RealDomain(lower = 0.01)
        θcl  RealDomain(lower = 0.2)
        θvc  RealDomain(lower = 0.1)
        θvp  RealDomain(lower = 1.0)
        θq  RealDomain(lower = 0.1)
        Ω  PDiagDomain(4)
        σ_add  RealDomain(lower = 0.0)
        σ_prop  RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        Ka = θka
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Vp = θvp * exp(η[3])
        Q = θq * exp(η[4])
    end
    @dosecontrol begin
        bioav = (Depot = 0.5,)
    end
    @dynamics begin
        Depot' = -Ka * Depot
        Central' = Ka * Depot - (CL + Q) / Vc * Central + Q / Vp * Periph
        Periph' = Q / Vc * Central - Q / Vp * Periph
        AUC_Central' = Central
    end
    @derived begin
        conc_model = @. Central / Vc
        _t = t
    end
end
PumasModel
  Parameters: θka, θcl, θvc, θvp, θq, Ω, σ_add, σ_prop
  Random effects: η
  Covariates: 
  Dynamical system variables: Depot, Central, Periph, AUC_Central
  Dynamical system type: Matrix exponential
  Derived: conc_model, _t
  Observed: conc_model, _t

The notation should be straight forward for anyone with knowledge of differential equations. The time derivatives are on the right-hand sides, and the name of the corresponding dynamical variable is inferred from the X' expression on the left-hand side in @dynamics.

10.3 Data munging

new_dr = DosageRegimen(750, addl = 9, ii = 12)
new_population = Subject.(population, events = new_dr)
Population
  Subjects: 100
  Observations: CONC

The second line uses the . to take each subject in population and construct a new similar Subject using the Subject constructor with keyword specification events = new_dr that means to replace the existing events with the content of new_dr. It is equivalent to define a function f(sub) = Subject(sub, events = new_dr) and apply this function to each element of population.

### Original doses simulations
oral_2cmt_original_sims = simobs(
    oral2cmt_ode,
    population,
    coef(oral2cmt_results),
    empirical_bayes(oral2cmt_results),
)
simulated_original_df = DataFrame(oral_2cmt_original_sims)

### Counter factual simulation
oral_2cmt_new_sims = simobs(
    oral2cmt_ode,
    new_population,
    coef(oral2cmt_results),
    empirical_bayes(oral2cmt_results),
)
simulated_new_df = DataFrame(oral_2cmt_new_sims)
first(simulated_new_df, 10)
10×26 DataFrame
Row id time conc_model _t evid bioav_Depot amt cmt rate duration ss ii route η_1 η_2 η_3 η_4 Depot Central Periph AUC_Central Ka CL Vc Vp Q
String? Float64 Float64? Float64? Int64? Float64? Float64? Symbol? Float64? Float64? Int8? Float64? NCA.Route? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64? Float64?
1 1 0.0 missing missing 1 0.5 750.0 Depot 0.0 0.0 0 0.0 NullRoute -0.097715 0.291929 -0.335907 -0.278312 missing missing missing missing 0.318446 1.84789 6.71172 7.71388 1.47824
2 1 0.5 7.27344 0.5 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 319.802 48.8173 2.78412 13.0619 0.318446 1.84789 6.71172 7.71388 1.47824
3 1 1.0 11.9451 1.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 272.729 80.1721 9.45641 45.9175 0.318446 1.84789 6.71172 7.71388 1.47824
4 1 3.0 17.0629 3.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 144.256 114.521 45.443 257.08 0.318446 1.84789 6.71172 7.71388 1.47824
5 1 4.0 16.1647 4.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 104.914 108.493 59.9594 369.143 0.318446 1.84789 6.71172 7.71388 1.47824
6 1 6.0 13.0021 6.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 55.4926 87.2665 76.5441 565.506 0.318446 1.84789 6.71172 7.71388 1.47824
7 1 8.0 10.0492 8.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 29.3519 67.4475 80.1127 719.474 0.318446 1.84789 6.71172 7.71388 1.47824
8 1 8.5 9.42237 8.5 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 25.0315 63.2403 79.6486 752.133 0.318446 1.84789 6.71172 7.71388 1.47824
9 1 12.0 missing missing 1 0.5 750.0 Depot 0.0 0.0 0 0.0 NullRoute -0.097715 0.291929 -0.335907 -0.278312 missing missing missing missing 0.318446 1.84789 6.71172 7.71388 1.47824
10 1 16.0 20.1999 16.0 0 missing missing missing missing missing missing missing missing -0.097715 0.291929 -0.335907 -0.278312 107.212 135.576 112.071 1435.19 0.318446 1.84789 6.71172 7.71388 1.47824

10.3.1 Cmax

First, let’s compare Cmax in the last interval.

cmax_original_df = combine(
    groupby(filter(x -> x.time >= 96 && !ismissing(x.Central), simulated_original_df), :id),
    :Central => maximum => :Cmax_original,
)
cmax_new_df = combine(
    groupby(filter(x -> x.time >= 96 && !ismissing(x.Central), simulated_new_df), :id),
    :Central => maximum => :Cmax_new,
)
Cmax_df = leftjoin(cmax_original_df, cmax_new_df; on = :id)
first(Cmax_df, 10)
10×3 DataFrame
Row id Cmax_original Cmax_new
String? Float64 Float64?
1 1 107.964 149.283
2 2 89.0458 108.73
3 3 82.3627 113.703
4 4 111.827 123.393
5 5 27.8576 36.6473
6 6 39.4589 48.004
7 7 206.529 270.723
8 8 79.6981 108.049
9 9 111.119 139.375
10 10 50.1407 58.4852

If our safety requirement is that Cmax does not exceed 200 we can

filter(x -> x.Cmax_original >= 200 || x.Cmax_new >= 200, Cmax_df)
8×3 DataFrame
Row id Cmax_original Cmax_new
String? Float64 Float64?
1 7 206.529 270.723
2 12 153.665 200.432
3 17 180.056 237.633
4 28 148.536 201.92
5 41 209.024 244.216
6 63 154.063 204.846
7 70 180.812 216.82
8 94 171.96 214.529

And we can see that with the new regimen more subjects are expected to be exposed to concentrations above the safe limit. This is of course because we give fewer but higher doses.

10.3.2 AUC

Second, we can use the fact that we integrated Central in the model and stored in the AUC_Central to calculate the AUC in any interval. We are curious about the interval (96,144) hours, so let’s filter on those endpoint values.

# First we select relevant columns
simulated_original_df = @select(simulated_original_df, :id, :evid, :time, :AUC_Central)
# Then we subset the two time points to calculate AUC between for the original dose
auc_original_df = @rsubset(simulated_original_df, :evid == 0, :time in [96, 144])

# Then we do the same for the new dose
simulated_new_df = @select(simulated_new_df, :id, :evid, :time, :AUC_Central)
auc_new_df = @rsubset(simulated_new_df, :evid == 0, :time in [96, 144])

# Then we combine them into one and use `diff` to calculate the difference in AUC
# at time point 144 and at time point 96.
combined_auc = leftjoin(
    combine(groupby(auc_original_df, :id), :AUC_Central => diff => :AUC_last_original),
    combine(groupby(auc_new_df, :id), :AUC_Central => diff => :AUC_last_new);
    on = :id,
)
100×3 DataFrame
75 rows omitted
Row id AUC_last_original AUC_last_new
String? Float64 Float64?
1 1 3445.92 3298.7
2 2 2635.48 2512.3
3 3 2389.58 2278.36
4 4 3722.53 3592.82
5 5 462.117 450.328
6 6 870.378 842.443
7 7 7463.62 7191.57
8 8 2417.45 2326.6
9 9 3531.23 3378.51
10 10 1275.9 1236.08
11 11 1300.62 1257.44
12 12 5016.92 4773.23
13 13 1872.04 1801.47
89 89 2046.95 1964.48
90 90 3601.37 3444.69
91 91 1661.34 1602.12
92 92 911.015 884.66
93 93 1313.81 1259.67
94 94 6142.73 5927.68
95 95 2670.72 2541.62
96 96 2885.86 2742.88
97 97 2207.89 2133.52
98 98 1946.24 1861.81
99 99 2610.73 2500.71
100 100 2377.06 2296.18

And we can draw the histogram of the difference between the new dosage regimen and the old one.

AlgebraOfGraphics

For an intro tutorial that describes the use of AlgebraOfGraphics.jl to plot from DataFrames please see this tutorial.

using AlgebraOfGraphics
combined_auc.auc_diff = combined_auc.AUC_last_new .- combined_auc.AUC_last_original
plot = data(combined_auc) * mapping(:auc_diff) * histogram()
draw(plot)

We see that AUCs are smaller for the new regimen. Overall this means that safety is potentially compromised and exposure as measured by AUC is lower given the proposed regimen.

11 Conclusion

In this tutorial, we saw how to go from map from data to a Population, how to formulate a model of oral administration and fit it, and how to use built-in functionality to assess the quality of the formulated model. We also saw some advanced features such as bootstrap and SIR inference, VPC plots, and simulations.