using Pumas
using DataFramesMeta
using PumasUtilities
using CairoMakie 
Case Study II - Building a PopPK model with multiple doses
1 Second of three case studies
This is the second tutorial in a series of case studies found here. The second case study is not too different from the first: it has oral dosing, so a slightly more advanced model is needed. For this reason, we add some bonus material to this case study that is not found in the original. We show how to perform visual predictive checks, bootstrap and SIR inference, and how to simulate alternative dosing regimens.
2 Getting the data
Please download the data file for this tutorial and place it in the same location as this tutorial. The data file contains data from 100 individuals who were treated with a 500 mg dose every eight hours for five days. The dataset is loaded into memory as a DataFrame using CSV.read. We use the first(input, number_of_elements) function to show the first 10 rows of the DataFrame.
using CSV
pkdata = CSV.read("CS2_oralestaddl.csv", DataFrame; header = 5)
first(pkdata, 10)

| Row | CID | TIME | CONC | AMT | ADDL | II | MDV |
|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 1 | 0.0 | 0.0 | 500 | 14 | 8 | 1 | 
| 2 | 1 | 0.5 | 4.5799 | 0 | 0 | 0 | 0 | 
| 3 | 1 | 1.0 | 8.5717 | 0 | 0 | 0 | 0 | 
| 4 | 1 | 3.0 | 10.979 | 0 | 0 | 0 | 0 | 
| 5 | 1 | 4.0 | 11.347 | 0 | 0 | 0 | 0 | 
| 6 | 1 | 6.0 | 8.7705 | 0 | 0 | 0 | 0 | 
| 7 | 1 | 8.0 | 7.0155 | 0 | 0 | 0 | 0 | 
| 8 | 1 | 8.5 | 10.889 | 0 | 0 | 0 | 0 | 
| 9 | 1 | 16.0 | 9.7015 | 0 | 0 | 0 | 0 | 
| 10 | 1 | 16.5 | 14.115 | 0 | 0 | 0 | 0 | 
The file contains the following columns:
- CID: subject identification number.
- TIME: Blood sampling time. Needs to be monotonically increasing.
- CONC: The dependent variable. In this case, a plasma concentration.
- AMT: Amount of drug administered at dosing time in the TIME column or zero for observation rows
- DOSE: Dose group. Included for post-processing and asset generation (tables and plots).
- ADDL: Number of additional doses. The value of the column is 14 in event rows for a total of 15 doses (3 a day for 5 days).
- II: the inter-dose interval. The value is 8 (Hr) in event rows.
- MDV: Missing dependent variable. Can be used to flag observations that should be ignored.
The names in the dataset reflect that it originates from a NONMEM case study. There are several conventions that are historical in nature and relate to the functionality of NONMEM. For example, the column names are all uppercase. We rename the ID column as follows:
rename!(pkdata, :CID => :ID)

The final step before our analysis data is ready is to add event columns. These indicate whether a row is an event or an observation. If the row specifies an event, it needs to indicate which compartment the dose applies to. The EVID column indicates the type: if it is 0, the row is an observation, and if it is 1, the row is a dosing event.
@rtransform!(pkdata, :EVID = :AMT == 0 ? 0 : 1)
@rtransform!(pkdata, :CMT = :AMT == 0 ? missing : Symbol("Depot"))

Instead of using the MDV functionality that would typically be used in NONMEM, we set the CONC column to missing for event records.
@rtransform! pkdata :CONC = :EVID == 1 ? missing : :CONC

3 Mapping to a population
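The event-flagging logic above can be illustrated in plain Julia, independent of DataFramesMeta (the function names here are purely illustrative):

```julia
# EVID: 0 for an observation row (AMT == 0), 1 for a dosing event.
evid_flag(amt) = amt == 0 ? 0 : 1

# CONC is kept for observations and set to missing for dose events.
conc_value(evid, conc) = evid == 1 ? missing : conc

evid_flag(500)       # 1: a dosing row
evid_flag(0)         # 0: an observation row
conc_value(1, 0.0)   # missing: event records carry no observation
```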
In Pumas, we take tabular datasets and convert them into Populations. A Population is nothing but a collection of Subjects. Subjects can be constructed programmatically, or they can be read from tabular data such as a DataFrame. To map from tabular data to a Population we use the read_pumas function. In this case, the population is read using the following code:
population = read_pumas(
    pkdata;
    id = :ID,
    time = :TIME,
    observations = [:CONC],
    evid = :EVID,
    amt = :AMT,
    addl = :ADDL,
    ii = :II,
    cmt = :CMT,
)

Population
  Subjects: 100
  Observations: CONC

Once created, we see some summary information about how many subjects there are and what kind of information they have.
With a Population assigned to a variable, we can do a few things. For example, we can plot the observations against time using the following built-in plotting function.
observations_vs_time(population)

In the plot we see the data and a loess fit to the data. It can be useful for getting a general idea of what the data look like.
4 Model code
We are now ready to define our model. For this study, we have oral administration data and we expect a significant distribution phase. This means that we will need a Depot and a Peripheral compartment besides the Central compartment that will be used to predict the concentrations we relate to the data. Such a model has a closed-form solution, and in Pumas this specific model is called Depots1Central1Periph1; it is specified in the @dynamics part of the model code.
oral2cmt = @model begin
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @param begin
        θka ∈ RealDomain(lower = 0.01)
        θcl ∈ RealDomain(lower = 0.2)
        θvc ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        Ω ∈ PDiagDomain(4)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        Ka = θka
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Vp = θvp * exp(η[3])
        Q = θq * exp(η[4])
    end
    @dosecontrol begin
        bioav = (Depot = 0.5,)
    end
    @dynamics Depots1Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))
    end
end

PumasModel
  Parameters: θka, θcl, θvc, θvp, θq, Ω, σ_add, σ_prop
  Random effects: η
  Covariates: 
  Dynamical system variables: Depot, Central, Peripheral
  Dynamical system type: Closed form
  Derived: CONC
  Observed: CONC

We note that a bioav value is specified in the @dosecontrol model block. With data on orally administered drug alone, it is not possible to estimate the reduced bioavailability that so often characterizes the oral route. If we cannot estimate it, we will only be able to estimate so-called apparent clearance (written CL/F), apparent volume of distribution (Vc/F), and so on. Alternatively, the bioavailability might be known from a previous study; then we can simply include it in the model. This is done in the @dosecontrol model block. Here, we assume 50% bioavailability, but if we replace 0.5 with a parameter from the @param block (restricted to a lower limit of 0 and an upper limit of 1) we can also estimate it from data under appropriate conditions and assumptions.
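As a sketch of that alternative (θbioav is a hypothetical parameter name, not part of the fitted model above, and would only be identifiable under suitable data such as combined IV and oral dosing), the relevant model blocks would change roughly as follows:

```julia
# Hypothetical sketch: estimate bioavailability instead of fixing it at 0.5.
# θbioav is an assumed parameter name; the bounds keep it in the unit interval.
@param begin
    # ... the existing θ, Ω, and σ parameters ...
    θbioav ∈ RealDomain(lower = 0.0, upper = 1.0)
end
@dosecontrol begin
    bioav = (Depot = θbioav,)
end
```

These fragments are not standalone code; they would sit inside the @model begin ... end block.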
Besides the data and model we have to set initial parameters. This is done by specifying a NamedTuple using the syntax below.
initial_est_oral2cmt = (
    θcl = 1.0,
    θka = 0.25,
    θvc = 1.0,
    θvp = 10.0,
    θq = 1.5,
    Ω = Diagonal([0.09, 0.09, 0.09, 0.09]),
    σ_add = sqrt(10.0),
    σ_prop = sqrt(0.01),
)

5 Fit base model
To fit the model to the data we use the fit function. We pass the show_trace = false option on to the optimizer to suppress some of the information during the fitting.
oral2cmt_results = fit(
    oral2cmt,
    population,
    initial_est_oral2cmt,
    Pumas.FOCE();
    optim_options = (show_trace = false,),
)

[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.

FittedPumasModel
Successful minimization:                      true
Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form
Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0
---------------------
           Estimate
---------------------
θka         0.31845
θcl         2.0376
θvc         5.0125
θvp        10.793
θq          1.9526
Ω₁,₁        0.086664
Ω₂,₂        0.17962
Ω₃,₃        0.089769
Ω₄,₄        0.18984
σ_add       0.032052
σ_prop      0.051134
---------------------

5.1 Understanding the output
From the final output we see several lines of information. We see the information about the likelihood approximation, dynamical model solution method and so on. An important line to know about is the Log-likelihood line.
    Log-likelihood value:                   -3988.0159

During fitting we maximize this value, and it can also be queried programmatically with the loglikelihood function.

loglikelihood(oral2cmt_results)

-3988.0158650003814

In Case Study 1 we went into detail on how to compare log-likelihood values between Pumas and NONMEM. We won't go into too much detail here, but just remember that we can compare NONMEM output to the Pumas results by running

-2loglikelihood(oral2cmt_results)

7976.031730000763

and comparing it to the OBJECTIVE FUNCTION VALUE WITH CONSTANT line in the lst file to verify that the fits match between the two pieces of software.
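The conversion is just a sign flip and a factor of two; as a plain-Julia check using the log-likelihood value printed above:

```julia
# The NONMEM-style objective function value (the "WITH CONSTANT" variant)
# is -2 times the log-likelihood reported by Pumas.
ll = -3988.0158650003814   # value returned by loglikelihood(oral2cmt_results)
ofv = -2 * ll              # the value to compare against the lst file
```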
The main model parameters are the fixed effects shown at the beginning of the table.

θka         0.31845
θcl         2.0376
θvc         5.0125
θvp        10.793
θq          1.9526

We can also look at the estimated variances (often referred to as omegas or Ωs) of the random effects (often referred to as etas or ηs).
Ω₁,₁        0.086664
Ω₂,₂        0.17962
Ω₃,₃        0.089769
Ω₄,₄        0.18984

The subscripts refer to the position in the variance-covariance matrix Ω.
The last two parameters refer to the two components of the error model.
σ_add       0.032052
σ_prop      0.051134

Again, these are standard deviations, which is why we squared them in the error model specification.

CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model * σ_prop)^2))

This concludes looking at the results from the fit function call.
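To see how the two components combine, here is a small plain-Julia sketch of the error-model standard deviation, with the σ values copied from the fit output above:

```julia
# Combined additive + proportional error model:
# sd(conc) = sqrt(σ_add² + (conc · σ_prop)²)
combined_sd(conc; σ_add = 0.032052, σ_prop = 0.051134) =
    sqrt(σ_add^2 + (conc * σ_prop)^2)

combined_sd(0.0)    # the additive floor, σ_add
combined_sd(10.0)   # dominated by the proportional term at higher concentrations
```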
6 Create additional output and interpret it
To get access to diagnostics such as predictions and residuals, we need to use the inspect function.
oral2cmt_inspect = inspect(oral2cmt_results)

[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.

FittedPumasModelInspection
Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}(Optim.NewtonTrustRegion{Float64}(1.0, 100.0, 1.4901161193847656e-8, 0.1, 0.25, 0.75, false), Optim.Options(x_abstol = 0.0, x_reltol = 0.0, f_abstol = 0.0, f_reltol = 0.0, g_abstol = 1.0e-5, g_reltol = 1.0e-8, outer_x_abstol = 0.0, outer_x_reltol = 0.0, outer_f_abstol = 0.0, outer_f_reltol = 0.0, outer_g_abstol = 1.0e-8, outer_g_reltol = 1.0e-8, f_calls_limit = 0, g_calls_limit = 0, h_calls_limit = 0, allow_f_increases = false, allow_outer_f_increases = true, successive_f_tol = 1, iterations = 1000, outer_iterations = 1000, store_trace = false, trace_simplex = false, show_trace = false, extended_trace = false, show_every = 1, time_limit = NaN, )
)

With the variable oral2cmt_inspect in hand, we are now ready to use many of the built-in functions for diagnosing the model fit. We typically start with the goodness_of_fit plot.
goodness_of_fit(oral2cmt_inspect, markercolor = (:gray, 0.3))

6.1 Subject fits
One useful plot is the subject_fits plot that combines data points with individual and population predictions.
subject_fits(oral2cmt_inspect)

We can also look at individual plots or groups of plots by use of the paginate and separate keywords. If we activate the two by setting them to true, we get a vector of plots back, and we can use indexing with square brackets to get the one we want to see.
subfits = subject_fits(
    oral2cmt_inspect;
    separate = true,
    paginate = true,
    facets = (combinelabels = true,),
)
subfits[1]

As noticed in the plot above, the predicted concentration can look jagged during the days when only one or two concentrations are measured. In such cases, Pumas provides easy ways to generate rich predictions.
indpreds = [
    predict(
        oral2cmt_results.model,
        oral2cmt_results.data[i],
        oral2cmt_results.param;
        obstimes = minimum(oral2cmt_results.data[i].time):0.5:maximum(
            oral2cmt_results.data[i].time,
        ),
    ) for i = 1:length(oral2cmt_results.data)
]

Prediction
  Subjects: 100
  Predictions: CONC

The code above performs a rich prediction using the final parameter estimates from the model by predicting concentrations every 0.5 hours from the minimum to the maximum time for each subject. This code is called an array comprehension; simply put, a compact for loop.
Purpose of the Code
The code aims to generate individual predictions for each subject in a dataset.
Understanding Array Comprehensions
Array comprehensions are a concise way to create arrays in Julia. They combine the power of loops and conditional statements into a single, elegant expression.
Think of it like a recipe:
- Ingredients: You have the data from each subject (oral2cmt_results.data[i]), the prediction model (oral2cmt_results.model), and the model parameters (oral2cmt_results.param).
- Instructions:
  - For each subject: Iterate through all subjects (indicated by i going from 1 to the number of subjects).
  - Get the data: Extract the data for the current subject (oral2cmt_results.data[i]).
  - Calculate observation times: Determine the range of time points for which to generate predictions (minimum(oral2cmt_results.data[i].time):0.5:maximum(oral2cmt_results.data[i].time)). This generates a sequence of time points starting from the earliest observation time for the subject, up to the latest observation time, with a step size of 0.5.
  - Predict: Use the model and parameters to generate predictions at these time points for the current subject (predict(...)).
  - Collect: Store these predictions in a new array (indpreds).
Code Breakdown
Let’s translate the code into plain English:
- indpreds = [: This starts building an array called indpreds, which will eventually hold the predictions for all subjects.
- [... for i = 1:length(oral2cmt_results.data)]: This is the array comprehension. It tells Julia to do the following for each value of i, where i ranges from 1 to the total number of subjects:
  - predict(...): Generate predictions using the model, the data for subject i, and the parameters. The obstimes argument specifies the time points for which we want predictions.
- ]: This closes the array comprehension, signaling that the predictions for all subjects have been calculated and stored in the indpreds array.
Key Points
- The oral2cmt_results object comes from fitting a pharmacokinetic model.
- The predict function takes the model, the data for a specific individual, and the parameters, and returns model predictions at the requested time points (obstimes).
- The array comprehension elegantly replaces a for loop to generate individual predictions efficiently.
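The claim that a comprehension replaces a for loop can be checked with a toy example in plain Julia (unrelated to the model objects):

```julia
# Comprehension form: one expression builds the whole vector.
squares = [i^2 for i = 1:5]

# Loop form: start with an empty vector and push! one element at a time.
squares_loop = Int[]
for i = 1:5
    push!(squares_loop, i^2)
end

squares == squares_loop   # both hold the same five squares
```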
Alternative Using a Loop
If you’re more comfortable with loops, here’s the equivalent code:
indpreds = []
for i = 1:length(oral2cmt_results.data)
    subject_data = oral2cmt_results.data[i]
    obstimes = minimum(subject_data.time):0.5:maximum(subject_data.time)
    pred = predict(oral2cmt_results.model, subject_data, oral2cmt_results.param; obstimes)
    push!(indpreds, pred)
end

indpreds_sf = subject_fits(
    indpreds,
    figurelegend = (position = :t, alignmode = Outside(), orientation = :horizontal),
    paginate = true,
    separate = true,
    limit = 9,
    rows = 3,
    columns = 3,
    ipred_linewidth = 6,
    pred_linewidth = 6,
    markersize = 16,
    facet = (combinelabels = true,),
    figure = (resolution = (1400, 1000), fontsize = 28),
    axis = (
        xticks = 0:12:1600,
        xticklabelrotation = π / 2,
        ylabel = "Concentration",
        xlabel = "Time (hours)",
    ),
)
indpreds_sf[1]

The plot generated here is smoother than one using the original ipreds from the model, which are at the observed times only, and it looks like a true multiple-dose pharmacokinetic profile.
7 Inference
To calculate standard errors we use the infer function.
infer_covar = infer(oral2cmt_results)
infer_table = coeftable(infer_covar)

[ Info: Calculating: variance-covariance matrix.
[ Info: Done.

| Row | parameter | estimate | se | ci_lower | ci_upper |
|---|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 | Float64 |
| 1 | θka | 0.318446 | 0.0138923 | 0.291218 | 0.345674 | 
| 2 | θcl | 2.03758 | 0.0600642 | 1.91985 | 2.1553 | 
| 3 | θvc | 5.01246 | 0.261288 | 4.50034 | 5.52457 | 
| 4 | θvp | 10.7933 | 0.328853 | 10.1488 | 11.4379 | 
| 5 | θq | 1.9526 | 0.104232 | 1.74831 | 2.15689 | 
| 6 | Ω₁,₁ | 0.086664 | 0.01206 | 0.0630269 | 0.110301 | 
| 7 | Ω₂,₂ | 0.179617 | 0.0239031 | 0.132768 | 0.226467 | 
| 8 | Ω₃,₃ | 0.0897686 | 0.0143274 | 0.0616874 | 0.11785 | 
| 9 | Ω₄,₄ | 0.189837 | 0.0246662 | 0.141493 | 0.238182 | 
| 10 | σ_add | 0.0320518 | 0.0144757 | 0.00367996 | 0.0604236 | 
| 11 | σ_prop | 0.051134 | 0.000737659 | 0.0496882 | 0.0525798 | 
To calculate RSEs we can simply create a new column in the DataFrame.
infer_table.RSE = infer_table.se ./ infer_table.estimate .* 100
infer_table

| Row | parameter | estimate | se | ci_lower | ci_upper | RSE |
|---|---|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 | Float64 | Float64 |
| 1 | θka | 0.318446 | 0.0138923 | 0.291218 | 0.345674 | 4.36252 | 
| 2 | θcl | 2.03758 | 0.0600642 | 1.91985 | 2.1553 | 2.94783 | 
| 3 | θvc | 5.01246 | 0.261288 | 4.50034 | 5.52457 | 5.21278 | 
| 4 | θvp | 10.7933 | 0.328853 | 10.1488 | 11.4379 | 3.04682 | 
| 5 | θq | 1.9526 | 0.104232 | 1.74831 | 2.15689 | 5.33812 | 
| 6 | Ω₁,₁ | 0.086664 | 0.01206 | 0.0630269 | 0.110301 | 13.9158 | 
| 7 | Ω₂,₂ | 0.179617 | 0.0239031 | 0.132768 | 0.226467 | 13.3078 | 
| 8 | Ω₃,₃ | 0.0897686 | 0.0143274 | 0.0616874 | 0.11785 | 15.9604 | 
| 9 | Ω₄,₄ | 0.189837 | 0.0246662 | 0.141493 | 0.238182 | 12.9933 | 
| 10 | σ_add | 0.0320518 | 0.0144757 | 0.00367996 | 0.0604236 | 45.1634 | 
| 11 | σ_prop | 0.051134 | 0.000737659 | 0.0496882 | 0.0525798 | 1.4426 | 
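As a plain-Julia sanity check of that formula on the first two rows of the table (θka and θcl; values copied from above):

```julia
# RSE (%) = 100 × standard error / estimate, computed elementwise
# with broadcasting, just like the DataFrame column above.
estimates = [0.318446, 2.03758]   # θka, θcl
ses = [0.0138923, 0.0600642]
rse = ses ./ estimates .* 100     # matches the RSE column
```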
This concludes Case Study II as presented in the original material.
8 Bonus material
Even if Case Study II has been covered, we would like to take this opportunity to show more features that could be useful after fitting the model to data.
8.1 Visual Predictive Checks
Recently, it has become quite popular to assess model quality using so-called Visual Predictive Checks (VPCs). These are easily simulated and plotted in Pumas using the vpc and vpc_plot functions.
oral2cmt_vpc = vpc(oral2cmt_results; samples = 100)
vpc_plot(oral2cmt_vpc)

[ Info: Continuous VPC

From the plot, we see that the observed data fall well within the intervals computed from the simulated profiles.
On the other hand, if we would like to view this more as an oral-dose pharmacokinetic concentration-time profile, we can use tad (time after dose) as the x-axis variable. Here is the corresponding plot.
oral2cmt_vpc_tad = vpc(oral2cmt_results; covariates = [:tad], samples = 100)
vpc_plot(oral2cmt_vpc_tad)

[ Info: Continuous VPC

9 Bootstrap and SIR inference
Beyond the asymptotic variance-covariance based inference that we presented above, it is also possible to use the Bootstrap and SIR features available in the infer function of Pumas.
bts = infer(oral2cmt_results, Bootstrap(samples = 100))

┌ Info: Bootstrap inference finished.
│   Total resampled fits = 100
│   Success rate = 1.0
└   Unique resampled populations = 100

Bootstrap inference results
Successful minimization:                      true
Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form
Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0
---------------------------------------------------------------------
          Estimate           SE                       95.0% C.I.
---------------------------------------------------------------------
θka        0.31845         0.013257          [ 0.29705  ;  0.34763 ]
θcl        2.0376          0.058192          [ 1.9322   ;  2.1455  ]
θvc        5.0125          0.27483           [ 4.5268   ;  5.6332  ]
θvp       10.793           0.35028           [10.129    ; 11.466   ]
θq         1.9526          0.095969          [ 1.7796   ;  2.1229  ]
Ω₁,₁       0.086664        0.011693          [ 0.064158 ;  0.10661 ]
Ω₂,₂       0.17962         0.026217          [ 0.13087  ;  0.22978 ]
Ω₃,₃       0.089769        0.013636          [ 0.064316 ;  0.11475 ]
Ω₄,₄       0.18984         0.02589           [ 0.13958  ;  0.24707 ]
σ_add      0.032052        0.014743          [ 7.5888e-5;  0.057053]
σ_prop     0.051134        0.00075976        [ 0.049491 ;  0.052738]
---------------------------------------------------------------------
Successful fits: 100 out of 100
Unique resampled populations: 100
No stratification.

sir = infer(oral2cmt_results, SIR(samples = 100, resamples = 20))

[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
[ Info: Running SIR.
[ Info: Resampling.

Simulated inference results
Successful minimization:                      true
Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:                 Closed form
Log-likelihood value:                   -3988.0159
Number of subjects:                            100
Number of parameters:         Fixed      Optimized
                                  0             11
Observation records:         Active        Missing
    CONC:                      4200              0
    Total:                     4200              0
---------------------------------------------------------------------
          Estimate           SE                       95.0% C.I.
---------------------------------------------------------------------
θka        0.31845         0.0079662         [ 0.30964  ;  0.33419 ]
θcl        2.0376          0.060482          [ 1.9453   ;  2.1318  ]
θvc        5.0125          0.2623            [ 4.6287   ;  5.6107  ]
θvp       10.793           0.35885           [10.366    ; 11.506   ]
θq         1.9526          0.11976           [ 1.8219   ;  2.1934  ]
Ω₁,₁       0.086664        0.012644          [ 0.074249 ;  0.1152  ]
Ω₂,₂       0.17962         0.026206          [ 0.13134  ;  0.22127 ]
Ω₃,₃       0.089769        0.014695          [ 0.073385 ;  0.12207 ]
Ω₄,₄       0.18984         0.030743          [ 0.14992  ;  0.2539  ]
σ_add      0.032052        0.013272          [ 0.0062733;  0.051136]
σ_prop     0.051134        0.00059637        [ 0.050087 ;  0.052105]
---------------------------------------------------------------------

We go into more detail about the different inference methods in the documentation and in other tutorials.
10 Simulation
A very important part of the usual pharmacometrics workflow is to simulate trajectories. In the following, we see how to simulate from our model, given the subjects we had in the dataset, the estimated parameters, and the empirical Bayes estimates as the values of our random effects.
oral_2cmt_sims =
    simobs(oral2cmt, population, coef(oral2cmt_results), empirical_bayes(oral2cmt_results))
p1 = sim_plot(oral2cmt, oral_2cmt_sims[8:8]; observations = [:CONC])

We plot the simulated data for subject number 8.
10.1 Simulating alternative doses
Simulation is often used to compare some baseline treatment to a new, potentially better, treatment, dose, or dosage regimen. There are many ways to carry out that exercise. Below, we take the original dataset and replace the event information with a new proposed dosage regimen. Instead of three doses a day we propose two doses a day. We keep the daily drug amount the same by increasing the dose to 750 mg, so the total dose amount per day stays at 1500 mg. To keep the treatment going for a total of five days, we need to change addl to 9 (for a total of 10 doses), set ii to 12, and set amt to 750.
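A quick plain-Julia arithmetic check confirms that the total and daily amounts match between the two regimens:

```julia
# Original regimen: 500 mg every 8 h, addl = 14 → 15 doses over 5 days.
original_total = 500 * (14 + 1)
# Proposed regimen: 750 mg every 12 h, addl = 9 → 10 doses over 5 days.
new_total = 750 * (9 + 1)
original_total == new_total   # both 7500 mg in total
# The daily dose is unchanged as well: 3 × 500 == 2 × 750 == 1500 mg.
```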
10.2 Using ordinary differential equation notation for dynamics
So far, we have used the closed-form solution interface. Now we want to characterize the AUC and Cmax of the old and new dosage regimens, so we add another compartment that integrates the Central compartment to track the area under the curve (AUC).
oral2cmt_ode = @model begin
    @param begin
        θka ∈ RealDomain(lower = 0.01)
        θcl ∈ RealDomain(lower = 0.2)
        θvc ∈ RealDomain(lower = 0.1)
        θvp ∈ RealDomain(lower = 1.0)
        θq ∈ RealDomain(lower = 0.1)
        Ω ∈ PDiagDomain(4)
        σ_add ∈ RealDomain(lower = 0.0)
        σ_prop ∈ RealDomain(lower = 0.0)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        Ka = θka
        CL = θcl * exp(η[1])
        Vc = θvc * exp(η[2])
        Vp = θvp * exp(η[3])
        Q = θq * exp(η[4])
    end
    @dosecontrol begin
        bioav = (Depot = 0.5,)
    end
    @dynamics begin
        Depot' = -Ka * Depot
        Central' = Ka * Depot - (CL + Q) / Vc * Central + Q / Vp * Periph
        Periph' = Q / Vc * Central - Q / Vp * Periph
        AUC_Central' = Central
    end
    @derived begin
        conc_model = @. Central / Vc
        _t = t
    end
end

PumasModel
  Parameters: θka, θcl, θvc, θvp, θq, Ω, σ_add, σ_prop
  Random effects: η
  Covariates: 
  Dynamical system variables: Depot, Central, Periph, AUC_Central
  Dynamical system type: Matrix exponential
  Derived: conc_model, _t
  Observed: conc_model, _t

The notation should be straightforward for anyone with knowledge of differential equations. The expressions for the time derivatives are on the right-hand sides, and the name of each dynamical variable is inferred from the X' expression on the left-hand side in @dynamics.
10.3 Data munging
new_dr = DosageRegimen(750, addl = 9, ii = 12)
new_population = Subject.(population, events = new_dr)

Population
  Subjects: 100
  Observations: CONC

The second line uses the . to take each subject in population and construct a new, similar Subject using the Subject constructor with the keyword specification events = new_dr, which replaces the existing events with the content of new_dr. It is equivalent to defining a function f(sub) = Subject(sub, events = new_dr) and applying it to each element of population.
# Original dose simulations
oral_2cmt_original_sims = simobs(
    oral2cmt_ode,
    population,
    coef(oral2cmt_results),
    empirical_bayes(oral2cmt_results),
)
simulated_original_df = DataFrame(oral_2cmt_original_sims)
# Counterfactual simulation
oral_2cmt_new_sims = simobs(
    oral2cmt_ode,
    new_population,
    coef(oral2cmt_results),
    empirical_bayes(oral2cmt_results),
)
simulated_new_df = DataFrame(oral_2cmt_new_sims)
first(simulated_new_df, 10)

| Row | id | time | conc_model | _t | evid | bioav_Depot | amt | cmt | rate | duration | ss | ii | route | η_1 | η_2 | η_3 | η_4 | Depot | Central | Periph | AUC_Central | Ka | CL | Vc | Vp | Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | String? | Float64 | Float64? | Float64? | Int64? | Float64? | Float64? | Symbol? | Float64? | Float64? | Int8? | Float64? | NCA.Route? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? |
| 1 | 1 | 0.0 | missing | missing | 1 | 0.5 | 750.0 | Depot | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.097715 | 0.291929 | -0.335907 | -0.278312 | missing | missing | missing | missing | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 2 | 1 | 0.5 | 7.27344 | 0.5 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 319.802 | 48.8173 | 2.78412 | 13.0619 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 3 | 1 | 1.0 | 11.9451 | 1.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 272.729 | 80.1721 | 9.45641 | 45.9175 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 4 | 1 | 3.0 | 17.0629 | 3.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 144.256 | 114.521 | 45.443 | 257.08 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 5 | 1 | 4.0 | 16.1647 | 4.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 104.914 | 108.493 | 59.9594 | 369.143 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 6 | 1 | 6.0 | 13.0021 | 6.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 55.4926 | 87.2665 | 76.5441 | 565.506 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 7 | 1 | 8.0 | 10.0492 | 8.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 29.3519 | 67.4475 | 80.1127 | 719.474 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 8 | 1 | 8.5 | 9.42237 | 8.5 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 25.0315 | 63.2403 | 79.6486 | 752.133 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 9 | 1 | 12.0 | missing | missing | 1 | 0.5 | 750.0 | Depot | 0.0 | 0.0 | 0 | 0.0 | NullRoute | -0.097715 | 0.291929 | -0.335907 | -0.278312 | missing | missing | missing | missing | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
| 10 | 1 | 16.0 | 20.1999 | 16.0 | 0 | missing | missing | missing | missing | missing | missing | missing | missing | -0.097715 | 0.291929 | -0.335907 | -0.278312 | 107.212 | 135.576 | 112.071 | 1435.19 | 0.318446 | 1.84789 | 6.71172 | 7.71388 | 1.47824 | 
10.3.1 Cmax
First, let’s compare Cmax in the last interval.
cmax_original_df = combine(
    groupby(filter(x -> x.time >= 96 && !ismissing(x.Central), simulated_original_df), :id),
    :Central => maximum => :Cmax_original,
)
cmax_new_df = combine(
    groupby(filter(x -> x.time >= 96 && !ismissing(x.Central), simulated_new_df), :id),
    :Central => maximum => :Cmax_new,
)
Cmax_df = leftjoin(cmax_original_df, cmax_new_df; on = :id)
first(Cmax_df, 10)

| Row | id | Cmax_original | Cmax_new |
|---|---|---|---|
| | String? | Float64 | Float64? |
| 1 | 1 | 107.964 | 149.283 | 
| 2 | 2 | 89.0458 | 108.73 | 
| 3 | 3 | 82.3627 | 113.703 | 
| 4 | 4 | 111.827 | 123.393 | 
| 5 | 5 | 27.8576 | 36.6473 | 
| 6 | 6 | 39.4589 | 48.004 | 
| 7 | 7 | 206.529 | 270.723 | 
| 8 | 8 | 79.6981 | 108.049 | 
| 9 | 9 | 111.119 | 139.375 | 
| 10 | 10 | 50.1407 | 58.4852 | 
If our safety requirement is that Cmax does not exceed 200, we can filter for the subjects that exceed the limit under either regimen:

filter(x -> x.Cmax_original >= 200 || x.Cmax_new >= 200, Cmax_df)

| Row | id | Cmax_original | Cmax_new |
|---|---|---|---|
| | String? | Float64 | Float64? |
| 1 | 7 | 206.529 | 270.723 | 
| 2 | 12 | 153.665 | 200.432 | 
| 3 | 17 | 180.056 | 237.633 | 
| 4 | 28 | 148.536 | 201.92 | 
| 5 | 41 | 209.024 | 244.216 | 
| 6 | 63 | 154.063 | 204.846 | 
| 7 | 70 | 180.812 | 216.82 | 
| 8 | 94 | 171.96 | 214.529 | 
And we can see that with the new regimen more subjects are expected to be exposed to concentrations above the safe limit. This is of course because we give fewer but higher doses.
10.3.2 AUC
Second, we can use the fact that we integrated Central in the model and stored the result in AUC_Central to calculate the AUC over any interval. We are curious about the interval (96, 144) hours, so let's filter on those endpoint values.
# First we select relevant columns
simulated_original_df = @select(simulated_original_df, :id, :evid, :time, :AUC_Central)
# Then we subset the two time points to calculate AUC between for the original dose
auc_original_df = @rsubset(simulated_original_df, :evid == 0, :time in [96, 144])
# Then we do the same for the new dose
simulated_new_df = @select(simulated_new_df, :id, :evid, :time, :AUC_Central)
auc_new_df = @rsubset(simulated_new_df, :evid == 0, :time in [96, 144])
# Then we combine them into one and use `diff` to calculate the difference in AUC
# at time point 144 and at time point 96.
combined_auc = leftjoin(
    combine(groupby(auc_original_df, :id), :AUC_Central => diff => :AUC_last_original),
    combine(groupby(auc_new_df, :id), :AUC_Central => diff => :AUC_last_new);
    on = :id,
)

| Row | id | AUC_last_original | AUC_last_new |
|---|---|---|---|
| | String? | Float64 | Float64? |
| 1 | 1 | 3445.92 | 3298.7 | 
| 2 | 2 | 2635.48 | 2512.3 | 
| 3 | 3 | 2389.58 | 2278.36 | 
| 4 | 4 | 3722.53 | 3592.82 | 
| 5 | 5 | 462.117 | 450.328 | 
| 6 | 6 | 870.378 | 842.443 | 
| 7 | 7 | 7463.62 | 7191.57 | 
| 8 | 8 | 2417.45 | 2326.6 | 
| 9 | 9 | 3531.23 | 3378.51 | 
| 10 | 10 | 1275.9 | 1236.08 | 
| 11 | 11 | 1300.62 | 1257.44 | 
| 12 | 12 | 5016.92 | 4773.23 | 
| 13 | 13 | 1872.04 | 1801.47 | 
| ⋮ | ⋮ | ⋮ | ⋮ | 
| 89 | 89 | 2046.95 | 1964.48 | 
| 90 | 90 | 3601.37 | 3444.69 | 
| 91 | 91 | 1661.34 | 1602.12 | 
| 92 | 92 | 911.015 | 884.66 | 
| 93 | 93 | 1313.81 | 1259.67 | 
| 94 | 94 | 6142.73 | 5927.68 | 
| 95 | 95 | 2670.72 | 2541.62 | 
| 96 | 96 | 2885.86 | 2742.88 | 
| 97 | 97 | 2207.89 | 2133.52 | 
| 98 | 98 | 1946.24 | 1861.81 | 
| 99 | 99 | 2610.73 | 2500.71 | 
| 100 | 100 | 2377.06 | 2296.18 | 
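The diff call in the code above is doing nothing more than subtracting the cumulative AUC at 96 hours from the cumulative AUC at 144 hours within each subject; a plain-Julia sketch with made-up numbers:

```julia
# Cumulative AUC values at t = 96 h and t = 144 h for one subject (illustrative).
auc_cum = [100.0, 250.0]
# diff returns successive differences, so a two-element input yields
# a one-element vector: the AUC accumulated over (96, 144].
auc_interval = diff(auc_cum)
```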
And we can draw the histogram of the difference between the new dosage regimen and the old one.
For an intro tutorial that describes the use of AlgebraOfGraphics.jl to plot from DataFrames please see this tutorial.
using AlgebraOfGraphics
combined_auc.auc_diff = combined_auc.AUC_last_new .- combined_auc.AUC_last_original
plot = data(combined_auc) * mapping(:auc_diff) * histogram()
draw(plot)

We see that the AUCs are smaller for the new regimen. Overall, this means that safety is potentially compromised while exposure as measured by AUC is lower under the proposed regimen.
11 Conclusion
In this tutorial, we saw how to map from data to a Population, how to formulate and fit a model of oral administration, and how to use built-in functionality to assess the quality of the formulated model. We also saw some advanced features such as bootstrap and SIR inference, VPC plots, and simulations.