using Pumas
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics
using PumasUtilities

Visual Predictive Checks for Continuous Data Models
1 Learning Objectives
By the end of this tutorial, you will be able to generate and interpret VPC plots in Pumas, explore misspecification via simulation, and apply stratification and prediction correction.
2 Introduction
The original Visual Predictive Check was proposed to compare the observed data to prediction intervals from an estimated model. If the model is not in agreement with the data we will observe data curves outside of the prediction intervals and we may reject the model. In a Bayesian context, a similar plot is used to verify the posterior predictive distribution against the data.
In practice, the interpretation is not so easy. Covariates including different treatments, biologically significant differences between patients, and different sample times complicates matters significantly. This means that the clear visual check quickly becomes muddied. In this tutorial, we will see how to produce a VPC for continuous data in Pumas and we will try to adjust for some of the complications involved with using the VPC in practice.
First, we set up a Warfarin PK model.
= @model begin
warfarin_pk_model
@metadata begin
= "Warfarin 1-compartment PK model (PD removed)"
desc = u"hr"
timeu end
@param begin
# PK parameters
"""
Clearance (L/hr)
"""
∈ RealDomain(lower = 0.0, init = 0.134)
pop_CL """
Central volume (L)
"""
∈ RealDomain(lower = 0.0, init = 8.11)
pop_Vc """
Absorption lag time (hr)
"""
∈ RealDomain(lower = 0.0, init = 0.523)
pop_tabs
# Inter-individual variability
"""
- ΩCL: Clearance
- ΩVc: Central volume
"""
∈ PDiagDomain([0.01, 0.01])
pk_Ω # Residual variability
"""
σ_prop: Proportional error
"""
∈ RealDomain(lower = 0.0, init = 0.00752)
σ_prop end
@random begin
~ MvNormal(pk_Ω) # mean = 0, covariance = pk_Ω
pk_η end
@covariates begin
"""
FSZCL: Clearance scaling factor
"""
FSZCL"""
FSZV: Volume scaling factor
"""
FSZVend
@pre begin
= FSZCL * pop_CL * exp(pk_η[1])
CL = FSZV * pop_Vc * exp(pk_η[2])
Vc
= pop_tabs
tabs = log(2) / tabs
Ka end
@vars begin
:= Central / Vc
cp end
@dynamics begin
' = -Ka * Depot
Depot' = Ka * Depot - (CL / Vc) * Central
Centralend
@derived begin
"""
Concentration (ng/mL)
"""
~ @. Normal(cp, sqrt((σ_prop * cp)^2))
conc end
end
PumasModel
Parameters: pop_CL, pop_Vc, pop_tabs, pk_Ω, σ_prop
Random effects: pk_η
Covariates: FSZCL, FSZV
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: conc
Observed: conc
= dataset("paganz2024/warfarin_long")
warfarin_data
@chain warfarin_data begin
@groupby :ID :TIME :DVID
@transform :tmp = length(:ID)
@rsubset :tmp > 1
end
= @chain warfarin_data begin
warfarin_data_processed @groupby :ID :TIME :DVID
@transform :tmp = 1:length(:ID)
@rtransform :TIME = :tmp == 1 ? :TIME : :TIME + 1e-6
@select Not(:tmp)
end
# Transform the data in a single chain of operations
= @chain warfarin_data_processed begin
warfarin_data_wide @rsubset !contains(:ID, "#")
@rtransform begin
# Scaling factors
:FSZV = :WEIGHT / 70 # volume scaling
:FSZCL = (:WEIGHT / 70)^0.75 # clearance scaling (allometric)
# Column name for the DV
:DVNAME = "DV$(:DVID)"
# Dosing indicator columns
:CMT = ismissing(:AMOUNT) ? missing : 1
:EVID = ismissing(:AMOUNT) ? 0 : 1
end
unstack(Not([:DVID, :DVNAME, :DV]), :DVNAME, :DV)
rename!(:DV1 => :conc, :DV2 => :pca)
end
Row | ID | TIME | WEIGHT | AGE | SEX | AMOUNT | FSZV | FSZCL | CMT | EVID | DV0 | pca | conc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
String3 | Float64 | Float64 | Int64 | Int64 | Float64? | Float64 | Float64 | Int64? | Int64 | Float64? | Float64? | Float64? | |
1 | 1 | 0.0 | 66.7 | 50 | 1 | 100.0 | 0.952857 | 0.96443 | 1 | 1 | missing | missing | missing |
2 | 1 | 0.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 100.0 | missing |
3 | 1 | 24.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 49.0 | 9.2 |
4 | 1 | 36.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 32.0 | 8.5 |
5 | 1 | 48.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 26.0 | 6.4 |
6 | 1 | 72.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 22.0 | 4.8 |
7 | 1 | 96.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 28.0 | 3.1 |
8 | 1 | 120.0 | 66.7 | 50 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 33.0 | 2.5 |
9 | 2 | 0.0 | 66.7 | 31 | 1 | 100.0 | 0.952857 | 0.96443 | 1 | 1 | missing | missing | missing |
10 | 2 | 0.0 | 66.7 | 31 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | 100.0 | missing |
11 | 2 | 0.5 | 66.7 | 31 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | missing | 0.0 |
12 | 2 | 2.0 | 66.7 | 31 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | missing | 8.4 |
13 | 2 | 3.0 | 66.7 | 31 | 1 | missing | 0.952857 | 0.96443 | missing | 0 | missing | missing | 9.7 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
306 | 31 | 48.0 | 83.3 | 24 | 1 | missing | 1.19 | 1.13936 | missing | 0 | missing | 24.0 | 6.4 |
307 | 31 | 72.0 | 83.3 | 24 | 1 | missing | 1.19 | 1.13936 | missing | 0 | missing | 22.0 | 4.5 |
308 | 31 | 96.0 | 83.3 | 24 | 1 | missing | 1.19 | 1.13936 | missing | 0 | missing | 28.0 | 3.4 |
309 | 31 | 120.0 | 83.3 | 24 | 1 | missing | 1.19 | 1.13936 | missing | 0 | missing | 42.0 | 2.5 |
310 | 32 | 0.0 | 62.0 | 21 | 1 | 93.0 | 0.885714 | 0.912999 | 1 | 1 | missing | missing | missing |
311 | 32 | 0.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 100.0 | missing |
312 | 32 | 24.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 36.0 | 8.9 |
313 | 32 | 36.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 27.0 | 7.7 |
314 | 32 | 48.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 24.0 | 6.9 |
315 | 32 | 72.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 23.0 | 4.4 |
316 | 32 | 96.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 20.0 | 3.5 |
317 | 32 | 120.0 | 62.0 | 21 | 1 | missing | 0.885714 | 0.912999 | missing | 0 | missing | 22.0 | 2.5 |
= read_pumas(
pop
warfarin_data_wide;= :ID,
id = :TIME,
time = :AMOUNT,
amt = :CMT,
cmt = :EVID,
evid = [:SEX, :WEIGHT, :FSZV, :FSZCL],
covariates = [:conc],
observations )
Population
Subjects: 31
Covariates: SEX, WEIGHT, FSZV, FSZCL
Observations: conc
= fit(warfarin_pk_model, pop, init_params(warfarin_pk_model), FOCE()) foce_fit
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 7.293083e+04 1.461056e+05
* time: 0.019902944564819336
1 9.818354e+03 1.953327e+04
* time: 0.7908179759979248
2 7.222705e+03 1.429007e+04
* time: 0.9616858959197998
3 3.173581e+03 6.054138e+03
* time: 0.9866287708282471
4 1.786367e+03 3.182851e+03
* time: 1.0113868713378906
5 1.008524e+03 1.523930e+03
* time: 1.0362298488616943
6 6.628559e+02 7.392148e+02
* time: 1.0604078769683838
7 5.074320e+02 3.380632e+02
* time: 1.0861709117889404
8 4.488285e+02 1.409490e+02
* time: 1.1124999523162842
9 4.321458e+02 4.815869e+01
* time: 1.1397178173065186
10 4.293762e+02 3.611674e+01
* time: 1.1664648056030273
11 4.291782e+02 3.810027e+01
* time: 1.1914119720458984
12 4.291570e+02 3.798194e+01
* time: 1.2148637771606445
13 4.290988e+02 3.654733e+01
* time: 1.2384648323059082
14 4.289761e+02 3.190818e+01
* time: 1.2619178295135498
15 4.287351e+02 2.028140e+01
* time: 1.286815881729126
16 4.284295e+02 1.615961e+01
* time: 1.311743974685669
17 4.282338e+02 1.409536e+01
* time: 1.337871789932251
18 4.281796e+02 1.775240e+01
* time: 1.3619389533996582
19 4.281680e+02 1.641484e+01
* time: 1.3849267959594727
20 4.281590e+02 1.490102e+01
* time: 1.4081189632415771
21 4.281320e+02 1.464321e+01
* time: 1.431372880935669
22 4.280727e+02 1.477070e+01
* time: 1.45560884475708
23 4.279464e+02 1.487623e+01
* time: 1.480719804763794
24 4.277608e+02 1.479359e+01
* time: 1.561551809310913
25 4.276099e+02 2.182121e+01
* time: 1.587122917175293
26 4.275547e+02 2.510023e+01
* time: 1.6110379695892334
27 4.275402e+02 2.460539e+01
* time: 1.6356477737426758
28 4.275273e+02 2.373992e+01
* time: 1.6610620021820068
29 4.274912e+02 2.199978e+01
* time: 1.6859629154205322
30 4.274025e+02 1.928910e+01
* time: 1.7115728855133057
31 4.271657e+02 1.765219e+01
* time: 1.7373127937316895
32 4.265541e+02 3.361826e+01
* time: 1.7636089324951172
33 4.250214e+02 5.665070e+01
* time: 1.7899267673492432
34 4.222442e+02 7.740390e+01
* time: 1.8159708976745605
35 4.202011e+02 7.057124e+01
* time: 1.8406548500061035
36 4.179514e+02 4.045950e+01
* time: 1.8655848503112793
37 4.170700e+02 7.924687e+00
* time: 1.8906378746032715
38 4.170338e+02 9.918974e+00
* time: 1.911681890487671
39 4.170322e+02 9.740094e+00
* time: 1.9316740036010742
40 4.170294e+02 9.580525e+00
* time: 1.9518818855285645
41 4.170224e+02 9.354483e+00
* time: 1.973510980606079
42 4.170044e+02 8.980420e+00
* time: 1.9970958232879639
43 4.169579e+02 8.341740e+00
* time: 2.0208258628845215
44 4.168403e+02 1.197122e+01
* time: 2.046637773513794
45 4.165578e+02 1.828136e+01
* time: 2.072518825531006
46 4.159678e+02 2.403113e+01
* time: 2.1215438842773438
47 4.151117e+02 2.418037e+01
* time: 2.1470868587493896
48 4.144995e+02 1.738597e+01
* time: 2.170883893966675
49 4.142440e+02 8.526452e+00
* time: 2.1948938369750977
50 4.141656e+02 1.084232e+00
* time: 2.2185938358306885
51 4.141623e+02 1.683615e-01
* time: 2.2410969734191895
52 4.141623e+02 5.899685e-02
* time: 2.2612109184265137
53 4.141623e+02 5.110076e-02
* time: 2.2814009189605713
54 4.141623e+02 5.095926e-02
* time: 2.2994728088378906
55 4.141623e+02 5.095855e-02
* time: 2.321917772293091
56 4.141623e+02 5.083699e-02
* time: 2.3398277759552
57 4.141623e+02 5.083694e-02
* time: 2.36370587348938
58 4.141623e+02 5.083689e-02
* time: 2.396068811416626
59 4.141623e+02 5.083453e-02
* time: 2.4234459400177
60 4.141623e+02 5.083413e-02
* time: 2.4542768001556396
61 4.141623e+02 4.994785e-02
* time: 2.4739527702331543
62 4.141623e+02 4.969511e-02
* time: 2.493306875228882
63 4.141623e+02 4.809471e-02
* time: 2.51387095451355
64 4.141622e+02 4.566696e-02
* time: 2.5334978103637695
65 4.141622e+02 5.884378e-02
* time: 2.553788900375366
66 4.141621e+02 8.013542e-02
* time: 2.5743069648742676
67 4.141620e+02 8.789628e-02
* time: 2.5943429470062256
68 4.141618e+02 6.143283e-02
* time: 2.6160337924957275
69 4.141618e+02 2.027742e-02
* time: 2.6388628482818604
70 4.141618e+02 2.038166e-03
* time: 2.6602327823638916
71 4.141618e+02 1.523421e-04
* time: 2.679537773132324
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Matrix exponential
Log-likelihood value: -414.16175
Number of subjects: 31
Number of parameters: Fixed Optimized
0 6
Observation records: Active Missing
conc: 239 47
Total: 239 47
------------------------
Estimate
------------------------
pop_CL 0.13281
pop_Vc 8.2582
pop_tabs 0.93142
pk_Ω₁,₁ 0.037972
pk_Ω₂,₂ 0.0092456
σ_prop 0.21277
------------------------
3 What exactly is a VPC?
A Visual Predictive Check (VPC) is a plot constructed to identify misspecification of the model. It attempts to compare the distribution of the data with the distribution of simulations from the estimated model. Essentially constructed in parts:
flowchart TD Start(["Start"]) --> Q1["For pcVPC calculate the prediction median function<br>and apply it to the observed independent variable."] Q1 --> Q2["Calculate the sequence of quantiles (0.1, 0.5, 0.9)<br>for the (corrected) dependent variable observations<br>across the independent variable nodes"] Q2 --> Q3 Q2 --> Q6 Q3["Simulate all datasets using estimated parameters with<br>random effects and error model sampling (and apply<br>correction for pcVPC)"] --> Q4["Calculate quantiles for each simulated dataset"] Q4 --> Q5["Determine prediction intervals across<br>simulations per bin/location"] Q5 --> Q6["Plot results (observed vs simulated intervals)"] Q6 --> Q7["Check if observed medians lie within intervals"] Q7 --> End(["End"]) %% Labels and styles subgraph n1 ["VPC Workflow"] end classDef step fill:#9ecfdf,stroke:#000,stroke-width:1px,color:#000 classDef Peach stroke-width:1px,stroke-dasharray:none,stroke:#FBB35A,fill:#FFEFDB,color:#8F632D class Q1,Q2,Q3,Q4,Q5,Q6,Q7 step class n1 Peach
This differs from other diagnostics such as weighted residuals by considering not only data compared to the modeled mean, but tries to make sure that more or less extreme subjects are also well-predicted by the model.
4 Applying the VPC workflow in Pumas
In Pumas we have a predefined workflow for simulating the required data as well as summarizing the data using the binless VPC method. In short, the binless method avoids binning data across the independent variable. Instead, it applies a smoothed local quantile regression to combine information from data that is close together according to the kernel used. Binless VPCs are advantageous over traditional VPCs as they allow smoother representation of model-data agreement rather than across few discrete intervals.
The workflow consists of just two functions:
vpc(...)
to simulate the required data and calculate the quantities used for lines and ribbons in the final plotsvpc_plot(...)
fromPumasUtilities
that takes the data and presents it in a plot
simobs
It is also possible to simulate the data using simobs
but we will not cover that approach in this tutorial.
4.1 How to simulate?
To simulate, we use the vpc
function along with input on how the simulations and quantile calculations should be performed. The most simple invocation of the function is:
= vpc(foce_fit) myvpc
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 31
Stratification variable(s): None
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
This creates an object that has all the information we’ll need. Here’s a brief explanation for each line of the VPC output:
Visual Predictive Check:
This is the name of the diagnostic tool used to assess how well your model predicts observed data by comparing simulated data distributions.Type of VPC: Continuous VPC:
The VPC was performed for a dependent variable (e.g., concentration) that changes continuously over time.Simulated populations: 499:
The model generated 499 synthetic datasets to create prediction intervals for comparison.Subjects in data: 31:
Your original dataset contains 31 individual subjects.Stratification variable(s): None:
The VPC was performed on the entire dataset without splitting it into subgroups (e.g., by sex or age).Confidence level: 0.95:
The shaded prediction intervals on the plot are calculated to contain 95% of the simulated data.VPC lines: quantiles ([0.1, 0.5, 0.9]):
The VPC plot will display the 10th percentile (0.1), median (0.5), and 90th percentile (0.9) of the simulated data.
While the above function call will produce data for a VPC we often need to make adjustments in practice. Consider the (partial) signature where we omitted some options for clarity of this tutorial:
vpc(
::AbstractFittedPumasModel;
fpm::Integer = 499,
samples::Vector{Symbol} = [keys(fpm.data[1].observations)[1]],
observations::Vector{Symbol} = [:time],
covariates::Vector{Symbol} = Symbol[],
stratify_by= nothing,
bandwidth ::Bool = false,
prediction_correction )
Some important potential keyword arguments in the vpc
function are:
samples
(default is 500) to control the number times the population is re-simulated according to the fitted modelobservations
is a vector of symbols (e.g.[:conc]
) to control what endpoint to perform the VPC for if there are multiple present in the modelcovariates
(default is:time
) is a vector of symbols (e.g.[:tad]
) to control the first axis variablestratify_by
to control stratification of VPC calculations and plottingbandwidth
is a number (default is 2.0 for continuous VPCs) to control the level of smoothingprediction_correction
(default is false) to apply a type of correction for differences in subjects. If prediction correction is applied it is currently not possible to also use stratification.
Please see the docstring (?vpc
) for all options and defaults.
With the output of vpc
in hand, we can continue to plotting the VPC.
4.2 How to plot?
Plotting is done with the vpc_plot
function. This function accepts the output of vpc()
and can be customized in the same way as the rest of the built-in plotting functions covered earlier.
vpc_plot(myvpc, figurelegend = (; position = :b, nbanks = 2, tellheight = true))
The VPC looks good. The data medians are generally well within the ribbons.
4.3 A misspecified model
To produce a VPC that does not look good we can intentionally constrain the model to not produce the estimates found above. Let us test whether forcing a higher pop_CL
can be detected as misspecification in our setup:
= fit(
foce_fit_CL
warfarin_pk_model,
pop,init_params(warfarin_pk_model)..., pop_CL = 0.15),
(FOCE();
= (:pop_CL,),
constantcoef = (; show_trace = false),
optim_options )
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
FittedPumasModel
Successful minimization: true
Likelihood approximation: FOCE
Likelihood Optimizer: BFGS
Dynamical system type: Matrix exponential
Log-likelihood value: -418.66271
Number of subjects: 31
Number of parameters: Fixed Optimized
1 5
Observation records: Active Missing
conc: 239 47
Total: 239 47
-----------------------
Estimate
-----------------------
pop_CL 0.15
pop_Vc 8.3615
pop_tabs 0.93424
pk_Ω₁,₁ 0.051854
pk_Ω₂,₂ 0.010071
σ_prop 0.21565
-----------------------
Then we perform the required calculations
= vpc(foce_fit_CL) myvpc_CL
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 31
Stratification variable(s): None
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
And produce the plot
vpc_plot(myvpc_CL, figurelegend = (; position = :b, nbanks = 2, tellheight = true))
In this case, the model does not look well-specified, and we should consider not fixing the population level clearance parameter. Note, that the interpretation is up to the person looking at the plot. In this case, it appears as if the fit is decent for early time points and bad for later time points. This would tend to rule out a problem in the absorption phase. The question is then if it’s misspecified because of clearance or maybe we didn’t correctly account for a distribution phase with many peripheral compartments? In this case, we introduced the problem ourselves so we have a good idea, but in general it’s not always easy.
4.4 Stratification
If a covariate with a limited number of values is thought to be significant in the model or if we want to make sure that the fit is equally good across strata we should produce a stratified plot.
= vpc(foce_fit; stratify_by = [:SEX]) my_vpc_sex
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 31
Stratification variable(s): [:SEX]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(my_vpc_sex; figurelegend = (; position = :b, nbanks = 2, tellheight = true))
As we can see, stratification also allows us to investigate the data more closely. Supposedly, the data median starts at a large positive number at time point 0! This leads to a model misspecification according to the red outlier band. When outliers are observed we need to consider at least three possible scenarios:
- Is the data wrong? There could be unphysiological values that we need to explain and that our model could not be expected to explain. Hopefully, exploratory data analysis (EDA) caught such values.
- Is the model wrong? If the data looks good, maybe the model simply cannot fit these data points. We may need to take a closer look at the model fit for specific subjects that contribute to this upper part of the observed distribution.
- Is the VPC misspecified? Just as binned VPCs require appropriate binning, the binless VPC has aggregation of similar data points by use of local regression. We can try to vary the
bandwidth
parameter to see if deviating from the default value of2
Let us try experiment with the bandwidth. In the dataset we observe that there are some very large observations at :TIME == 2.0
. These could be biasing the curves for the observed data.
= vpc(foce_fit; stratify_by = [:SEX], bandwidth = 1.9) my_vpc_sex
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 31
Stratification variable(s): [:SEX]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(my_vpc_sex; figurelegend = (; position = :b, nbanks = 2, tellheight = true))
We see that the black curve now goes down to a small value at :TIME == 0.0
as expected. The problem highlights something that is true for all VPCs and should always be kept in mind. We are aggregating data across time and covariate differences, so it is not always easy to know what the VPC curve “should” look like when the scenarios become complex. Here, we felt comfortable concluding that the first VPC did not invalidate our fit because we don’t think any subjects should start with positive values of the drug in plasma, but it may not always be easy to say if the VPC or the fit is the reason for an apparent indication of misspecification.
4.5 Changing to Time-After-Dose (tad
)
When producing a VPC it is often suggested to plot time-after-dose as the first axis variable and stratify by the number of the dose unless the dosage regimen is more complicated and several doses are administered at the same time or very close together. To change the variable on the first axis, we use the covariates
keyword. Stratification is done as before.
= vpc(foce_fit; covariates = [:tad], stratify_by = [:dosenum], bandwidth = 1.9) my_vpc_sex
[ Info: Continuous VPC
Visual Predictive Check
Type of VPC: Continuous VPC
Simulated populations: 499
Subjects in data: 31
Stratification variable(s): [:dosenum]
Confidence level: 0.95
VPC lines: quantiles ([0.1, 0.5, 0.9])
vpc_plot(my_vpc_sex; figurelegend = (; position = :b, nbanks = 2, tellheight = true))
In our example, there is not real change because we only have a single dose per subject.
4.6 Prediction correction
Prediction-corrected Visual Predictive Checks (pcVPCs) were developed to account for variability in independent variables (e.g., dose, covariates, or time-varying predictors) that may influence predictions but cannot easily be stratified. In such cases, raw observed and simulated values may differ simply due to differences in design rather than model misspecification. The pcVPC corrects these design-related differences by normalizing both observed and simulated outcomes relative to the population-level prediction at each time point (typically using the model-predicted median). This allows for a clearer comparison of the model’s predictive performance across heterogeneous conditions.
To perform a prediction corrected VPC in Pumas, simply set the prediction_correction
keyword to true.
= vpc(foce_fit; prediction_correction = true, bandwidth = 1.9)
my_pcvpc vpc_plot(my_pcvpc; figurelegend = (; position = :t, nbanks = 2, tellheight = true))
[ Info: Continuous VPC
In this instance, the prediction correction does not cause much of a difference.
5 Conclusion
In this tutorial, we saw how to produce a VPC from a fitted model. We showed how to look for misspecification and some of the levers and controls a user has to customize the plot.
6 Further Reading
If you want to read more on the application and properties of visual predictive checks, please have a look at the following papers.
1. Karlsson, M. O., & Holford, N. H. (2008). A tutorial on visual predictive checks. PAGE 17, Abstract 1434. http://www.page-meeting.org/?abstract=1434
2. Bergstrand, M., Hooker, A. C., Wallin, J. E., & Karlsson, M. O. (2011). Prediction-corrected visual predictive checks for diagnosing nonlinear mixed-effects models. CPT: Pharmacometrics & Systems Pharmacology, 89(2), 204–217. https://doi.org/10.1038/clpt.2010.268
3. Ette, E. I., & Williams, P. J. (2004). Population pharmacokinetics II: estimation methods. Annals of Pharmacotherapy, 38(11), 1907–1915. https://doi.org/10.1345/aph.1E225
4. Bonate, P. L. (2011). Pharmacokinetic-Pharmacodynamic Modeling and Simulation (2nd ed.). Springer. Chapters on model evaluation and simulation diagnostics.