using Pumas
Posterior Postprocessing - Plots and Queries
After you fit your Bayesian Pumas model, there are a number of functions and plots you can call on the result of the fit or discard functions.
In this tutorial we'll showcase some of them. As always, please refer to the Bayesian Workflow section of the Pumas documentation for more details.
Let’s recall the model from the Introduction to Bayesian Models in Pumas tutorial:
pk_1cmp = @model begin
    @param begin
        tvcl ~ LogNormal(log(3.2), 1)
        tvv ~ LogNormal(log(16.4), 1)
        tvka ~ LogNormal(log(3.8), 1)
        ω²cl ~ LogNormal(log(0.04), 0.25)
        ω²v ~ LogNormal(log(0.04), 0.25)
        ω²ka ~ LogNormal(log(0.04), 0.25)
        σ_p ~ LogNormal(log(0.2), 0.25)
    end
    @random begin
        ηcl ~ Normal(0, sqrt(ω²cl))
        ηv ~ Normal(0, sqrt(ω²v))
        ηka ~ Normal(0, sqrt(ω²ka))
    end
    @covariates begin
        Dose
    end
    @pre begin
        CL = tvcl * exp(ηcl)
        Vc = tvv * exp(ηv)
        Ka = tvka * exp(ηka)
    end
    @dynamics Depots1Central1
    @derived begin
        cp := @. Central / Vc
        Conc ~ @. Normal(cp, abs(cp) * σ_p)
    end
end
┌ Warning: Covariate Dose is not used in the model.
PumasModel
Parameters: tvcl, tvv, tvka, ω²cl, ω²v, ω²ka, σ_p
Random effects: ηcl, ηv, ηka
Covariates: Dose
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: Conc
Observed: Conc
using PharmaDatasets
pkpain_df = dataset("pk_painrelief");
using DataFramesMeta
@chain pkpain_df begin
    @rsubset! :Dose != "Placebo"
    @rtransform! begin
        :amt = :Time == 0 ? parse(Int, chop(:Dose; tail = 3)) : missing
        :evid = :Time == 0 ? 1 : 0
        :cmt = :Time == 0 ? 1 : 2
    end
    @rtransform! :Conc = :evid == 1 ? missing : :Conc
end;
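As a quick aside on the :amt line above (a standalone sketch, not part of the pipeline): the non-placebo Dose values are strings such as "20 mg", so chop drops the trailing three characters (" mg") and parse converts what remains into an integer dose.
chop("20 mg"; tail = 3)                 # "20": drops the trailing " mg"
parse(Int, chop("20 mg"; tail = 3))     # 20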
first(pkpain_df, 5)
| Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose | amt | evid | cmt |
|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64? | Int64 | Int64 | Int64 | String7 | Int64? | Int64 | Int64 |
| 1 | 1 | 0.0 | missing | 0 | 3 | 1 | 20 mg | 20 | 1 | 1 |
| 2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |
| 3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg | missing | 0 | 2 |
| 4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg | missing | 0 | 2 |
| 5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |
pop = read_pumas(
    pkpain_df,
    id = :Subject,
    time = :Time,
    amt = :amt,
    observations = [:Conc],
    covariates = [:Dose],
    evid = :evid,
    cmt = :cmt,
)
Population
Subjects: 120
Covariates: Dose
Observations: Conc
pk_1cmp_fit = fit(
    pk_1cmp,
    pop,
    init_params(pk_1cmp),
    BayesMCMC(; nsamples = 1_000, nadapts = 500, constantcoef = (; tvka = 2)),
);
pk_1cmp_tfit = discard(pk_1cmp_fit; burnin = 500)
Chains MCMC chain (500×6×4 Array{Float64, 3}):
Iterations = 1:1:500
Number of chains = 4
Samples per chain = 500
Wall duration = 555.97 seconds
Compute duration = 2044.26 seconds
parameters = tvcl, tvv, ω²cl, ω²v, ω²ka, σ_p
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
tvcl 3.1874 0.0771 0.0082 85.3594 180.2762 1.0497 ⋯
tvv 13.2701 0.2741 0.0162 287.0303 477.4483 1.0038 ⋯
ω²cl 0.0736 0.0085 0.0006 188.0547 331.3742 1.0243 ⋯
ω²v 0.0461 0.0057 0.0002 617.6937 881.9354 1.0080 ⋯
ω²ka 1.1018 0.1428 0.0032 2018.0821 1774.1258 1.0014 ⋯
σ_p 0.1041 0.0023 0.0001 1888.3511 1700.2015 1.0004 ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
tvcl 3.0398 3.1360 3.1871 3.2341 3.3504
tvv 12.7689 13.0819 13.2548 13.4439 13.8308
ω²cl 0.0589 0.0675 0.0728 0.0790 0.0920
ω²v 0.0363 0.0423 0.0455 0.0496 0.0594
ω²ka 0.8379 1.0025 1.0935 1.1930 1.3998
σ_p 0.0998 0.1025 0.1040 0.1055 0.1085
1 Predictive Checks
In the Bayesian workflow, we generally perform predictive checks both in our prior selection and also in our posterior estimation (Gelman et al., 2013; McElreath, 2020).
1.1 Prior Predictive Checks
Put simply, prior predictive checks consist of simulating parameter values from the prior distribution without conditioning on any data or employing any likelihood function.
Regardless of how informative the priors are, it is always important to perform a prior sensitivity analysis in order to understand the influence of the priors on the posterior.
To perform a prior predictive check in Pumas, we need to use the simobs function with:
positional arguments:
- model
- Population or Subject
keyword arguments:
- samples: an integer; how many samples to draw
- simulate_error: a boolean; whether to sample from the error model in the @derived block (true) or to simply return the expected value of the error distribution (false)
In the case of a prior predictive check, we’ll use the following options in simobs:
prior_sims = simobs(pk_1cmp, pop; samples = 200, simulate_error = true)
We get back a Vector of SimulatedObservations:
typeof(prior_sims)
Vector{SimulatedObservations{…}} (full type parameters omitted)
on which we can then perform a visual predictive check (VPC):
To use the VPC plots, you have to load the PumasUtilities package.
prior_vpc = vpc(prior_sims; stratify_by = [:Dose]);
using PumasUtilities
Axes in a facetted VPC plot can be linked by using the facet keyword argument:
facet = (; linkaxes = true)
You can also control which axes are linked by using one of the following keyword arguments (an example follows below):
- linkxaxes: only the X-axis
- linkyaxes: only the Y-axis
- linkaxes: both X- and Y-axes
For all of those you can use the following options:
- true/:all: link all axes
- false: unlink all axes
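For example, to unlink only the Y-axes so that each dose stratum keeps its own concentration scale, one could write (a sketch reusing the prior_vpc object from above):
vpc_plot(prior_vpc; facet = (; linkyaxes = false))
Below, both axes are linked instead: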
vpc_plot(prior_vpc; facet = (; linkaxes = true,))
1.2 Posterior Predictive Checks
We need to make sure that the MCMC-approximated posterior distribution of our dependent variable (DV) can capture all the nuances of the real density/mass of the DV.
This procedure is called a posterior predictive check, and it is generally carried out by visually inspecting the real density/mass of our DV against samples of the DV generated by the Bayesian model.
We also perform other inspections; see the Model Comparison with Crossvalidation tutorial.
To perform a posterior predictive check in Pumas, we need to use the simobs function with:
positional arguments:
- result from a fit/discard using a Bayesian Pumas model
keyword arguments:
- samples: an integer; how many samples to draw
- simulate_error: a boolean; whether to sample from the error model in the @derived block (true) or to simply return the expected value of the error distribution (false).
In the case of a posterior predictive check, we’ll use the following options in simobs:
posterior_sims = simobs(pk_1cmp_tfit; samples = 200, simulate_error = true)
posterior_vpc = vpc(posterior_sims; stratify_by = [:Dose]);
vpc_plot(posterior_vpc; facet = (; linkaxes = true,))
2 Posterior Queries
Often you want to execute posterior queries. A common posterior query is, for example, whether a certain parameter \(\theta\) is higher than \(0\):
\[\operatorname{E}[\theta > 0 \mid \text{data}]\]
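Because the expression inside the expectation is an indicator (it is either true or false for each posterior draw), this expectation is simply the posterior probability of the event, i.e. the fraction of posterior samples for which the statement holds:
\[\operatorname{E}[\mathbb{1}(\theta > 0) \mid \text{data}] = \Pr(\theta > 0 \mid \text{data})\]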
The way we can do this is by using the mean function with Julia's convenient do block syntax:
mean(pk_1cmp_tfit) do p
    p.tvcl >= 3
end
0.991
It also works for between-subject variability parameters:
mean(pk_1cmp_tfit) do p
    p.ω²cl >= 0.27^2
end
0.498
Or random effects in a specific subject:
mean(pk_1cmp_tfit; subject = 1) do p
    p.ηcl <= 0
end
0.9995
You can use any summarizing function in a posterior query, i.e. any function that takes the parameter values and returns a number, including user-defined functions, as in the sketch below.
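For instance, here is a minimal sketch of a user-defined query (the quantity and the threshold are hypothetical, chosen only for illustration): the posterior probability that the typical elimination half-life, log(2) * tvv / tvcl, exceeds 3 time units.
# Posterior probability that the typical half-life exceeds 3 (hypothetical threshold)
mean(pk_1cmp_tfit) do p
    log(2) * p.tvv / p.tvcl > 3
end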
3 Visualizations
There are a number of plots which one can use to gain insights into how the sampler and model are performing.
To use the following plots, you have to load the PumasUtilities package.
For the facetted Bayesian plots, the linking of axes can likewise be controlled with the facet = (; linkaxes = ...), facet = (; linkxaxes = ...) and facet = (; linkyaxes = ...) keyword arguments. As these visualizations are based on AlgebraOfGraphics.jl, for all of those you can use the following options:
- true/:all: link all axes
- :minimal: link only within columns or rows
- false/:none: unlink all axes.
The trace plot of a parameter shows the value of the parameter in each iteration of the MCMC algorithm. A good trace plot is one that:
- Is noisy, e.g. not a steadily increasing or decreasing line.
- Has a fixed mean.
- Has a fixed variance.
- Shows all chains overlapping with each other, i.e. chain mixing.
You can plot a trace plot for the population parameters :tvcl and :tvv using:
trace_plot(pk_1cmp_tfit; parameters = [:tvcl, :tvv])
You can also plot the trace plots of the subject-specific parameters for a selection of subjects using:
trace_plot(pk_1cmp_tfit; subjects = [1, 2])
The density plot of a parameter shows a smoothed version of the histogram of the parameter values, giving an approximate probability density function for the marginal posterior of the parameter considered. This helps us visualize the shape of the marginal posterior of each parameter.
You can plot a density plot for the population parameter :tvcl using:
density_plot(pk_1cmp_tfit; parameters = [:tvcl])
You can also plot the density plots of the subject-specific parameters for a selection of subjects using:
density_plot(pk_1cmp_tfit; subjects = [1, 2], facet = (; linkyaxes = false))
Another plot that can be used for parameter estimates is the ridgeline plot, which outputs a single density, even if you are using multiple chains, along with relevant statistical information about the parameter. The information displayed is the mean, the median, the 80% quantiles (Q0.1 and Q0.9), and the 95% and 80% highest posterior density intervals (HPDIs). An HPDI is the narrowest interval that captures a certain percentage of the posterior density. You can plot ridgeline plots with the ridgeline_plot function, which has a similar syntax to density_plot.
Here’s an example with the population parameter :tvcl:
ridgeline_plot(pk_1cmp_tfit; parameters = [:tvcl])
MCMC chains are prone to auto-correlation between the samples because each sample in the chain is a function of the previous sample.
You can plot an auto-correlation plot for the population parameter :tvcl using:
autocor_plot(pk_1cmp_tfit; parameters = [:tvcl])
4 References
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian Data Analysis. Chapman and Hall/CRC.
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. CRC press.