Between Subject Variability
1 Introduction
Pharmacokinetic and pharmacodynamic (PKPD) modeling is fundamentally concerned with understanding how drugs behave within populations of individuals. One of the most important aspects of population modeling is characterizing the variability that exists between different subjects. This variability is known as Between Subject Variability (BSV) or Inter-individual Variability (IIV).
In this tutorial, we’ll provide a step-by-step guide on how to understand and model BSV in Pumas. We’ll start with the fundamental statistical concepts and gradually build up to implementing complex BSV structures in Pumas models.
2 Between Subject Variability
Before diving into BSV modeling, it’s essential to understand the difference between individual samples and the variance of a distribution.
Imagine we have a population of patients, and we have estimated their individual pharmacokinetic (PK) parameters using a population PK modeling approach. We now have a collection of clearance estimates for the analysis population:
- Samples: These are the individual observations or measurements (the clearance value for each patient).
- Variance: This quantifies how spread out these measurements are around the mean.
Let’s visualize this with a simple example:
using Random, Distributions, AlgebraOfGraphics, CairoMakie
# Set seed for reproducible results
Random.seed!(123)
# Generate samples from a normal distribution
population_mean_clearance = 5.0 # L/hr
population_variance = 1.0 # (L/hr)²
clearance_distribution = Normal(population_mean_clearance, sqrt(population_variance))
# Take 100 samples (representing 100 different patients)
individual_clearances = rand(clearance_distribution, 100)
# Create a data table for AoG
clearance_data = (clearance = individual_clearances,)
# Plot using AlgebraOfGraphics
# Plot histogram for distribution of clearance
plt_hist =
data(clearance_data) * mapping(:clearance => "Clearance (L/hr)") * histogram(bins = 15)
# Add a vertical line at the population mean clearance
plt_line = mapping(population_mean_clearance) * visual(VLines; color = :red, linewidth = 2)
# Draw the resulting plot
draw(
plt_hist + plt_line;
axis = (title = "Distribution of Clearance Values", ylabel = "Number of Patients"),
)
In this example:
- Each value in individual_clearances is a sample representing one patient.
- The variance (\sigma^2 = 1.0) describes how these samples are spread around the mean.
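As a quick check (not part of the original example), we can compare the sample statistics of the simulated clearances with the population values used to generate them; with only 100 patients they will be close to, but not exactly, the true values:
using Statistics
# Sample statistics from the 100 simulated patients
sample_mean = mean(individual_clearances)
sample_variance = var(individual_clearances)
println("Sample mean: ", round(sample_mean, digits = 2), " L/hr (population mean = 5.0 L/hr)")
println("Sample variance: ", round(sample_variance, digits = 2), " (L/hr)² (population variance = 1.0)")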
2.1 Samples to Random Effects
Defining individual parameters for a population PK model requires:
- Estimation of the population-typical value for a parameter (e.g., population clearance \theta_{CL} = 5 L/hr)
- A model for describing the deviation of each individual from this population-typical value using random effects
- Estimation of the variance of these deviations, assuming they follow a normal distribution
The variance of this normal distribution represents the BSV - the degree of variability in a PK parameter we expect to see between different subjects in our population.
For a single parameter (like clearance), the relationship between the individual parameter value, population parameter, and random effect is often expressed as:
P_i = \theta_P \cdot e^{\eta_i}
Where:
- P_i is the parameter value for individual i
- \theta_P is the population parameter value
- \eta_i is the random effect for individual i, where \eta_i \sim \mathcal{N}(0,\omega^2)
- \omega^2 is the variance of the random effect (BSV)
3 Single Random Effect
Let’s start with the simplest case: modeling BSV for a single parameter (like clearance). In NLME, we typically model parameters on a log scale to ensure they remain positive.
# Define population parameters
θ_CL = 5.0 # Population clearance (L/hr)
omega_CL = 0.09 # Variance of the random effect (dimensionless when log-transformed)
# Generate individual random effects
eta_CL = rand(Normal(0, sqrt(omega_CL)), 100) # 100 subjects
# Calculate individual clearances
individual_CL = θ_CL .* exp.(eta_CL)
# Create a data table for AoG
cl_data = (clearance = individual_CL,)
# Plot using AlgebraOfGraphics
plt = data(cl_data) * mapping(:clearance => "Clearance (L/hr)") * histogram(bins = 15)
draw(plt; axis = (title = "Individual Clearance Values", ylabel = "Number of Subjects"))
In Pumas, this would be implemented as:
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
Ω ∈ RealDomain(lower = 0, init = 0.09) # Variance for BSV
end
@random begin
η ~ Normal(0, sqrt(Ω)) # Random effect (eta)
end
@pre begin
CL = θ_CL * exp(η) # Individual clearance
end
Notice how:
- We define the population parameter (θ_CL)
- We define the variance of the random effect (Ω)
- We sample the random effect (η) from a normal distribution with mean 0
- We calculate the individual parameter value using the equation CL = θ_CL * exp(η)
4 Multiple Random Effects
4.1 Diagonal Matrix
Now let’s consider a more realistic case where we have multiple parameters (e.g., clearance and volume) with BSV. When we have multiple random effects, we need to consider not just their individual variances but potential correlations between them.
First, let’s look at the case where we assume the random effects are uncorrelated (a diagonal variance-covariance matrix):
# Define population parameters
θ_CL = 5.0 # Population clearance (L/hr)
θ_V = 50.0 # Population volume (L)
# Variances for random effects (diagonal elements)
omega_CL = 0.09 # Variance for clearance
omega_V = 0.04 # Variance for volume
# Create a diagonal variance-covariance matrix
omega_matrix = Diagonal([omega_CL, omega_V])
2×2 Diagonal{Float64, Vector{Float64}}:
0.09 ⋅
⋅ 0.04
# Generate individual random effects for 100 subjects
etas = rand(MvNormal(zeros(2), omega_matrix), 100)
# Calculate individual parameters
individual_CL = θ_CL .* exp.(etas[1, :]) # First row contains etas for CL
individual_V = θ_V .* exp.(etas[2, :]) # Second row contains etas for V
# Create a data table for AoG
pk_data = (CL = individual_CL, V = individual_V)
# Plot using AlgebraOfGraphics
plt =
data(pk_data) *
mapping(:CL => "Clearance (L/hr)", :V => "Volume (L)") *
visual(Scatter, markersize = 8, color = :blue, alpha = 0.6)
draw(plt; axis = (title = "Individual PK Parameters (Uncorrelated BSV)",))
For the diagonal matrix case, our mathematical model is:
\begin{pmatrix} \eta_{CL} \\ \eta_V \end{pmatrix} \sim \mathcal{N}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \omega^2_{CL} & 0 \\ 0 & \omega^2_V \end{pmatrix}\right)
The variance-covariance matrix \Omega is:
\Omega = \begin{pmatrix} \omega^2_{CL} & 0 \\ 0 & \omega^2_V \end{pmatrix} = \begin{pmatrix} 0.09 & 0 \\ 0 & 0.04 \end{pmatrix}
The individual parameters are calculated as:
CL_i = \theta_{CL} \cdot e^{\eta_{CL,i}}
V_i = \theta_V \cdot e^{\eta_{V,i}}
In Pumas, this would be implemented as:
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
θ_V ∈ RealDomain(lower = 0, init = 50.0) # Population volume
Ω ∈ PDiagDomain(2) # Diagonal variance-covariance matrix
end
@random begin
η ~ MvNormal(Ω) # Multivariate random effects
end
@pre begin
CL = θ_CL * exp(η[1]) # Individual clearance
V = θ_V * exp(η[2]) # Individual volume
end
4.1.1 PDiagDomain in Pumas
The PDiagDomain in Pumas is used to specify a positive definite diagonal matrix. This means:
- It only stores the diagonal elements (variances)
- All diagonal elements are constrained to be positive (hence “positive definite”). This is crucial for variance parameters.
- All off-diagonal elements are zero (no correlations)
When you define Ω ∈ PDiagDomain(2), you’re creating a 2×2 diagonal matrix where the two diagonal elements represent the variances of the random effects for your two parameters.
Initializing this matrix can be done in a few ways:
# Option 1: Specify only the dimension (diagonal elements default to 1.0)
Ω ∈ PDiagDomain(2) # Both diagonal elements initialized to 1.0
# Option 2: Initialize with a vector of values
Ω ∈ PDiagDomain(init = [0.09, 0.04]) # First diagonal element is 0.09, second is 0.04
4.2 Scalar Approaches for Individual Omega Elements
While PDiagDomain is convenient for diagonal matrices, there are situations where you might want more control over individual elements of the omega matrix. This is particularly useful when:
- You want to fix specific BSV parameters to known values
- You’re building models incrementally, adding one BSV parameter at a time
- You need to test the significance of individual BSV parameters
- You want clearer parameter interpretation and reporting
Instead of using PDiagDomain, you can define each omega parameter as an individual scalar:
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
θ_V ∈ RealDomain(lower = 0, init = 50.0) # Population volume
Ka ∈ RealDomain(lower = 0, init = 1.0) # Absorption rate constant
ω_CL ∈ RealDomain(lower = 0, init = 0.09) # BSV for clearance
ω_V ∈ RealDomain(lower = 0, init = 0.04) # BSV for volume
σ_prop ∈ RealDomain(lower = 0, init = 0.1) # Proportional residual error
end
@random begin
η_CL ~ Normal(0, sqrt(ω_CL)) # Random effect for clearance
η_V ~ Normal(0, sqrt(ω_V)) # Random effect for volume
end
@pre begin
CL = θ_CL * exp(η_CL) # Individual clearance
V = θ_V * exp(η_V) # Individual volume
end
This approach gives you explicit control over each BSV parameter and makes it easy to fix or constrain individual elements. Each random effect is defined as an individual Normal distribution, making the model structure clearer and more intuitive.
One major advantage of the scalar approach is the ability to fix specific BSV parameters while estimating others. Instead of modifying the model definition, you use the constantcoef argument in the fit function to fix parameters to specific values.
The model definition remains unchanged with all parameters included:
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
θ_V ∈ RealDomain(lower = 0, init = 50.0) # Population volume
Ka ∈ RealDomain(lower = 0, init = 1.0) # Absorption rate constant
ω_CL ∈ RealDomain(lower = 0, init = 0.09) # BSV for clearance
ω_V ∈ RealDomain(lower = 0, init = 0.04) # BSV for volume
σ_prop ∈ RealDomain(lower = 0, init = 0.1) # Proportional residual error
end
@random begin
η_CL ~ Normal(0, sqrt(ω_CL)) # Random effect for clearance
η_V ~ Normal(0, sqrt(ω_V)) # Random effect for volume
end
@pre begin
CL = θ_CL * exp(η_CL) # Individual clearance
V = θ_V * exp(η_V) # Individual volume
end
The recommended approach to fix the parameters is to use constantcoef during fitting:
# Define initial parameters
param_init = (
θ_CL = 5.0,
θ_V = 50.0,
Ka = 1.0,
ω_CL = 0.09, # This will be fixed during fitting
ω_V = 0.04, # This will be estimated
σ_prop = 0.1,
)
# Fit model with ω_CL fixed to 0.09
fitted_result = fit(
model,
population,
param_init,
Pumas.FOCE(),
constantcoef = (ω_CL = 0.09,), # Fix clearance BSV
)
You can fix multiple parameters simultaneously:
# Fix both clearance BSV and absorption rate constant
fitted_result = fit(
model,
population,
param_init,
Pumas.FOCE(),
constantcoef = (ω_CL = 0.09, Ka = 1.0), # Fix both parameters
)
This approach offers several advantages:
- The model definition remains clean and complete
- You can easily switch between fixed and estimated parameters
- Different fitting runs can fix different sets of parameters
- The same model can be used for different estimation scenarios
The scalar approach is particularly useful for incremental model building. You can start with no BSV and add parameters one at a time:
# Step 1: No BSV model
@model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0)
θ_V ∈ RealDomain(lower = 0, init = 50.0)
Ka ∈ RealDomain(lower = 0, init = 1.0)
σ_prop ∈ RealDomain(lower = 0, init = 0.1)
end
@pre begin
CL = θ_CL # No BSV
V = θ_V # No BSV
end
# ... dynamics and derived
end
# Step 2: Add BSV on clearance only
@model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0)
θ_V ∈ RealDomain(lower = 0, init = 50.0)
Ka ∈ RealDomain(lower = 0, init = 1.0)
ω_CL ∈ RealDomain(lower = 0, init = 0.09) # Add BSV for CL
σ_prop ∈ RealDomain(lower = 0, init = 0.1)
end
@random begin
η_CL ~ Normal(0, sqrt(ω_CL)) # Single random effect
end
@pre begin
CL = θ_CL * exp(η_CL) # BSV on clearance
V = θ_V # No BSV on volume
end
# ... dynamics and derived
end
# Step 3: Add BSV on volume
@model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0)
θ_V ∈ RealDomain(lower = 0, init = 50.0)
Ka ∈ RealDomain(lower = 0, init = 1.0)
ω_CL ∈ RealDomain(lower = 0, init = 0.09) # BSV for CL
ω_V ∈ RealDomain(lower = 0, init = 0.04) # BSV for V
σ_prop ∈ RealDomain(lower = 0, init = 0.1)
end
@random begin
η_CL ~ Normal(0, sqrt(ω_CL)) # Random effect for clearance
η_V ~ Normal(0, sqrt(ω_V)) # Random effect for volume
end
@pre begin
CL = θ_CL * exp(η_CL) # BSV on clearance
V = θ_V * exp(η_V) # BSV on volume
end
# ... dynamics and derived
end
This approach allows you to:
- Test the impact of each BSV parameter individually
- Compare nested models using likelihood ratio tests
- Build complexity gradually based on data support
4.2.1 Working Example: Scalar vs Matrix Approaches
Let’s demonstrate both approaches with a complete working example:
# Scalar approach model
pkmodel_scalar = @model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0)
θ_V ∈ RealDomain(lower = 0, init = 50.0)
Ka ∈ RealDomain(lower = 0, init = 1.0)
ω_CL ∈ RealDomain(lower = 0, init = 0.09)
ω_V ∈ RealDomain(lower = 0, init = 0.04)
σ_prop ∈ RealDomain(lower = 0, init = 0.1)
end
@random begin
η_CL ~ Normal(0, sqrt(ω_CL))
η_V ~ Normal(0, sqrt(ω_V))
end
@pre begin
CL = θ_CL * exp(η_CL)
V = θ_V * exp(η_V)
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL / V) * Central
end
@derived begin
cp = @. Central / V
dv ~ @. Normal(cp, sqrt(cp^2 * σ_prop^2))
end
end
# Matrix approach model (equivalent)
pkmodel_matrix = @model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0)
θ_V ∈ RealDomain(lower = 0, init = 50.0)
Ka ∈ RealDomain(lower = 0, init = 1.0)
Ω ∈ PDiagDomain(init = [0.09, 0.04])
σ_prop ∈ RealDomain(lower = 0, init = 0.1)
end
@random begin
η ~ MvNormal(Ω)
end
@pre begin
CL = θ_CL * exp(η[1])
V = θ_V * exp(η[2])
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL / V) * Central
end
@derived begin
cp = @. Central / V
dv ~ @. Normal(cp, sqrt(cp^2 * σ_prop^2))
end
end
PumasModel
Parameters: θ_CL, θ_V, Ka, Ω, σ_prop
Random effects: η
Covariates:
Dynamical system variables: Depot, Central
Dynamical system type: Matrix exponential
Derived: cp, dv
Observed: cp, dv
Both models are mathematically equivalent and will produce identical results:
# Parameters for scalar approach
param_scalar = (θ_CL = 5.0, θ_V = 50.0, Ka = 1.0, ω_CL = 0.09, ω_V = 0.04, σ_prop = 0.1)
# Parameters for matrix approach
param_matrix = (θ_CL = 5.0, θ_V = 50.0, Ka = 1.0, Ω = [0.09 0.0; 0.0 0.04], σ_prop = 0.1)
# Create population
pop = map(1:50) do id
Subject(id = id, events = DosageRegimen(100, time = 0))
end
# Simulate both models
sim_scalar = simobs(pkmodel_scalar, pop, param_scalar, obstimes = 0:2:24)
sim_matrix = simobs(pkmodel_matrix, pop, param_matrix, obstimes = 0:2:24)
# Extract individual parameters to verify they're identical
cl_scalar = [subject.icoefs[:CL][1] for subject in sim_scalar]
cl_matrix = [subject.icoefs[:CL][1] for subject in sim_matrix]
# Check if they're identical (should be true with same random seed)
Random.seed!(123)
sim_scalar = simobs(pkmodel_scalar, pop, param_scalar, obstimes = 0:2:24)
Random.seed!(123)
sim_matrix = simobs(pkmodel_matrix, pop, param_matrix, obstimes = 0:2:24)
cl_scalar = [subject.icoefs[:CL][1] for subject in sim_scalar]
cl_matrix = [subject.icoefs[:CL][1] for subject in sim_matrix]
println("Individual clearances are identical: ", cl_scalar ≈ cl_matrix)Individual clearances are identical: true
4.2.2 Using constantcoef with the Scalar Approach
The scalar approach is particularly useful when fitting models with some parameters fixed. Here’s how to use constantcoef with the scalar model:
# Fit the scalar model with clearance BSV fixed
fitted_scalar = fit(
pkmodel_scalar,
population,
param_scalar,
Pumas.FOCE(),
constantcoef = (ω_CL = 0.09,), # Fix clearance BSV
)
# Fit the same model with both BSV parameters fixed
fitted_scalar_fixed = fit(
pkmodel_scalar,
population,
param_scalar,
Pumas.FOCE(),
constantcoef = (ω_CL = 0.09, ω_V = 0.04), # Fix both BSV parameters
)
# Compare model fits using likelihood ratio test
# (assuming fitted_scalar_full is the model with all parameters estimated)
lrt_result = lrtest(fitted_scalar_full, fitted_scalar)
This flexibility makes the scalar approach ideal for:
- Testing the significance of individual BSV parameters
- Progressive model building workflows
- Sensitivity analyses with different fixed parameter values
4.2.3 When to Use Each Approach
Use Scalar Approach When:
- You need to fix specific BSV parameters to known values
- Building models incrementally (adding BSV one parameter at a time)
- You want explicit control over individual omega elements
- Parameter interpretation and reporting clarity is important
- Testing significance of individual BSV parameters
Use Matrix Approach When:
- You have established BSV structure and want concise code
- All BSV parameters will be estimated (not fixed)
- You prefer the mathematical matrix notation
- Working with standard diagonal structures
Example Decision Tree:
- Are you fixing any BSV parameters? → Use scalar approach
- Are you building models incrementally? → Use scalar approach
- Do you need maximum flexibility? → Use scalar approach
- Are you working with standard structures? → Matrix approach is fine
- Do you prefer concise code? → Matrix approach
4.3 Full Matrix with Covariances
Random effects parameters can also be correlated. We model this using a full variance-covariance matrix with non-zero off-diagonal elements.
# Define population parameters
θ_CL = 5.0 # Population clearance (L/hr)
θ_V = 50.0 # Population volume (L)
# Variances for random effects (diagonal elements)
omega_CL = 0.09 # Variance for clearance
omega_V = 0.04 # Variance for volume
# Correlation between CL and V (between -1 and 1)
correlation = 0.6
# Covariance between CL and V
covariance = correlation * sqrt(omega_CL) * sqrt(omega_V)
# Create a full variance-covariance matrix
omega_matrix = [omega_CL covariance; covariance omega_V]
2×2 Matrix{Float64}:
0.09 0.036
0.036 0.04
# Generate individual random effects for 100 subjects
etas = rand(MvNormal(zeros(2), omega_matrix), 100)
# Calculate individual parameters
individual_CL = θ_CL .* exp.(etas[1, :]) # First row contains etas for CL
individual_V = θ_V .* exp.(etas[2, :]) # Second row contains etas for V
# Create a data table for AoG
pk_data_corr = (CL = individual_CL, V = individual_V)
# Plot using AlgebraOfGraphics
plt =
data(pk_data_corr) *
mapping(:CL => "Clearance (L/hr)", :V => "Volume (L)") *
visual(Scatter, markersize = 8, color = :red, alpha = 0.6)
draw(plt; axis = (title = "Individual PK Parameters (Correlated BSV)",))
For the full matrix case, our mathematical model is:
\begin{pmatrix} \eta_{CL} \\ \eta_V \end{pmatrix} \sim \mathcal{N}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \omega^2_{CL} & \omega_{CL,V} \\ \omega_{CL,V} & \omega^2_V \end{pmatrix}\right)
The variance-covariance matrix \Omega is:
\Omega = \begin{pmatrix} \omega^2_{CL} & \omega_{CL,V} \\ \omega_{CL,V} & \omega^2_V \end{pmatrix} = \begin{pmatrix} 0.09 & 0.036 \\ 0.036 & 0.04 \end{pmatrix}
Where \omega_{CL,V} = \rho \cdot \sigma_{CL} \cdot \sigma_V = 0.6 \cdot \sqrt{0.09} \cdot \sqrt{0.04} = 0.036
In Pumas, this would be implemented using the PSDDomain:
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
θ_V ∈ RealDomain(lower = 0, init = 50.0) # Population volume
Ω ∈ PSDDomain(2) # Full variance-covariance matrix
end
@random begin
η ~ MvNormal(Ω) # Multivariate random effects
end
@pre begin
CL = θ_CL * exp(η[1]) # Individual clearance
V = θ_V * exp(η[2]) # Individual volume
end
4.3.1 PSDDomain in Pumas
The PSDDomain in Pumas is used to specify a positive semi-definite matrix, which includes both the variances (diagonal elements) and covariances (off-diagonal elements). This means:
- It stores both diagonal and off-diagonal elements
- The matrix is constrained to be positive semi-definite (all eigenvalues ≥ 0)
- Off-diagonal elements can be non-zero (allowing for correlations)
A variance-covariance matrix must be positive semi-definite (PSD) to be valid. This mathematical property ensures that the variance of any linear combination of random variables is non-negative, which is required for a sensible probabilistic model.
If a matrix is not PSD, certain linear combinations of parameters could have negative variances, which is mathematically and physically impossible.
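To see why this requirement matters in practice, here is a small illustration (the numbers are chosen for demonstration and are not from the tutorial): a "covariance" matrix whose implied correlation exceeds 1 has a negative eigenvalue and is therefore not a valid variance-covariance matrix:
using LinearAlgebra
# Implied "correlation" = 0.09 / (0.3 * 0.2) = 1.5 > 1, so this cannot be a covariance matrix
bad_matrix = [0.09 0.09; 0.09 0.04]
println("Eigenvalues: ", eigvals(bad_matrix)) # one eigenvalue is negative
println("Positive definite: ", isposdef(bad_matrix)) # false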
When you define Ω ∈ PSDDomain(2), you’re creating a 2×2 matrix where:
- The diagonal elements represent the variances of the random effects
- The off-diagonal elements represent the covariances between random effects
Initializing this matrix can be done as follows:
# Option 1: Initialize with a full matrix
initial_omega = [0.09 0.036; 0.036 0.04] # Includes correlation of 0.6
Ω ∈ PSDDomain(init = initial_omega)
4.4 Constructing Valid Variance-Covariance Matrices
One challenge when working with full variance-covariance matrices is ensuring they are positive definite. Here is a simple approach to construct a valid matrix:
4.4.1 Using Correlation
# Define standard deviations and correlation
sd_CL = 0.3 # sqrt(0.09)
sd_V = 0.2 # sqrt(0.04)
correlation = 0.6
# Calculate covariance
cov_CL_V = correlation * sd_CL * sd_V
# Construct the matrix
omega_matrix = [sd_CL^2 cov_CL_V; cov_CL_V sd_V^2]
println("Variance-covariance matrix:")
display(omega_matrix)
# Verify it's positive definite
eigenvalues = eigvals(omega_matrix)
println("\nEigenvalues: ", eigenvalues)
println("Is positive definite: ", all(eigenvalues .> 0))Variance-covariance matrix:
Eigenvalues: [0.02117078599837776, 0.10882921400162225]
Is positive definite: true
2×2 Matrix{Float64}:
0.09 0.036
0.036 0.04
In Pumas, you can initialize a PSDDomain with such a matrix; the most direct way is to provide the full matrix:
# Correlation approach
sd_CL = 0.3
sd_V = 0.2
correlation = 0.6
cov_CL_V = correlation * sd_CL * sd_V
initial_omega = [sd_CL^2 cov_CL_V; cov_CL_V sd_V^2]
Ω ∈ PSDDomain(init = initial_omega)
5 Pumas Example
Let’s put everything together in a complete Pumas model with BSV on clearance and volume, including correlation between these parameters:
using Pumas
pkmodel = @model begin
@param begin
θ_CL ∈ RealDomain(lower = 0, init = 5.0) # Population clearance
θ_V ∈ RealDomain(lower = 0, init = 50.0) # Population volume
Ka ∈ RealDomain(lower = 0, init = 1.0) # Absorption rate constant (fixed)
Ω ∈ PSDDomain(2) # Full variance-covariance matrix for CL and V
σ_prop ∈ RealDomain(lower = 0, init = 0.1) # Proportional residual error
end
@random begin
η ~ MvNormal(Ω) # Multivariate random effects for CL and V
end
@pre begin
CL = θ_CL * exp(η[1]) # Individual clearance
V = θ_V * exp(η[2]) # Individual volume
end
@dynamics begin
Depot' = -Ka * Depot
Central' = Ka * Depot - (CL / V) * Central
end
@derived begin
cp = @. Central / V # Concentration in plasma
dv ~ @. Normal(cp, sqrt(cp^2 * σ_prop^2)) # Observations with proportional error
end
end
# Define initial parameter values including correlation
param_init = (
θ_CL = 5.0,
θ_V = 50.0,
Ka = 1.0,
# Full variance-covariance matrix with correlation of 0.6
Ω = [0.09 0.036; 0.036 0.04],
σ_prop = 0.1,
)
# Create a population design with BSV
pop = map(1:100) do id
Subject(
id = id,
events = DosageRegimen(100, time = 0), # 100 mg dose at time 0
)
end
# Simulate the model
sim = simobs(pkmodel, pop, param_init, obstimes = 0:0.5:24)
Simulated population (Vector{<:Subject})
Simulated subjects: 100
Simulated variables: cp, dv
After simulating the model, you can visualize the individual concentration-time profiles to see the effect of BSV:
using AlgebraOfGraphics, CairoMakie, DataFrames
# Extract simulated data into a DataFrame
sim_df = DataFrame(sim)
# Plot using AlgebraOfGraphics
plt =
data(sim_df) *
mapping(:time => "Time (h)", :cp => "Concentration (mg/L)", group = :id) *
visual(Lines, alpha = 0.3)
draw(plt; axis = (title = "Individual Concentration Profiles",))
This visualization shows how BSV creates different concentration-time profiles for each individual, even though they all received the same dose.
6 Interpreting BSV
When working with BSV in your models, here are some key points to consider:
6.1 Variances on Log Scale
When we model parameters like CL = θ_CL * exp(η), the random effect η is on a log scale. This means:
- The variance Ω represents variability on a log scale
- For small values (<0.3), this variance is approximately the coefficient of variation (CV%) squared
- CV% ≈ 100\cdot\sqrt{\omega^{2}} for small variances
For a parameter modeled as P_i = \theta_P \cdot e^{\eta_i} where \eta_i \sim \mathcal{N}(0,\omega^2):
The coefficient of variation is defined as: \text{CV} = \frac{\sigma}{\mu}
For log-normally distributed parameter P:
- \text{Mean}(\mu) = \theta_P \cdot e^{\omega^2/2}
- \text{Variance}(\sigma^2) = \theta_P^2 \cdot e^{\omega^2} \cdot (e^{\omega^2} - 1)
- \text{Standard Deviation}(\sigma) = \theta_P \cdot e^{\omega^2/2} \cdot \sqrt{e^{\omega^2} - 1}
Therefore:
\text{CV} = \frac{\sigma}{\mu} = \sqrt{e^{\omega^2} - 1}
For values of \omega close to 0, we can approximate \text{CV} \approx \omega or, equivalently, \text{CV}\% \approx 100 \omega. The argument follows from the power series of f(x) = e^{x^2} - 1 around x = 0. It tells us that \text{CV}^2 = e^{\omega^2} - 1 = \omega^2 + \mathcal{O}(\omega^4), and hence \text{CV} \approx \omega for values of \omega close to 0.
For example, if Ω = 0.09 for clearance:
- CV% ≈ 100 * sqrt(0.09) = 30%
- This means about 68% of subjects will have clearance values within ±30% of the population value
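We can check the small-variance approximation against the exact log-normal CV derived above (a short sketch, not part of the original code):
# Compare exact and approximate CV% for ω² = 0.09
ω² = 0.09
cv_exact = sqrt(exp(ω²) - 1) # exact CV of a log-normally distributed parameter
cv_approx = sqrt(ω²) # small-variance approximation
println("Exact CV%: ", round(100 * cv_exact, digits = 1), "%") # ≈ 30.7%
println("Approximate CV%: ", round(100 * cv_approx, digits = 1), "%") # 30.0%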
The relationship between BSV parameters and the 68% coverage is grounded in probability theory.
First, let’s understand what’s happening mathematically when we model BSV:
- We have a random effect η that follows a normal distribution with mean 0 and variance ω² (0.09 in our example)
- We transform this with an exponential function and multiply by our population parameter θ_CL to get individual clearance values
This creates a log-normal distribution for our clearance values. Let’s examine this transformation and the resulting probabilities:
# Set the population parameter and BSV
θ_CL = 5.0 # Population clearance (L/hr)
ω² = 0.09 # BSV variance
ω = sqrt(ω²) # BSV standard deviation (0.3)
# Create the normal distribution for the random effect (η)
eta_dist = Normal(0, ω)
# Create a log-normal distribution for the clearance
# If X ~ Normal(μ, σ), then exp(X) ~ LogNormal(μ, σ)
# CL = θ_CL * exp(η), so CL ~ LogNormal(log(θ_CL), ω)
cl_dist = LogNormal(log(θ_CL), ω)
# Calculate the probabilities directly from the CDF
p_within_1sd = cdf(eta_dist, ω) - cdf(eta_dist, -ω)
println(
"Probability within ±1 standard deviation (±$ω): $(round(p_within_1sd*100, digits=2))%",
)
# Find clearance values corresponding to ±1 standard deviation
cl_lower = θ_CL * exp(-ω)
cl_upper = θ_CL * exp(ω)
println("Clearance range corresponding to ±1 standard deviation:")
println(
" Lower bound: $(round(cl_lower, digits=2)) L/hr ($(round((cl_lower/θ_CL)*100, digits=1))% of population value)",
)
println(
" Upper bound: $(round(cl_upper, digits=2)) L/hr ($(round((cl_upper/θ_CL)*100, digits=1))% of population value)",
)
Probability within ±1 standard deviation (±0.3): 68.27%
Clearance range corresponding to ±1 standard deviation:
Lower bound: 3.7 L/hr (74.1% of population value)
Upper bound: 6.75 L/hr (135.0% of population value)
When we run this code, we get:
- Probability within ±1 standard deviation (±0.3): 68.27%
- Lower bound: 3.7 L/hr (74.1% of population value)
- Upper bound: 6.75 L/hr (135.0% of population value)
This 68.27% figure comes directly from evaluating the cumulative distribution function (CDF) of the normal distribution. The CDF gives us the probability that a random variable takes a value less than or equal to a specified point.
By calculating cdf(eta_dist, ω) - cdf(eta_dist, -ω), we’re finding the probability that η falls between -ω and +ω, which is precisely the area under the normal curve within one standard deviation of the mean.
The key insight here is that this 68.27% probability transfers perfectly to our clearance values. When we transform the normal distribution of η using the exponential function:
- Values of η = -0.3 transform to CL = 5.0 * exp(-0.3) ≈ 3.7 L/hr
- Values of η = +0.3 transform to CL = 5.0 * exp(0.3) ≈ 6.75 L/hr
Even though the resulting distribution is no longer symmetric around the population value (it’s log-normal), the probability mass between these two clearance values remains exactly 68.27%.
We can verify this with random sampling:
# Demonstrate with random samples
samples_n = 10000
eta_samples = rand(eta_dist, samples_n) # Generate random samples from η distribution
cl_samples = θ_CL .* exp.(eta_samples) # Calculate corresponding clearance values
# Count how many samples fall within ±1 standard deviation
count_within_1sd_eta = count(-ω .≤ eta_samples .≤ ω)
count_within_1sd_cl = count(cl_lower .≤ cl_samples .≤ cl_upper)
println("\nFrom $samples_n random samples:")
println(
" Samples of η within ±$ω: $count_within_1sd_eta ($(round(count_within_1sd_eta/samples_n*100, digits=2))%)",
)
println(
" Samples of CL within [$(round(cl_lower, digits=2)), $(round(cl_upper, digits=2))]: $count_within_1sd_cl ($(round(count_within_1sd_cl/samples_n*100, digits=2))%)",
)
From 10000 random samples:
Samples of η within ±0.3: 6780 (67.8%)
Samples of CL within [3.7, 6.75]: 6780 (67.8%)
This simulation confirms that approximately 68% of both the η values and the corresponding clearance values fall within the ranges we calculated.
The visual representation of these distributions shows why this happens. The normal distribution of η is symmetric around 0, with about 68% of its area between -0.3 and +0.3. When we transform this to clearance values, the resulting log-normal distribution is right-skewed, but still contains exactly 68% of its probability mass between 3.7 and 6.75 L/hr.
This property is fundamental to interpreting BSV parameters in pharmacometrics. When we report a BSV with ω = 0.3 (or variance ω² = 0.09), we’re saying that about 68% of individual parameter values will fall within a range from 74% to 135% of the population parameter value.
This statistical foundation enables modelers to make quantitative predictions about the expected range of parameter values in a population, which directly impacts dosing recommendations and clinical trial design.
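For example (an extension of the code above, not part of the original tutorial), the same LogNormal distribution yields a prediction interval at any coverage level, such as the central 90% of individual clearances:
# 90% prediction interval for individual clearance values
cl_dist = LogNormal(log(5.0), 0.3) # CL ~ LogNormal(log(θ_CL), ω), as above
cl_5th = quantile(cl_dist, 0.05)
cl_95th = quantile(cl_dist, 0.95)
println("90% of subjects have CL between ", round(cl_5th, digits = 2),
    " and ", round(cl_95th, digits = 2), " L/hr") # ≈ 3.05 to 8.19 L/hr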
6.2 Correlations
When you see a positive correlation between two parameters (like clearance and volume):
- Subjects with higher-than-typical clearance will tend to have higher-than-typical volume
- This correlation may have physiological interpretations (e.g., relationship with body size)
- Ignoring correlations can lead to biased parameter estimates and poor predictive performance
To visualize the impact of including or excluding correlations, you can compare simulations from:
- A model with diagonal Ω (no correlations)
- A model with full Ω (including correlations)
Often, the second model will better preserve the joint distribution of PK parameters and produce more realistic predictions.
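A quick way to see the difference (an illustrative sketch using the values from earlier in this tutorial, not part of the original code) is to sample random-effect pairs from both Ω structures and compare their empirical correlations:
using Distributions, Statistics
# Sample (η_CL, η_V) pairs under a diagonal and a full Ω
Ω_diag = [0.09 0.0; 0.0 0.04]
Ω_full = [0.09 0.036; 0.036 0.04]
etas_diag = rand(MvNormal(zeros(2), Ω_diag), 10_000)
etas_full = rand(MvNormal(zeros(2), Ω_full), 10_000)
println("Empirical correlation (diagonal Ω): ", round(cor(etas_diag[1, :], etas_diag[2, :]), digits = 2)) # ≈ 0.0
println("Empirical correlation (full Ω): ", round(cor(etas_full[1, :], etas_full[2, :]), digits = 2)) # ≈ 0.6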
6.3 Visualization
A helpful way to understand BSV is to simulate individual parameter values and visualize them:
# Extract individual parameter values
individual_CLs = [subject.icoefs[:CL][1] for subject in sim]
individual_Vs = [subject.icoefs[:V][1] for subject in sim]
# Create data table for AoG
param_data = (CL = individual_CLs, V = individual_Vs)
# Create scatter plot with AlgebraOfGraphics
plt = data(param_data) * mapping(:CL, :V) * visual(Scatter, markersize = 8, alpha = 0.6)
# Add the typical values as a reference point
typical_point = (CL = [param_init.θ_CL], V = [param_init.θ_V])
reference =
data(typical_point) * mapping(:CL, :V) * visual(Scatter, markersize = 12, color = :red)
draw(
plt + reference;
axis = (;
title = "Correlation between CL and V",
xlabel = "Individual Clearance (L/hr)",
ylabel = "Individual Volume of Distribution (L)",
),
)
7 Common BSV Structures
In practice, different BSV structures are used depending on the available data and model complexity:
7.3 Block Diagonal Structure
For models with many parameters, you might use a mix of correlated and uncorrelated random effects:
@param begin
Ω_PK ∈ PSDDomain(2) # Correlated PK parameters (CL and V)
Ω_PD ∈ PDiagDomain(2) # Uncorrelated PD parameters
end
@random begin
η_PK ~ MvNormal(Ω_PK)
η_PD ~ MvNormal(Ω_PD)
end
@pre begin
CL = θ_CL * exp(η_PK[1])
V = θ_V * exp(η_PK[2])
Emax = θ_Emax * exp(η_PD[1])
EC50 = θ_EC50 * exp(η_PD[2])
end
\Omega = \begin{pmatrix} \omega^2_{CL} & \omega_{CL,V} & 0 & 0 \\ \omega_{CL,V} & \omega^2_V & 0 & 0 \\ 0 & 0 & \omega^2_{Emax} & 0 \\ 0 & 0 & 0 & \omega^2_{EC50} \end{pmatrix}
This structure assumes:
- Correlation between CL and V
- Correlation between Emax and EC50 is zero
- No correlation between PK and PD parameters
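Numerically, this corresponds to assembling a block-diagonal matrix from the two sub-blocks; a small sketch with illustrative values (the Emax and EC50 variances are placeholders, not from the tutorial):
using LinearAlgebra
# PK block: CL and V with correlation 0.6 (as in Section 4.3)
Ω_PK = [0.09 0.036; 0.036 0.04]
# PD block: Emax and EC50, uncorrelated (illustrative variances)
Ω_PD = Diagonal([0.25, 0.16])
# The implied full 4×4 Ω has zero off-diagonal blocks between PK and PD
Ω_total = [Ω_PK zeros(2, 2); zeros(2, 2) Ω_PD]
display(Ω_total)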
8 Conclusion
Between Subject Variability is a fundamental concept in pharmacometrics, allowing us to quantify how drug parameters vary across a population. In this tutorial, we’ve covered:
- The basic concept of samples versus variance in a population
- How to model BSV for a single parameter using a normal distribution
- How to model BSV for multiple parameters using diagonal and full variance-covariance matrices
- How to implement different BSV structures in Pumas using PDiagDomain and PSDDomain
- How to interpret and visualize BSV in pharmacometric models
In a future tutorial, we’ll explore different functional forms for incorporating random effects into your models beyond the standard log-normal approach presented here. We’ll also discuss the use of non-Gaussian random effects for situations where normality assumptions are not appropriate. These advanced techniques can further enhance your models’ flexibility and accuracy in capturing population variability.
9 References
For more information on this topic, consider the following resources:
Bauer, R. J. (2019). NONMEM Tutorial Part II: Estimation Methods and Advanced Examples. CPT: Pharmacometrics & Systems Pharmacology, 8(8), 538-556.
Karlsson, M. O., & Savic, R. M. (2007). Diagnosing model diagnostics. Clinical Pharmacology & Therapeutics, 82(1), 17-20.
Pumas Documentation: https://docs.pumas.ai/
Owen, J. S., & Fiedler-Kelly, J. (2014). Introduction to population pharmacokinetic/pharmacodynamic analysis with nonlinear mixed effects models. John Wiley & Sons.
Lavielle, M. (2014). Mixed effects models for the population approach: models, tasks, methods and tools. CRC press.