Model Diagnostics and Evaluation in Pumas

Author

Jessica Wojciechowski

1 Introduction

The PumasUtilities package provides access to the plotting functions available for Pumas. All of the available plotting functions build upon the CairoMakie.jl plotting ecosystem and therefore, are interoperable with any plots created using CairoMakie.jl. You can freely combine individual plots from Pumas-specific functions with those of CairoMakie.jl, or any other plotting functions that build upon it such as AlgebraOfGraphics.jl.

The objective of this topic is to demonstrate how to extract model results and perform graphical assessments of the model using in-built Pumas functions.

While not the focus of this topic, customized diagnostics plots can still be generated using AlgebraOfGraphics.jl and CairoMakie.jl using Pumas model output. How to obtain Pumas model output appropriate for customized plotting is demonstrated as part of this topic.

In this topic, it is already assumed that a Pumas model has been fitted (mod_fit) to the warfarin concentrations over time where the analysis dataset was named, examp_df_pumas, and the dependent variable was named, pkconc.

The model is a 1-compartment model with linear elimination and first-order absorption, log-normally distributed inter-individual variability on clearance (CL) and volume of distribution of the central compartment (VC), and a proportional residual error model.

# Read in as a Pumas dataset
examp_df_pumas =
    read_pumas(examp_df; observations = [:pkconc], covariates = [:WEIGHT, :AGE, :SEX])
# Define the Pumas model
mod_code = @model begin
    @param begin
        # Definition of fixed effect parameters
        θCL  RealDomain(; lower = 0.0)
        θVC  RealDomain(; lower = 0.0)
        θKA  RealDomain(; lower = 0.0)
        # Random effect parameters
        # Variance-covariance matrix for inter-individual variability
        Ω  PSDDomain(2)
        # Residual unexplained variability
        σpro  RealDomain(; lower = 0.0)
    end
    @random begin
        # Sampling random effect parameters
        η ~ MvNormal(Ω)
    end
    @covariates WEIGHT AGE SEX
    @pre begin
        # Derived variables
        # Covariates
        # None

        # Individual PK parameters
        CL = θCL * ((WEIGHT / 70)^0.75) * exp(η[1])
        VC = θVC * ((WEIGHT / 70)^1.0) * exp(η[2])
        KA = θKA
    end
    @init begin
        # Define initial conditions
        Depot = 0.0
        Central = 0.0
    end
    @vars begin
        # Concentrations in compartments
        centc := Central / VC
    end
    @dynamics begin
        # Differential equations
        Depot' = -KA * Depot
        Central' = KA * Depot - CL * centc
    end
    @derived begin
        # Definition of derived variables
        # Individual-predicted concentration
        ipre := @.(Central / VC)
        # Dependent variable
        """
        Warfarin Concentration (mg/L)
        """
        pkconc ~ @.Normal(ipre, sqrt((ipre * σpro)^2))
    end
end

# Define the initial estimates
init_params = (θCL = 1, θVC = 10, θKA = 1, Ω = [
    0.09 0.01
    0.01 0.09
], σpro = 0.3)
FittedPumasModel

Successful minimization:                      true

Likelihood approximation:                     FOCE
Likelihood Optimizer:                         BFGS
Dynamical system type:          Matrix exponential

Log-likelihood value:                   -421.87615
Number of subjects:                             31
Number of parameters:         Fixed      Optimized
                                  0              7
Observation records:         Active        Missing
    pkconc:                     240             47
    Total:                      240             47

--------------------
         Estimate
--------------------
θCL       0.13335
θVC       8.2672
θKA       0.73594
Ω₁,₁      0.039497
Ω₂,₁      0.0049461
Ω₂,₂      0.00922
σpro      0.21955
--------------------

2 Obtaining Model Predictions

There are several variables that need to be calculated in order to generate standard goodness-of-fit diagnostics. These variables include:

  • Empirical Bayes estimates (ebes): individual random effect parameters
  • Population-predictions (pred): model-based predictions corresponding to observation time-points using parameters derived from the fixed effects (population-typical values and covariate effects)
  • Individual-predictions (ipred): model-based predictions corresponding to observation time-points using parameters derived from the fixed effects (population-typical values and covariate effects) and ebes
  • Residuals (wres, iwres, npdes, eiwres): population weighted residuals (or conditional weighted residuals), individual weighted residuals, normalized prediction distribution errors, and expected simulated-based individual weighted residuals, respectively.

To obtain these variables, the FittedPumasModel is passed to the inspect function and the corresponding output is shown below.

Tip

nsim is a keyword argument specifying the number of simulations to be performed to obtain simulation-based residual diagnostics (npde and eiwres).

# Return model predictions
# Specifying 100 simulations for demonstrate purposes only
mod_pred = inspect(mod_fit; nsim = 100)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Calculating NPDEs and EWRES
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
FittedPumasModelInspection

Fitting was successful: true
Likehood approximation used for weighted residuals : Pumas.FOCE{Optim.NewtonTrustRegion{Float64}, Optim.Options{Float64, Nothing}}(Optim.NewtonTrustRegion{Float64}(1.0, 100.0, 1.4901161193847656e-8, 0.1, 0.25, 0.75, false), Optim.Options(x_abstol = 0.0, x_reltol = 0.0, f_abstol = 0.0, f_reltol = 0.0, g_abstol = 1.0e-5, g_reltol = 1.0e-8, outer_x_abstol = 0.0, outer_x_reltol = 0.0, outer_f_abstol = 0.0, outer_f_reltol = 0.0, outer_g_abstol = 1.0e-8, outer_g_reltol = 1.0e-8, f_calls_limit = 0, g_calls_limit = 0, h_calls_limit = 0, allow_f_increases = false, allow_outer_f_increases = true, successive_f_tol = 1, iterations = 1000, outer_iterations = 1000, store_trace = false, trace_simplex = false, show_trace = false, extended_trace = false, show_warnings = true, show_every = 1, time_limit = NaN, )
)

The FittedPumasModelInspection object returned by the inspect function will be passed to all subsequent goodness-of-fit diagnostic functions in Pumas.

However, as shown above, we cannot directly view the contents of the object nor use it to generate customized goodness-of-fit diagnostic plots with AlgebraOfGraphics.jl. Such that, the object, mod_pred, can be converted to a DataFrame:

mod_pred_df = DataFrame(mod_pred)

The first 10 rows of the DataFrame are printed below. Of note:

  • The DataFrame closely resembles the input analysis dataset for the Pumas model
  • The variables required for goodness-of-fit diagnostics have been added for each dependent variable represented in the model
  • Individual parameters for CL, VC, and KA have been calculated
  • The amounts in each of the model (Depot and Central) compartments are returned

2.1 Interpolation and Extrapolation of Predictions

Predictions at time-points other than those in the original analysis dataset can be generated using the predict function and passing the FittedPumasModel, the Pumas Population object (from read_pumas), and a vector of times to generate predictions.

Note: The intent here is to take the existing dosing regimens and individual parameters in the analysis population and interpolate/extrapolate the individual-predictions beyond those that were observed.

Prediction
  Subjects: 31
  Predictions: pkconc
  Covariates: WEIGHT, AGE, SEX

The Prediction object returned by predict can be passed to a set of goodness-of-fit diagnostic functions in Pumas.

However, as shown above, we cannot directly view the contents of the object nor use it to generate customized goodness-of-fit diagnostic plots with AlgebraOfGraphics.jl. Such that, the object, mod_extrap, can be converted to a DataFrame:

mod_extrap_df = DataFrame(mod_extrap)

The first 10 rows of the DataFrame are printed below. Of note:

  • The DataFrame closely resembles the input analysis dataset for the Pumas model
  • All time-points specified in obstimes in predict have been added to the DataFrame
  • The observed dependent variable for the model, pkconc, is set to missing at all records where interpolation or extrapolation as been performed
  • The individual-predictions (ipred) and population-predictions (pred) have been added for each dependent variable represented in the model as interpolation/extrapolation times
  • Individual parameters for CL, VC, and KA have been calculated
  • The amounts in each of the model (Depot and Central) compartments are returned

3 Convergence Trace

The previous topic of this Module, Interpreting Results from Pumas Fits, discussed how to evaluate if a model has successfully converged and explained the cases why a model optimization may terminate. Pumas allows graphical evaluation of how the loglikelihood and gradient norm changed over the iterations using the convergence_trace function:

# Plot the log-likelihood and gradient norm over each iteration
convergence_trace(mod_fit)

Tip

Additional keyword arguments can be passed to convergence_trace to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?convergence_trace in the Julia REPL to see how to modify the colors and styles for the lines.

4 Goodness-of-Fit Diagnostics

Pumas has a suite of functions that generate standard goodness-of-fit diagnostic plots.

To generate a standard 2 x 2 panel, the goodness_of_fit function can be used to plot the following diagnostics. Note: each subpanel of the plot is constructed from separate functions of which can also be used to generate the subpanels separately:

  1. Observed versus population-predicted (observations_vs_predictions)

  2. Observed versus individual-predicted (observations_vs_ipredictions)

  3. Residuals versus time

    • If nsim was not passed to inspect, then default is population weighted residuals versus time (wresiduals_vs_time)
    • If nsim was passed to inspect, then default is normalized prediction distribution errors versus time (npde_vs_time)
  4. Residuals versus predictions

    • If nsim was not passed to inspect, then default is individual weighted residuals versus individual-predictions (iwresiduals_vs_ipredictions)
    • If nsim was passed to inspect, then default is normalized prediction distribution errors versus population-predictions (npde_vs_predictions)
    • Note: An additional variation of residuals versus predictions is available but not included in the result of goodness_of_fit (population weighted residuals versus population predictions; wresiduals_vs_predictions)

In our example, nsim was passed to inspect and therefore, diagnostics displaying normalized prediction distribution errors are presented by default.

# Generate 2 x 2 panel of goodness-of-fit diagnostics
goodness_of_fit(
    mod_pred, # FittedPumasModelInspection
    observations = [:pkconc],

    # Legend options
    include_legend = true,
    figurelegend = (; position = :b, framevisible = false, orientation = :horizontal),
)

Tip

Additional keyword arguments can be passed to goodness_of_fit to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?goodness_of_fit in the Julia REPL to see how to modify the colors and styles for the points, and LOESS and linear regression fits.

5 Individual- and Population-Predicted Time-Courses

Individual- and population-predicted time-courses can be evaluated against the observed data for each individual in the analysis population using the subject_fits function.

Objects from either inspect or predict can be passed to subject_fits, where the latter provides individual- and population-predictions at interpolated or extrapolated time-points and may be useful in sparse sampling scenarios.

fig_id_conc = subject_fits(
    mod_pred; # FittedPumasModelInspection

    # Separate the individuals into their own panels
    separate = true,

    # Provide unique labels to each individual in the population
    ids = unique(examp_df.id),

    # Observation type to be plotted
    observations = [:pkconc],

    # Legend options
    include_legend = true,
    figurelegend = (; position = :b, framevisible = false, orientation = :horizontal),

    # Labels for the legend options
    labels = (;
        data = "Observed",
        pred = "Population Predicted",
        ipred = "Individual Predicted",
    ),

    # Options for how to handle the individual facet labels
    facet = (combinelabels = true,),

    # Options to print plots over several pages to avoid crowding
    paginate = true,
)

fig_id_conc = subject_fits(
    mod_extrap; # Prediction

    # Separate the individuals into their own panels
    separate = true,

    # Provide unique labels to each individual in the population
    ids = unique(examp_df.id),

    # Observation type to be plotted
    observations = [:pkconc],

    # Legend options
    include_legend = true,
    figurelegend = (; position = :b, framevisible = false, orientation = :horizontal),

    # Labels for the legend options
    labels = (;
        data = "Observed",
        pred = "Population Predicted",
        ipred = "Individual Predicted",
    ),

    # Options for how to handle the individual facet labels
    facet = (combinelabels = true,),

    # Options to print plots over several pages to avoid crowding
    paginate = true,
)
Tip

Additional keyword arguments can be passed to subject_fits to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?subject_fits in the Julia REPL to see how to modify the colors and styles for the points, and individual- and population-predicted lines.

6 Empirical Bayes Estimates Distributions

Pumas plotting functions are available to make assessments of normality of empirical Bayes estimates (EBE) distributions.

6.1 Assessment of Normality

Histograms of EBEs from the FittedPumasModel can be generated using the empirical_bayes_dist function.

# Plot histograms of each of the EBEs from the model output
testplot = empirical_bayes_dist(
    mod_pred,  # FittedPumasModelInspection
)

Tip

Additional keyword arguments can be passed to empirical_bayes_dist to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?empirical_bayes_dist in the Julia REPL to see how to modify the colors and styles for the histogram and the vertical line aesthetics.

6.2 Correlation of EBEs

Pairwise correlation plots assessing the correlation between EBEs is best achieved using the PairPlots.jl package and the pairplot function. This package uses the Makie plotting library similar to the in-built Pumas plotting functions.

A pairwise plot can be generated using pairplot. This example code below is just to demonstrate one methods for creating pairwise plots in Julia and it is highly recommended to use help queries where possible, i.e., ?pairplot, and review the PairsPlot.jl Guide. Note: this requires the DataFrame format of the inspect output.

# Create a DataFrame storing just the EBE information based on the
# FittedPumasModelInspection
ebes = @chain mod_pred_df begin
    # Select only columns that contain η
    select(r"^η")
    # Exclude any missing values
    dropmissing
end

# Construct a figure object
p_etacorr = Figure()
# Construct the pairwise plot
pairplot(
    p_etacorr[1, 1],
    # Specify the input DataFrame
    ebes => (
        # Specify the elements that should be presented on the off-diagonals
        # Scatterplot with correlation and trendline
        PairPlots.Scatter(
            marker = '∘',
            markersize = 24,
            alpha = 0.5,
            color = ColorSchemes.tab10.colors[1],
        ),
        PairPlots.TrendLine(color = :red),
        PairPlots.Correlation(fontsize = 14, color = :blue),

        # Specify elements that should be presented on the diagonals
        # Histogram
        PairPlots.MarginHist(color = ColorSchemes.tab10.colors[1]),
    ),
    fullgrid = false,
)
# Print the plot
p_etacorr

Additional comments:

7 EBEs versus Covariates

Graphical assessments of EBE versus covariate relationships can be obtained using empirical_bayes_vs_covariates. By default:

  • Continuous covariates provide a x-y scatter plot and print the Pearson correlation coefficient
  • Categorical covariates produce a violin plot of the distribution of EBEs for a category
# Plot EBEs versus covariates
empirical_bayes_vs_covariates(
    mod_pred, # FittedPumasModelInspection
)

In the example, the function could determine which covariates were continuous and which were categorical based on the variable types in the Population object when the initial analysis dataset was constructed (the first 10 rows are shown below). Here, SEX was already converted to type Categorical with descriptive labels and was retained in this state through the Pumas model fitting process and generation of the FittedPumasModelInspection object.

10×10 DataFrame
Row id time amt evid cmt pkconc pdpca WEIGHT AGE SEX
Abstract… Float64 Float64? Int64 Int64? Float64? Float64? Float64 Int64 Cat…
1 1 0.0 100.0 1 1 missing missing 66.7 50 Male
2 1 0.0 missing 0 missing missing 100.0 66.7 50 Male
3 1 24.0 missing 0 missing 9.2 49.0 66.7 50 Male
4 1 36.0 missing 0 missing 8.5 32.0 66.7 50 Male
5 1 48.0 missing 0 missing 6.4 26.0 66.7 50 Male
6 1 72.0 missing 0 missing 4.8 22.0 66.7 50 Male
7 1 96.0 missing 0 missing 3.1 28.0 66.7 50 Male
8 1 120.0 missing 0 missing 2.5 33.0 66.7 50 Male
9 2 0.0 100.0 1 1 missing missing 66.7 31 Male
10 2 0.0 missing 0 missing missing 100.0 66.7 31 Male

By default, empirical_bayes_vs_covariates treats all covariates as continuous unless otherwise specified by the categorical keyword argument.

Tip

Additional keyword arguments can be passed to empirical_bayes_vs_covariates to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?empirical_bayes_vs_covariates in the Julia REPL to see how to modify the colors and styles for the scatter, violin plots, and horizontal line aesthetics.

8 Residual Distributions

In-built Pumas functions are available for assessing the normality of residual distributions similar to empirical_bayes_dist. These include wresiduals_dist and npde_dist for population/individual weighted residuas and normalized prediction distribution errors, respectively.

# Plot histograms of weighted residuals from the model output
wresiduals_dist(
    mod_pred, # FittedPumasModelInspection
)

# Plot histograms of normalized prediction distribution errors from the model output
npde_dist(
    mod_pred, # FittedPumasModelInspection
)

Tip

Additional keyword arguments can be passed to wresiduals_dist and npde_dist to assist with formatting and styles. The keyword arguments and options leverage CairoMakie.jl functionality.

Type ?wresiduals_dist and ?npde_dist in the Julia REPL to see how to modify the colors and styles for the histogram and the vertical line aesthetics.

9 Summary

The Pumas and PumasUtilities packages have several functions for generating goodness-of-fit diagnostics using FittedPumasModel output. The plotting functions accommodate modifications of the aesthetics, however, for full customization it is recommended to generate customized plots using AlgebraOfGraphics.jl and the inspect output converted to a DataFrame output.