Model Representation in Pumas

Author

Vijay Ivaturi

1 Domain Specific Language (DSL)

Pumas provides a domain-specific language (DSL) for building NLME (Nonlinear Mixed Effects) models in a modular fashion. Each block within the model, such as @param, @random, @dynamics, and others, isolates different logical parts of the modeling process, making the code highly readable and maintainable. The Pumas DSL is designed to guide users through a clear and streamlined workflow, ensuring that the steps from parameter definition to statistical model specification are both intuitive and visually clean.

The following sections present a PK/PD model for Warfarin that demonstrates how to use Pumas’s modeling blocks. The accompanying dataset, introduced in the previous topic, forms the basis for applying and illustrating these concepts. For a much more detailed overview, please refer to the documentation.

2 Learning Goals

  • Understand the block-wise layout of Pumas models using @model, including parameters, random effects, covariates, dynamic equations, and statistical models.
  • Gain familiarity with how inter-individual variability and residual variability are represented through the @param and @random blocks.
  • Observe how to specify covariate effects, initialize state variables, and define ODEs in a modular way.
  • Explore the process of linking PK and PD using derived equations and proper error models.

3 Model Overview

The following code block shows the Warfarin PK/PD model defined with @model in Pumas:

warfarin_pkpd_model = @model begin

    @metadata begin
        desc = "Warfarin PK/PD model"
        timeu = u"hr"
    end

    @param begin
        # PK parameters
        """
        Clearance (L/h/70kg)
        """
        pop_CL  RealDomain(lower = 0.0, init = 0.134)
        """
        Central Volume L/70kg
        """
        pop_V  RealDomain(lower = 0.0, init = 8.11)
        """
        Absorption time (h)
        """
        pop_tabs  RealDomain(lower = 0.0, init = 0.523)
        """
        Lag time (h)
        """
        pop_lag  RealDomain(lower = 0.0, init = 0.1)
        # PD parameters
        """
        Baseline
        """
        pop_e0  RealDomain(lower = 0.0, init = 100.0)
        """
        Emax
        """
        pop_emax  RealDomain(init = -1.0)
        """
        EC50
        """
        pop_c50  RealDomain(lower = 0.0, init = 1.0)
        """
        Turnover
        """
        pop_tover  RealDomain(lower = 0.0, init = 14.0)
        # Inter-individual variability
        """
          - ΩCL
          - ΩVc
          - ΩTabs
        """
        pk_Ω  PDiagDomain([0.01, 0.01, 0.01])
        """
        Ωlag
        """
        lag_ω  RealDomain(lower = 0.0, init = 0.1)
        """
          - Ωe0
          - Ωemax
          - Ωec50
          - Ωturn
        """
        pd_Ω  PDiagDomain([0.01, 0.01, 0.01, 0.01])
        # Residual variability
        """
        Proportional residual error for drug concentration
        """
        σ_prop  RealDomain(lower = 0.0, init = 0.00752)
        """
        Additive residual error for drug concentration (mg/L)
        """
        σ_add  RealDomain(lower = 0.0, init = 0.0661)
        """
        Additive error for PCA
        """
        σ_fx  RealDomain(lower = 0.0, init = 0.01)
    end

    @random begin
        # mean = 0, covariance = pk_Ω
        pk_η ~ MvNormal(pk_Ω)
        # mean = 0, standard deviation = lag_ω
        lag_η ~ Normal(0.0, lag_ω)
        # mean = 0, covariance = pd_Ω
        pd_η ~ MvNormal(pd_Ω)
    end

    @covariates begin
        """
        Scaling factor for Volume
        """
        FSZV
        """
        Scaling factor for Clearance
        """
        FSZCL
    end

    @pre begin
        # PK
        CL = FSZCL * pop_CL * exp(pk_η[1])
        Vc = FSZV * pop_V * exp(pk_η[2])
        tabs = pop_tabs * exp(pk_η[3])
        Ka = log(2) / tabs
        # PD
        e0 = pop_e0 * exp(pd_η[1])
        emax = pop_emax * exp(pd_η[2])
        c50 = pop_c50 * exp(pd_η[3])
        tover = pop_tover * exp(pd_η[4])
        kout = log(2) / tover
        rin = e0 * kout
    end

    @dosecontrol begin
        lags = (Depot = pop_lag * exp(lag_η),)
    end

    @init begin
        Turnover = e0
    end

    @vars begin
        cp := Central / Vc
        ratein := Ka * Depot
        pd := 1 + emax * cp / (c50 + cp)
    end

    @dynamics begin
        Depot' = -ratein
        Central' = ratein - CL * cp
        Turnover' = rin * pd - kout * Turnover
    end

    @derived begin
        """
        Warfarin Concentration (mg/L)
        """
        conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
        """
        PCA
        """
        pca ~ @. Normal(Turnover, σ_fx)
    end
end

The @model macro in Pumas serves as an entry point for constructing an NLME model with a DSL. When the macro is called, it creates a PumasModel object that encapsulates all of the blocks (@param, @random, @covariates, @pre, @dynamics, @derived, etc.) included within. These blocks are each designed for one stage of the modeling process, so that different parts of the model are cleanly separated.

Below is a simplified illustration of how @model is typically structured:

@model begin

    @metadata begin
        # model metadata
    end

    @param begin
        # population parameters
    end

    @random begin
        # random effects
    end

    @covariates begin
        # covariate declarations
    end

    @pre begin
        # transformations of parameters and covariates
    end

    @dosecontrol begin
        # parameter dependent control of doses
    end

    @dynamics begin
        # differential equations or analytical solutions
    end

    @derived begin
        # statistical model linking predictions to observations
    end
end

This structure provides various advantages including:

  1. Modular Definition: Each block has a specific role, leading to a model definition that is easy to read, maintain, and extend.
  2. Clear DSL Syntax: Blocks describe the flow from parameter definitions (@param) and random effects (@random) through to the final statistical distribution of observations in @derived.
  3. Automatic Code Generation: By structuring a model in this manner, Pumas automatically generates code under the hood that carries out parameter transformations, solves ODEs, and evaluates likelihoods and gradients.
  4. Separation of Model Aspects: The distribution of random effects, fixed effects, ODE definitions, and error models all appear in separate sections, which clarifies the interactions among different components of the model.

When the macro is processed, it creates a PumasModel object that can be used with downstream Pumas functions such as fit, inspect, and simobs, among others. By keeping related functionality grouped in a domain-specific macro, the process of setting up and refining an NLME model becomes intuitive and expressive.

The sections below describe how each part of the model is structured and how the DSL blocks operate.

3.1 The @metadata Block

The @metadata block provides a location to store information about the model, including a description and the time units. At present desc and timeu are the only metadata keys that are used.

@metadata begin
    desc = "Warfarin PK/PD model"
    timeu = u"hr"
end

In addition to the @metadata block, individual parameters, covariates, and any other named components within your models can include a short “docstring” to help readers better understand the purpose of each name. They can be added to your models in the same manner as normal docstrings used to document functions and types in normal Julia code.

"""
Clearance (L/h/70kg)
"""
pop_CL  RealDomain(lower = 0.0, init = 0.134)

3.2 The @param Block: Population Parameters

In this model, the @param block defines the population-level (fixed) effects and variability terms. Several parameters have docstrings, which is an optional feature in Pumas to document parameter meaning:

  • PK parameters: pop_CL, pop_V, pop_tabs, pop_lag
  • PD parameters: pop_e0, pop_emax, pop_c50, pop_tover
  • Variability: pk_Ω, lag_ω, pd_Ω (diagonal covariance structures)
  • Residual error parameters: σ_prop, σ_add, σ_fx

Each parameter’s domain is specified.

  • RealDomain(lower=0.0, init=0.134) indicate that pop_CL lies on Real number line where the lower bound imparts a strictly positive parameter with an initial guess of 0.134.

  • PDiagDomain([0.01, 0.01, 0.01]) indicates that pk_Ω is a diagonal \((3 \times 3)\) covariance matrix with each diagonal element representing a starting variance of 0.01.

    • Each dimension of a 3-dimensional random effects vector pk_η draws from \(\mathcal{N}(0, 0.01)\).
    • During fitting, Pumas can update these values, but they remain constrained to be diagonal and positive.
    • This structure ensures independence across the three components of pk_η and a valid covariance matrix for use in the random-effects distribution.

3.3 The @random Block: Random Effects

Random effects describe individual-level deviations from the population means in the @random block:

pk_η ~ MvNormal(pk_Ω)
lag_η ~ Normal(0.0, lag_ω)
pd_η ~ MvNormal(pd_Ω)
  • pk_η: a 3-dimensional random effect vector for PK parameters, with covariance pk_Ω.
  • lag_η: a scalar random effect for lag time.
  • pd_η: a 4-dimensional random effect vector for PD parameters, with covariance pd_Ω.

Values of η are incorporated within the @pre block to produce subject-specific parameter values.

3.4 The @covariates Block

The @covariates block declares the names of covariates used later in the model. Here:

@covariates begin
    """
    Scaling factor for Volume
    """
    FSZV
    """
    Scaling factor for Clearance
    """
    FSZCL
end

Covariates FSZV and FSZCL appear in subsequent expressions scaling pop_CL and pop_V.

3.5 The @pre Block: Preprocessing

The @pre block uses the parameters and random effects to produce the actual subject-specific parameter values used in the ODEs and statistical model:

  1. PK transformations:

    CL = FSZCL * pop_CL * exp(pk_η[1])
    Vc = FSZV * pop_V * exp(pk_η[2])
    tabs = pop_tabs * exp(pk_η[3])
    Ka = log(2) / tabs
  2. PD transformations:

    e0 = pop_e0 * exp(pd_η[1])
    emax = pop_emax * exp(pd_η[2])
    c50 = pop_c50 * exp(pd_η[3])
    tover = pop_tover * exp(pd_η[4])
    kout = log(2) / tover
    rin = e0 * kout
    time = t

These computations ensure each individual’s parameters differ based on random effects and covariates.

3.6 The @dosecontrol Block

The @dosecontrol block manages dose-related controls, such as lag times or bioavailability:

@dosecontrol begin
    lags = (Depot = pop_lag * exp(lag_η),)
end

Any administered dose that enters the Depot compartment is delayed by the factor pop_lag * exp(lag_η).

3.7 The @init Block: Initial Conditions

Initial conditions for ODE states are set in the @init block:

@init begin
    Turnover = e0
end

The variable Turnover starts at e0, while other dynamic variables (e.g., Depot and Central) default to 0 if not explicitly specified.

3.8 The @vars Block: Creating Aliases

The @vars block defines shorthand notations for convenience. It keeps the model code more readable by referencing short variables:

@vars begin
    cp := Central / Vc
    ratein := Ka * Depot
    pd := 1 + emax * cp / (c50 + cp)
end

The expression cp is the plasma concentration, ratein is the absorption rate, and pd is a helper factor in the PD equation.

3.9 Using the Walrus Operator in the vars Block

In the vars block, you might encounter the walrus operator (:=). This operator is used for assignment expressions, allowing you to assign a value to a variable as part of a larger expression.

Tip: Understanding the Walrus Operator (:=)

The walrus operator (:=) is a convenient way to assign values to variables within an expression. In the context of Pumas models, it allows for defining algebraic expressions that are used internally within the model but are not included in the output during simulation and fitting. This helps keep the output focused on the primary variables of interest.

Example:

@vars begin
    cp := Central / Vc
    ratein := Ka * Depot
    pd := 1 + emax * cp / (c50 + cp)
end

In this example, cp, ratein, and pd are calculated for use within the model but are not directly output in the simulation results.

3.10 The @dynamics Block: ODE System

The ODE system specified in the @dynamics block governs the time evolution of the states:

@dynamics begin
    Depot' = -ratein
    Central' = ratein - CL * cp
    Turnover' = rin * pd - kout * Turnover
end
  • Depot decreases by ratein (the absorption rate).
  • Central increases by the same amount and decreases by CL * cp.
  • Turnover is driven by rin * pd and eliminated by kout * Turnover.

The solver processes these ODEs for each subject using the subject-specific parameters set in @pre.

3.11 The @derived Block: Statistical Modeling

The @derived block connects the model predictions to observed data, specifying distributions:

@derived begin
    """
    Warfarin Concentration (mg/L)
    """
    conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))
    """
    PCA
    """
    pca ~ @. Normal(Turnover, σ_fx)
end
  1. conc: Uses a combined error model for Warfarin concentration, with a mean of cp and a standard deviation composed of proportional (σ_prop * cp) plus additive (σ_add) terms.
  2. pca: Modeled as Normal with mean Turnover and standard deviation σ_fx.

Both are vectorized over the observation times with the @. macro.

Tip

Understanding the Tilde (~) Operator in Statistical Models

The tilde (~) operator in Pumas is used to specify the statistical distribution of observed data. It can be read as “is distributed as” or “follows the distribution of”. For example:

pca ~ @. Normal(Turnover, σ_fx)

This expression means:

  • The observed pca values follow a Normal distribution
  • The mean (μ) of the distribution is the predicted Turnover value
  • The standard deviation (σ) is σ_fx
  • The @., also called the dot macro, performs broadcasting and applies this distribution to all observations vectorially

In statistical notation, this would be written as: PCA ~ N(Turnover, σ_fx²)

Whenever ~ appears in model code in Pumas it means that the left-hand side (say, pk_η) is a random variable with the distribution found on the right-hand side. For example, the following code

pk_η ~ MvNormal(pk_Ω)

reads: the random variable pk_η follows a multivariate normal distribution with variance-covariance matrix pk_Ω

3.12 Summary of the Warfarin PK/PD Model Structure

  • @param: Defines all population-level parameters and their domains.
  • @random: Describes the distribution of random effects for individuals.
  • @covariates: Lists covariate names for later blocks.
  • @pre: Calculates subject-specific PK/PD parameter values based on covariates and random effects.
  • @dosecontrol: Specifies special dose handling (e.g., lag time).
  • @init: Sets initial ODE state conditions.
  • @vars: Provides shorthand alias definitions to maintain code readability.
  • @dynamics: Declares the ODE system for PK and PD states.
  • @derived: Maps predictions (e.g., concentration, effect) to observations through a statistical model.

This example demonstrates the clarity offered by Pumas’s DSL for pharmacometric models. Each block handles a distinct facet of NLME modeling, providing a direct mapping from biological concepts (like clearance, volume, and turnover) to the mathematical model and, finally, to the data likelihood specification.

3.13 Comparison of Pumas and NONMEM Blocks

Pumas provides a modular, block-based DSL for defining NLME models, offering clarity and an organized workflow. NONMEM traditionally requires specifying various modeling components in more monolithic control streams. The Warfarin PK/PD example below highlights how Pumas’s block approach can make it more intuitive to reason about each phase of the model-building process. The table shows how corresponding sections map between Pumas and NONMEM for the Warfarin model.

Model Aspect Pumas DSL NONMEM Control Stream
Parameter Definitions @param block
Population parameters (e.g., pop_CL, pop_V) declared with RealDomain or PDiagDomain, and initial values or bounds.
$THETA statements for population fixed effects, $OMEGA or $SIGMA for variance-covariance structures. Bounds or initial estimates typically given inline.
Random Effects @random block
Random effects (e.g., pk_η, pd_η) declared with distributions like MvNormal(pk_Ω). Can include non-Gaussian distributions also
$OMEGA block lines for inter-individual variability. Typically called ETA and only Gaussian Normal distributions are allowed.
Covariates @covariates block
Covariate names like FSZV or FSZCL declared for easy referencing.
Covariates handled as data variables in the control stream’s $DATA section, then references appear directly in $PK or $DES code. No dedicated covariate block.
Pre-processing/Transform @pre block
Parameter-covariate transformations (e.g., CL = FSZCL * pop_CL * exp(pk_η[1])).
Often placed in $PK, with arithmetic transformations of THETA, OMEGA, and data variables. May become lengthy as it merges with other setup code.
Dose Control @dosecontrol block
Settings for lags or infusion rates (e.g., lags = (Depot = pop_lag * exp(lag_η), )).
Lag times typically assigned in $PK or $INPUT, with advanced infusion controls in $INPUT and specialized flags (SS, RATE, etc.) embedded in data or code.
Initial Conditions @init block
Initial states assigned for ODE compartments (e.g., Turnover = e0).
$DES block or $MODEL block with code for initial compartment amounts or states using the A_0(compartment number) syntax.
Aliases @vars block
Convenient definitions such as cp := Central / Vc.
Usually mixed into $PK or $DES logic with manual declarations or comments.
Dynamics (ODEs/Analytical) @dynamics block
ODE expressions (Depot' = -Ka * Depot) or built-in analytical solutions.
$DES block for user-written ODEs. Built-in solution compartments or ADVAN (e.g., ADVAN2, ADVAN4) are used for standard PK structures in $Subroutines.
Statistical Model @derived block
Observed variables with error models (conc ~ @. Normal(cp, sqrt((σ_prop * cp)^2 + σ_add^2))).
$ERROR block to assign residual error structures (Y = F + ERR(...)). Proportional/additive/combined error often manually coded.

The NONMEM control stream uses separate sections ($PROBLEM, $INPUT, $DATA, $THETA, $OMEGA, $PK, $DES, $ERROR, etc.) but is less explicitly modular, typically mixing transformations, random effects, and distribution assumptions. By contrast, Pumas separates these concerns into distinct blocks, improving legibility and maintainability.