Monte Carlo Expectation Maximization (MCEM)

Author

Vijay Ivaturi

Experimental Feature

MCEM is an experimental estimation method introduced in Pumas 2.8. Its API may change in future releases without deprecation.

In this tutorial, we introduce Monte Carlo Expectation Maximization (MCEM), a new estimation method available in Pumas.Experimental. MCEM is a classical approach to maximum likelihood estimation in mixed-effects models that uses Monte Carlo sampling to approximate the E-step of the EM algorithm.

We will cover what MCEM is and when to use it, demonstrate the required model parameterization, fit a warfarin PK model, show the post-estimation workaround for inference and diagnostics, explore non-Gaussian random effect distributions, and provide a comprehensive reference of all MCEM options.

using Pumas
using Pumas.Experimental: MCEM
using PumasUtilities
using PharmaDatasets
using DataFramesMeta
using CairoMakie
using AlgebraOfGraphics

1 What is MCEM?

The Expectation-Maximization (EM) algorithm (Dempster, Laird, and Rubin 1977) is a classical iterative method for finding maximum likelihood estimates when the model involves latent (unobserved) variables, which in pharmacometrics are the random effects.

Each iteration of the EM algorithm has two steps:

  1. E-step (Expectation): define the expected value of the log-likelihood function with respect to the current estimate of the random effects distribution.
  2. M-step (Maximization): update the population parameters by maximizing this expected log-likelihood.

In Monte Carlo EM (MCEM) (Wei and Tanner 1990), the expected value of the log-likelihood function, defined in the E-step, is approximated using Monte Carlo sampling. For each subject, MCEM draws random effect samples from the prior distribution, weights them by their likelihood given the observed data, and computes sufficient statistics. The M-step then optimizes the population parameters based on these weighted sufficient statistics.

Key Advantage: Non-Gaussian Random Effects

The main advantage of MCEM over FOCE is its ability to handle non-Gaussian random effect distributions. While FOCE assumes that random effects follow a (multivariate) normal distribution, MCEM can work with any random effect distribution — including LogNormal, LogitNormal, Gamma, and InverseGamma.

1.1 How MCEM Differs from FOCE and VEM

Feature FOCE / LaplaceI VEM MCEM
Model macro @model @model @model
Error model Gaussian-like (any) Any Additive Gaussian (constant variance)
Random effects Normal only Normal only Any distribution
E-step Laplace approximation Variational inference Monte Carlo sampling
M-step Gradient-based Gradient-based Optimizer-based
infer() support Yes No No (see below)
inspect() support Yes Yes No (see below)

2 Model Parameterization Requirements

MCEM has specific requirements for how the model is parameterized. Understanding these is essential before fitting.

MCEM Model Constraints
  1. Additive Gaussian error only: the @derived block must use Normal(μ, σ) where σ is a constant (does not depend on predicted values or covariates). ProportionalNormal, CombinedNormal, and other non-constant-variance error models are not supported.

  2. Population parameters must enter through @random: the @pre block must not depend on population parameters directly. Instead, population parameters should appear in the @random block as parameters of the random effect distributions.

  3. Error model independence: the @derived block must not depend explicitly on covariates or random effects — these dependencies must flow through the ODE dynamics.

This means the typical FOCE-style parameterization:

# Standard FOCE parameterization — NOT compatible with MCEM
@random η ~ MvNormal(Ω)
@pre begin
    CL = θCL * exp(η[1])  # @pre depends on population parameter θCL
    Vc = θVc * exp(η[2])  # @pre depends on population parameter θVc
end

must be reformulated so that population parameters enter through @random:

# MCEM-compatible parameterization
@random begin
    ηCL ~ Normal(log(θCL), ωCL)  # population params in @random
    ηVc ~ Normal(log(θVc), ωVc)
end
@pre begin
    CL = exp(ηCL)  # @pre depends ONLY on random effects
    Vc = exp(ηVc)
end
# MCEM-compatible parameterization
@random begin
    CL ~ LogNormal(log(θCL), ωCL)  # population params in @random
    Vc ~ LogNormal(log(θVc), ωVc)
end

In the MCEM parameterization, each random effect has its own distribution with the population typical value encoded in the distribution parameters (e.g., the mean of the Normal). This may look different from the standard FOCE convention, but it is mathematically equivalent when using Normal random effects.

3 Warfarin PK Model with MCEM

Let us now fit a warfarin PK model using MCEM. We will first show the MCEM-compatible model, then fit it and compare with FOCE.

3.1 Data Loading

warfarin_data = @chain dataset("pumas/warfarin_pumas") begin
    @rtransform begin
        :FSZV = :wtbl / 70
        :FSZCL = (:wtbl / 70)^0.75
    end
end
first(warfarin_data, 5)
5×12 DataFrame
Row id time evid amt cmt conc pca wtbl age sex FSZV FSZCL
Int64 Float64 Int64 Float64? Int64? Float64? Float64? Float64 Int64 String1 Float64 Float64
1 1 0.0 1 100.0 1 missing missing 66.7 50 M 0.952857 0.96443
2 1 0.5 0 missing missing 0.0 missing 66.7 50 M 0.952857 0.96443
3 1 1.0 0 missing missing 1.9 missing 66.7 50 M 0.952857 0.96443
4 1 2.0 0 missing missing 3.3 missing 66.7 50 M 0.952857 0.96443
5 1 3.0 0 missing missing 6.6 missing 66.7 50 M 0.952857 0.96443
pop = read_pumas(
    warfarin_data;
    id = :id,
    time = :time,
    amt = :amt,
    cmt = :cmt,
    evid = :evid,
    covariates = [:sex, :wtbl, :FSZV, :FSZCL],
    observations = [:conc],
)
Population
  Subjects: 32
  Covariates: sex, wtbl, FSZV, FSZCL
  Observations: conc

3.2 MCEM-Compatible Model Definition

The key difference from the standard FOCE model is how population parameters and random effects are structured. In the MCEM-compatible parameterization:

  • Population typical values (θCL, θVc) appear as parameters of the random effect distributions.
  • The @pre block only depends on random effects and covariates, not on population parameters.
  • The error model uses additive Normal with a constant variance parameter σ.
warfarin_mcem_model = @model begin

    @metadata begin
        desc = "Warfarin 1-compartment PK model (MCEM-compatible)"
        timeu = u"hr"
    end

    @param begin
        θCL  RealDomain(lower = 0.0, init = 0.134)
        θVc  RealDomain(lower = 0.0, init = 8.11)
        ωCL  RealDomain(lower = 0.0, init = 0.1)
        ωVc  RealDomain(lower = 0.0, init = 0.1)
        σ  RealDomain(lower = 0.0, init = 0.5)
    end

    @random begin
        ηCL ~ Normal(log(θCL), ωCL)
        ηVc ~ Normal(log(θVc), ωVc)
    end

    @covariates FSZCL FSZV

    @pre begin
        CL = FSZCL * exp(ηCL)
        Vc = FSZV * exp(ηVc)
        Ka = log(2) / 0.5
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        conc ~ @. Normal(cp, σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: ηCL, ηVc
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: conc
  Observed: conc
Parameterization Differences

Compared to the standard FOCE warfarin model:

  • θCL and θVc are not used in @pre. Instead, they appear as the mean of the log-normal random effects: ηCL ~ Normal(log(θCL), ωCL).
  • ωCL and ωVc are scalar standard deviations rather than a PDiagDomain covariance matrix.
  • σ is an additive error (not proportional), as required by MCEM.
  • The @pre block depends only on random effects (ηCL, ηVc) and covariates (FSZCL, FSZV).

3.3 Fitting with MCEM

MCEM requires two mandatory keyword arguments:

  • iterations: number of EM iterations (each iteration involves sampling + optimization).
  • samples: initial number of Monte Carlo samples per subject per iteration.

The number of samples automatically increases across iterations to ensure convergence (following a polynomial schedule).

fpm_mcem = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100),
)
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  0              5

Estimation method:             MCEM (EXPERIMENTAL)

---------------
      Estimate
---------------
θCL   0.12493
θVc   8.7244
ωCL   0.16658
ωVc   0.20586
σ     1.7012
---------------

3.4 Parameter Estimates

coef(fpm_mcem)
(θCL = 0.12493050855171235, θVc = 8.724438564047635, ωCL = 0.16658038987383103, ωVc = 0.20585658743700935, σ = 1.701182897012707)
coeftable(fpm_mcem)
5×3 DataFrame
Row parameter constant estimate
String Bool Float64
1 θCL false 0.124931
2 θVc false 8.72444
3 ωCL false 0.16658
4 ωVc false 0.205857
5 σ false 1.70118

3.5 Log-Likelihood

Like VEM, MCEM does not compute the log-likelihood during fitting. You must specify an approximation method:

loglikelihood(fpm_mcem, FOCE())
-518.6952146408544
loglikelihood(fpm_mcem, LaplaceI())
-518.8700972370697

3.6 VPC

Visual predictive checks are available directly:

vpc_mcem = vpc(fpm_mcem);
vpc_plot(vpc_mcem)

4 Post-Estimation Workaround

MCEM does not currently support infer, inspect, predict, or empirical_bayes directly. Calling any of these on an MCEM-fitted model will raise an error:

infer(fpm_mcem)
ArgumentError: asymptotic inference is not supported for Pumas models fitted with MCEM (EXPERIMENTAL).
Stacktrace:
 [1] infer(fpm::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing}; level::Float64, rethrow_error::Bool, ensemblealg::EnsembleThreads, sandwich_estimator::Bool)
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:755
 [2] infer(fpm::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing})
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:748
 [3] top-level scope
   @ ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/estimation/05-mcem.qmd:341
inspect(fpm_mcem)
ArgumentError: `inspect` is not supported for Pumas models fitted with MCEM (EXPERIMENTAL).
Stacktrace:
 [1] inspect(::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing}; wres_approx::Nothing, nsim::Int64, rng::Random.TaskLocalRNG)
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:741
 [2] inspect(::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing})
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:735
 [3] top-level scope
   @ ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/estimation/05-mcem.qmd:346
predict(fpm_mcem)
ArgumentError: predictions without explicitly provided values of random effects are not supported for Pumas models fitted with MCEM (EXPERIMENTAL).
Stacktrace:
 [1] predict(::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing}; nsim::Nothing, obstimes::Nothing, ensemblealg::EnsembleThreads)
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:728
 [2] predict(::Pumas.FittedPumasModel{PumasModel{(θCL = 1, θVc = 1, ωCL = 1, ωVc = 1, σ = 1), 2, (:Depot, :Central), (:Ka, :CL, :Vc), ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, ParamSet{@NamedTuple{ηCL::Pumas.RandomEffect{false, var"#5#13"}, ηVc::Pumas.RandomEffect{false, var"#6#14"}}}, Pumas.PreObj{Pumas.TimeDispatcher{var"#7#15", var"#8#16"}}, Pumas.DCPObj{Returns{Returns{@NamedTuple{}}}}, var"#10#18", Depots1Central1, Pumas.DerivedObj{(:conc,), false, var"#11#19"}, var"#12#20", Nothing, Pumas.PumasModelOptions}, Vector{Subject{@NamedTuple{conc::Vector{Union{Missing, Float64}}}, Pumas.ConstantCovar{@NamedTuple{sex::InlineStrings.String1, wtbl::Float64, FSZV::Float64, FSZCL::Float64}}, DosageRegimen{Float64, Union{Missing, Int64}, Float64, Float64, Float64, Float64}, Vector{Float64}}}, Pumas.Experimental.MCEMResult{Vector{Float64}}, MCEM{Optim.NelderMead{Optim.AffineSimplexer, Optim.AdaptiveParameters}, Optim.Options{Float64, Nothing}}, Nothing, @NamedTuple{constantcoef::Tuple{}, ensemblealg::EnsembleThreads, diffeq_options::@NamedTuple{ss_abstol::Float64, ss_reltol::Float64, ss_maxiters::Int64, abstol::Float64, reltol::Float64, alg::CompositeAlgorithm{0, Tuple{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}}, OrdinaryDiffEqCore.AutoSwitch{Vern7{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}, Rodas5P{1, ADTypes.AutoForwardDiff{1, Nothing}, LinearSolve.GenericLUFactorization{LinearAlgebra.RowMaximum}, typeof(OrdinaryDiffEqCore.DEFAULT_PRECS), Val{:forward}(), true, nothing, typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!)}, Rational{Int64}, Int64}}, continuity::Symbol}}, ParamSet{@NamedTuple{θCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, θVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωCL::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, ωVc::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}, σ::RealDomain{Float64, TransformVariables.Infinity{true}, Float64}}}, Nothing})
   @ Pumas.Experimental ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/custom_julia_depot/packages/Pumas/lK187/src/estimation/em.jl:722
 [3] top-level scope
   @ ~/run/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/estimation/05-mcem.qmd:351

The recommended workaround is to perform a zero-iteration FOCE fit using the MCEM estimates as starting values. This creates a FOCE-fitted model object that supports all post-estimation functions without changing the parameter estimates:

fpm_foce_refit = fit(
    warfarin_mcem_model,
    pop,
    coef(fpm_mcem),
    FOCE();
    optim_options = (; iterations = 0),
)
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     5.186952e+02     4.898914e+00
 * time: 0.043872833251953125
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  0              5

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                     Iterations
Log-likelihood value:                   -518.69521

---------------
      Estimate
---------------
θCL   0.12493
θVc   8.7244
ωCL   0.16658
ωVc   0.20586
σ     1.7012
---------------

Now we can use all standard post-estimation functions:

infer(fpm_foce_refit)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:                 Closed form

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  0              5

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                     Iterations
Log-likelihood value:                   -518.69521

--------------------------------------------------
      Estimate   SE          95.0% C.I.
--------------------------------------------------
θCL   0.12493    0.0058539   [ 0.11346 ; 0.1364 ]
θVc   8.7244     0.43558     [ 7.8707  ; 9.5782 ]
ωCL   0.16658    0.075856    [ 0.017905; 0.31526]
ωVc   0.20586    0.032098    [ 0.14295 ; 0.26877]
σ     1.7012     0.29127     [ 1.1303  ; 2.2721 ]
--------------------------------------------------
res = inspect(fpm_foce_refit)
[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.
FittedPumasModelInspection

Likelihood approximation used for weighted residuals: FOCE
goodness_of_fit(res)

preds = predict(fpm_foce_refit)
first(preds, 3)
Prediction
  Subjects: 3
  Predictions: conc
  Covariates: sex, wtbl, FSZV, FSZCL
Why Zero Iterations?

Setting iterations = 0 tells FOCE to skip optimization entirely. It simply evaluates the model at the provided parameter values and stores the information needed for inference, diagnostics, and predictions. The resulting parameter estimates are identical to the MCEM estimates:

coef(fpm_foce_refit) == coef(fpm_mcem)
true

5 Comparing with FOCE

Let us fit the same model directly with FOCE to compare estimates:

fpm_foce = fit(warfarin_mcem_model, pop, init_params(warfarin_mcem_model), FOCE())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     1.375928e+03     2.085930e+03
 * time: 2.09808349609375e-5
     1     5.460646e+02     1.436354e+02
 * time: 1.095207929611206
     2     5.363797e+02     9.756184e+01
 * time: 1.1044838428497314
     3     5.262173e+02     2.147128e+01
 * time: 1.1138298511505127
     4     5.255327e+02     1.946992e+01
 * time: 1.1225979328155518
     5     5.252147e+02     1.829437e+01
 * time: 1.131155014038086
     6     5.244973e+02     2.021641e+01
 * time: 1.1398968696594238
     7     5.230836e+02     3.046615e+01
 * time: 1.1487410068511963
     8     5.211006e+02     3.216206e+01
 * time: 1.1579759120941162
     9     5.198174e+02     1.940353e+01
 * time: 1.1672658920288086
    10     5.193732e+02     6.432033e+00
 * time: 1.1762480735778809
    11     5.192878e+02     4.336579e+00
 * time: 1.1849360466003418
    12     5.192722e+02     2.802138e+00
 * time: 1.193497896194458
    13     5.192578e+02     2.400581e+00
 * time: 1.2022910118103027
    14     5.192474e+02     2.358824e+00
 * time: 1.2109088897705078
    15     5.192388e+02     2.311301e+00
 * time: 1.219438076019287
    16     5.192264e+02     3.035585e+00
 * time: 1.2280759811401367
    17     5.191991e+02     3.826258e+00
 * time: 1.2368099689483643
    18     5.191391e+02     4.624818e+00
 * time: 1.2455229759216309
    19     5.190246e+02     8.208000e+00
 * time: 1.2542600631713867
    20     5.188882e+02     1.021614e+01
 * time: 1.2629640102386475
    21     5.187641e+02     8.754107e+00
 * time: 1.2717809677124023
    22     5.186865e+02     4.464789e+00
 * time: 1.280709981918335
    23     5.186729e+02     2.608621e+00
 * time: 1.2889659404754639
    24     5.186630e+02     2.440275e-01
 * time: 1.2972979545593262
    25     5.186628e+02     4.088019e-02
 * time: 1.7127280235290527
    26     5.186628e+02     4.549897e-03
 * time: 1.720552921295166
    27     5.186628e+02     3.948320e-04
 * time: 1.7277019023895264
FittedPumasModel

Dynamical system type:                 Closed form

Number of subjects:                             32

Observation records:         Active        Missing
    conc:                       251             47
    Total:                      251             47

Number of parameters:      Constant      Optimized
                                  0              5

Likelihood approximation:                     FOCE
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                   -518.66279

---------------
      Estimate
---------------
θCL   0.1262
θVc   8.712
ωCL   0.17223
ωVc   0.20342
σ     1.6989
---------------
compare_estimates(; MCEM = fpm_mcem, FOCE = fpm_foce)
5×3 DataFrame
Row parameter MCEM FOCE
String Float64? Float64?
1 θCL 0.124931 0.126204
2 θVc 8.72444 8.71199
3 ωCL 0.16658 0.172233
4 ωVc 0.205857 0.203417
5 σ 1.70118 1.69893
DataFrame(
    Method = ["MCEM (via FOCE approx)", "MCEM (via LaplaceI approx)", "FOCE direct"],
    LogLikelihood = [
        loglikelihood(fpm_mcem, FOCE()),
        loglikelihood(fpm_mcem, LaplaceI()),
        loglikelihood(fpm_foce),
    ],
)
3×2 DataFrame
Row Method LogLikelihood
String Float64
1 MCEM (via FOCE approx) -518.695
2 MCEM (via LaplaceI approx) -518.87
3 FOCE direct -518.663

6 Non-Gaussian Random Effects

The key advantage of MCEM is its support for non-Gaussian random effect distributions. While FOCE assumes normally distributed random effects, MCEM can handle any distribution. This section demonstrates three non-Gaussian alternatives.

6.1 LogNormal Random Effects

LogNormal random effects are naturally constrained to positive values, which is a common requirement for PK parameters like clearance and volume:

warfarin_lognormal = @model begin
    @param begin
        θCL  RealDomain(lower = 0.0, init = 0.134)
        θVc  RealDomain(lower = 0.0, init = 8.11)
        ωCL  RealDomain(lower = 0.0, init = 0.1)
        ωVc  RealDomain(lower = 0.0, init = 0.1)
        σ  RealDomain(lower = 0.0, init = 0.5)
    end

    @random begin
        ηCL ~ LogNormal(log(θCL), ωCL)
        ηVc ~ LogNormal(log(θVc), ωVc)
    end

    @covariates FSZCL FSZV

    @pre begin
        CL = FSZCL * ηCL
        Vc = FSZV * ηVc
        Ka = log(2) / 0.5
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        conc ~ @. Normal(cp, σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: ηCL, ηVc
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: conc
  Observed: conc
LogNormal Parameterization

With LogNormal random effects:

  • ηCL ~ LogNormal(log(θCL), ωCL) means ηCL is directly on the natural scale (always positive).
  • In @pre, we use CL = FSZCL * ηCL — no exp() transformation is needed because the LogNormal distribution already produces positive values.
  • This is mathematically equivalent to the standard exp(Normal(log(θ), ω)) parameterization that FOCE uses, but it makes the positivity constraint explicit in the random effect distribution.
fpm_lognormal = fit(
    warfarin_lognormal,
    pop,
    init_params(warfarin_lognormal),
    MCEM(; iterations = 50, samples = 100),
)
coef(fpm_lognormal)
(θCL = 0.12547718806566385, θVc = 8.712084633848361, ωCL = 0.17502950886601729, ωVc = 0.20653057031658326, σ = 1.6955799263772982)

6.2 Gamma Random Effects

Gamma random effects provide an alternative positive-valued distribution with different tail behavior than LogNormal. The Gamma distribution is parameterized by shape and scale:

warfarin_gamma = @model begin
    @param begin
        θCL  RealDomain(lower = 0.0, init = 0.134)
        θVc  RealDomain(lower = 0.0, init = 8.11)
        ωCL  RealDomain(lower = 0.0, init = 0.2)
        ωVc  RealDomain(lower = 0.0, init = 0.2)
        σ  RealDomain(lower = 0.0, init = 0.5)
    end

    @random begin
        ηCL ~ Gamma(1 / ωCL^2, θCL * ωCL^2)
        ηVc ~ Gamma(1 / ωVc^2, θVc * ωVc^2)
    end

    @covariates FSZCL FSZV

    @pre begin
        CL = FSZCL * ηCL
        Vc = FSZV * ηVc
        Ka = log(2) / 0.5
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        conc ~ @. Normal(cp, σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: ηCL, ηVc
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: conc
  Observed: conc
Gamma Parameterization

The Gamma distribution is parameterized here such that:

  • Shape = 1/ω² — controls the shape of the distribution.
  • Scale = θ * ω² — ensures the mean equals θ (since mean = shape × scale = θ).
  • The variance is shape × scale² = θ² × ω².
  • As ω → 0, the distribution concentrates around θ.
fpm_gamma = fit(
    warfarin_gamma,
    pop,
    init_params(warfarin_gamma),
    MCEM(; iterations = 50, samples = 100),
)
coef(fpm_gamma)
(θCL = 0.126388382176213, θVc = 8.926996907174553, ωCL = 0.16015614515880486, ωVc = 0.20364047552790018, σ = 1.7135024508398138)

6.3 LogitNormal Random Effects

LogitNormal random effects are constrained to the interval (0, 1), useful when a parameter represents a fraction or bioavailability:

warfarin_logitnormal = @model begin
    @param begin
        θCL  RealDomain(lower = 0.0, upper = 10.0, init = 0.134)
        θVc  RealDomain(lower = 0.0, init = 8.11)
        ωCL  RealDomain(lower = 0.0, init = 0.1)
        ωVc  RealDomain(lower = 0.0, init = 0.1)
        σ  RealDomain(lower = 0.0, init = 0.5)
    end

    @random begin
        ηCL ~ LogitNormal(log(θCL), ωCL)
        ηVc ~ Normal(log(θVc), ωVc)
    end

    @covariates FSZCL FSZV

    @pre begin
        CL = FSZCL * exp(logit(ηCL))
        Vc = FSZV * exp(ηVc)
        Ka = log(2) / 0.5
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        conc ~ @. Normal(cp, σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: ηCL, ηVc
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: conc
  Observed: conc
fpm_logitnormal = fit(
    warfarin_logitnormal,
    pop,
    init_params(warfarin_logitnormal),
    MCEM(; iterations = 50, samples = 100),
)
coef(fpm_logitnormal)
(θCL = 0.12516659193853769, θVc = 8.716192666052123, ωCL = 0.17541816600220833, ωVc = 0.20655303841604025, σ = 1.6952042574938848)

6.4 Comparing Random Effect Distributions

compare_estimates(;
    Normal_RE = fpm_mcem,
    LogNormal_RE = fpm_lognormal,
    Gamma_RE = fpm_gamma,
)
5×4 DataFrame
Row parameter Normal_RE LogNormal_RE Gamma_RE
String Float64? Float64? Float64?
1 θCL 0.124931 0.125477 0.126388
2 θVc 8.72444 8.71208 8.927
3 ωCL 0.16658 0.17503 0.160156
4 ωVc 0.205857 0.206531 0.20364
5 σ 1.70118 1.69558 1.7135
DataFrame(
    RandomEffects = ["Normal", "LogNormal", "Gamma"],
    LogLikelihood_FOCE = [
        loglikelihood(fpm_mcem, FOCE()),
        loglikelihood(fpm_lognormal, FOCE()),
        loglikelihood(fpm_gamma, FOCE()),
    ],
)
3×2 DataFrame
Row RandomEffects LogLikelihood_FOCE
String Float64
1 Normal -518.695
2 LogNormal -518.678
3 Gamma -519.188

7 Covariate-Dependent Random Effects

MCEM supports covariates in the @random block, allowing the random effect distribution to depend on subject-specific characteristics. This is useful for encoding allometric relationships directly in the random effects:

warfarin_covar_re = @model begin
    @param begin
        θCL  RealDomain(lower = 0.0, init = 0.134)
        θVc  RealDomain(lower = 0.0, init = 8.11)
        ωCL  RealDomain(lower = 0.0, init = 0.1)
        ωVc  RealDomain(lower = 0.0, init = 0.1)
        σ  RealDomain(lower = 0.0, init = 0.5)
    end

    @covariates FSZCL FSZV

    @random begin
        ηCL ~ Normal(log(θCL) + 0.75 * log(FSZCL), ωCL)
        ηVc ~ Normal(log(θVc), ωVc)
    end

    @pre begin
        CL = exp(ηCL)
        Vc = FSZV * exp(ηVc)
        Ka = log(2) / 0.5
    end

    @dynamics Depots1Central1

    @derived begin
        cp := @. Central / Vc
        conc ~ @. Normal(cp, σ)
    end
end
PumasModel
  Parameters: θCL, θVc, ωCL, ωVc, σ
  Random effects: ηCL, ηVc
  Covariates: FSZCL, FSZV
  Dynamical system variables: Depot, Central
  Dynamical system type: Closed form
  Derived: conc
  Observed: conc
fpm_covar = fit(
    warfarin_covar_re,
    pop,
    init_params(warfarin_covar_re),
    MCEM(; iterations = 50, samples = 100),
)
coef(fpm_covar)
(θCL = 0.12475803369426806, θVc = 8.712463580371955, ωCL = 0.17347697080067712, ωVc = 0.20377264259192795, σ = 1.700971189892554)

8 Holding Parameters Constant

Like other estimation methods in Pumas, you can hold specific parameters constant during MCEM estimation using the constantcoef keyword to fit:

fpm_mcem_const = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100);
    constantcoef = (:θCL,),
)
coef(fpm_mcem_const)
(θCL = 0.134, θVc = 8.628866945428188, ωCL = 0.1899106765876044, ωVc = 0.2043736666172316, σ = 1.6953528053517355)

The parameter θCL remains at its initial value while all other parameters are optimized.

9 Reproducibility

MCEM uses random sampling, so results will differ across runs unless you control the random number generator.

9.1 Using Random.seed!

using Random

Random.seed!(42)
fpm_a = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100),
)

Random.seed!(42)
fpm_b = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100),
)

coef(fpm_a) == coef(fpm_b)
true

9.2 Using a Custom RNG

Alternatively, you can also pass a dedicated random number generator (RNG):

fpm_c = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100);
    rng = MersenneTwister(42),
)

fpm_d = fit(
    warfarin_mcem_model,
    pop,
    init_params(warfarin_mcem_model),
    MCEM(; iterations = 50, samples = 100);
    rng = MersenneTwister(42),
)

coef(fpm_c) == coef(fpm_d)
true

10 MCEM Options Reference

10.1 Required Arguments

MCEM has two required keyword arguments:

  • iterations (Int): number of EM iterations. Each iteration involves sampling random effects for all subjects, computing sufficient statistics, and optimizing population parameters. Typical values: 30–100.

  • samples (Int): initial number of Monte Carlo samples per subject. The actual number of samples increases across iterations following a polynomial schedule: samples_i = samples + (i-1)² to ensure convergence. Typical values: 50–200.

10.2 Optional Arguments

  • optim_alg (default: Optim.NelderMead()): the optimization algorithm used in the M-step. Must be an optimizer from Optim.jl (zeroth-order, first-order, or second-order).

  • optim_options (default: (;)): additional options passed to Optim.Options. By default, show_trace = false and allow_f_increases = false are used.

10.3 fit Keyword Arguments

In addition to the MCEM algorithm options, fit accepts these keyword arguments:

  • constantcoef: tuple of parameter names to hold constant, e.g., (:σ,).
  • rng: random number generator for reproducibility.
  • ensemblealg: EnsembleThreads() (default) or EnsembleSerial() for parallelism.
  • diffeq_options: ODE solver options, e.g., (; abstol = 1e-8, reltol = 1e-8).
  • verbose: whether to print additional fitting information (default: false).

10.4 Example with Custom Options

fit(
    model,
    pop,
    init_params(model),
    MCEM(;
        iterations = 100,
        samples = 200,
        optim_alg = Optim.LBFGS(),
        optim_options = (; show_trace = true),
    );
    constantcoef = (:σ,),
    rng = StableRNG(123),
    ensemblealg = EnsembleSerial(),
    diffeq_options = (; abstol = 1e-8, reltol = 1e-8),
)

11 Current Limitations

MCEM in Pumas 2.8 has the following limitations:

  1. Additive Gaussian error only: only Normal(μ, σ) error models with constant variance are supported. ProportionalNormal, CombinedNormal, and non-Gaussian likelihoods (Poisson, Bernoulli, TimeToEvent) are not supported.

  2. No direct post-estimation: infer, inspect, predict, and empirical_bayes are not supported directly. Use the zero-iteration FOCE refit workaround.

  3. Model structure constraint: population parameters must not appear in @pre. They must enter the model through @random.

When to Use MCEM vs. VEM vs. FOCE
  • Use MCEM when you want to explore non-Gaussian random effect distributions (LogNormal, Gamma, LogitNormal, InverseGamma) with additive Gaussian error models.
  • Use VEM when you need to fit non-Gaussian likelihoods (Poisson, Bernoulli, TimeToEvent) with random effects. See the VEM tutorials for details.
  • Use FOCE when you have a standard PK/PD model with Gaussian random effects and need direct infer() and inspect() support.

12 Conclusion

In this tutorial, we introduced MCEM as a flexible estimation method for mixed-effects models with non-Gaussian random effect distributions.

Key takeaways:

  • MCEM requires iterations and samples as mandatory arguments.
  • The model must be parameterized so that @pre does not depend on population parameters.
  • Only additive Gaussian error models with constant variance are supported.
  • infer, inspect, and predict require a zero-iteration FOCE refit workaround.
  • MCEM’s unique strength is support for non-Gaussian random effects (LogNormal, LogitNormal, Gamma, InverseGamma).
  • Use Random.seed! or rng for reproducible results.
  • The number of MC samples automatically increases across iterations to ensure convergence.

References

Dempster, Arthur P, Nan M Laird, and Donald B Rubin. 1977. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society: Series B (Methodological) 39 (1): 1–38. https://doi.org/10.1111/j.2517-6161.1977.tb01600.x.
Wei, Greg C G, and Martin A Tanner. 1990. “A Monte Carlo Implementation of the EM Algorithm and the Poor Man’s Data Augmentation Algorithms.” Journal of the American Statistical Association 85 (411): 699–704. https://doi.org/10.1080/01621459.1990.10474930.

Reuse