```
using Pumas
using DataFramesMeta
using Downloads: Downloads
using CSV
```

# Comparing NONMEM and Pumas models

## 1 Introduction

When translating a model from NONMEM to Pumas, it is important to verify that the two models are in fact identical. The better way to check that the two models are identical is to compare the the log-likelihood of the two models evaluated in the same parameter values and using the same data. If the two values match, then you can be almost certain that the translation is correct.

However, NONMEM does not report the log-likelihood. Instead, it reports an *objective function value* (OFV) which is a function of the log-likelihood. This means that comparing Pumas and NONMEM results require adjusting the output of Pumas’ `loglikelihood`

function to make it comparable to the NONMEM OFV. This tutorial will demonstrate how to make this adjustment and explain why the two numbers are different.

## 2 An example

To demonstrate how to compare NONMEM’s OFV to Pumas’ `loglikelihood`

, we will use an example model from https://github.com/metrumresearchgroup/bbr. First, we will download the result file for the model

```
= readlines(
nonmem_lst Downloads.download(
"https://raw.githubusercontent.com/metrumresearchgroup/bbr/main/inst/model/nonmem/basic/1/1.lst",
), )
```

Starting with NONMEM version 7.4.0, two different versions of the OFV are reported in the output

```
findfirst(contains("OBJECTIVE FUNCTION VALUE WITH CONSTANT"), nonmem_lst) |>
-> begin
i = i-3:i+1
_lines @info "lines $_lines of lst file"
println.(nonmem_lst[_lines])
end;
```

```
[ Info: lines 373:377 of lst file
TOTAL DATA POINTS NORMALLY DISTRIBUTED (N): 741
N*LOG(2PI) CONSTANT TO OBJECTIVE FUNCTION: 1361.8669062093250
OBJECTIVE FUNCTION VALUE WITHOUT CONSTANT: 2583.3109009417403
OBJECTIVE FUNCTION VALUE WITH CONSTANT: 3945.1778071510653
REPORTED OBJECTIVE FUNCTION DOES NOT CONTAIN CONSTANT
```

The value `OBJECTIVE FUNCTION VALUE WITHOUT CONSTANT`

is NONMEM’s traditional OFV which is also reported in the optimization trace. Below we print the last iteration of the trace where the OFV matches the one from the output above

```
findlast(contains("ITERATION NO.:"), nonmem_lst) |>
-> begin
i = i:i+1
_lines @info "lines $_lines of lst file"
println.(nonmem_lst[_lines])
end;
```

```
[ Info: lines 345:346 of lst file
0ITERATION NO.: 37 OBJECTIVE VALUE: 2583.31090094174 NO. OF FUNC. EVALS.: 28
CUMULATIVE NO. OF FUNC. EVALS.: 447
```

The value for the `TOTAL DATA POINTS NORMALLY DISTRIBUTED`

line is usually the number of observations in the dataset. We can confirm this with

```
findfirst(contains("TOT. NO. OF OBS RECS"), nonmem_lst) |>
-> begin
i = i:i+1
_lines @info "lines $_lines of lst file"
println.(nonmem_lst[_lines])
end;
```

```
[ Info: lines 85:86 of lst file
TOT. NO. OF OBS RECS: 741
TOT. NO. OF INDIVIDUALS: 39
```

Be aware that this is not the case if the model is for censored data (also known as M3).

With the traditional OFV and the number `N`

, we can reconstruct all the numbers in the output above with

```
= (; N = 741, OFV = 2583.3109009417403)
NONMEM_OFV_nt = (; NONMEM_OFV_nt..., constant = NONMEM_OFV_nt.N * log(2 * π))
NONMEM_OFV_nt =
NONMEM_OFV_nt ..., OFV_with_constant = NONMEM_OFV_nt.OFV + NONMEM_OFV_nt.constant)
(; NONMEM_OFV_nt= NamedTuple{(:N, :constant, :OFV, :OFV_with_constant)}(NONMEM_OFV_nt) NONMEM_OFV_nt
```

```
(N = 741,
constant = 1361.866906209325,
OFV = 2583.3109009417403,
OFV_with_constant = 3945.1778071510653,)
```

How does this relate to Pumas’ `loglikelihood`

? The answer is simply that `OBJECTIVE FUNCTION VALUE WITH CONSTANT`

is the same as Pumas’ `-2*loglikelihood(fit)`

. We will confirm this in the following but first we need a Pumas translation of the model. The NONMEM model to be converted to Pumas is

```
println.(
findfirst(contains("\$SUBROU"), nonmem_lst):findfirst(
nonmem_lst[contains("\$EST"),
nonmem_lst,
)] );
```

```
$SUBROUTINES ADVAN2 TRANS2
$PK
ET=1
IF(ETN.EQ.3) ET=1.3
KA = THETA(1)
CL = THETA(2)*((WT/70)**0.75)* EXP(ETA(1))
V = THETA(3)*EXP(ETA(2))
SC=V
$THETA
(0, 2) ; KA
(0, 3) ; CL
(0, 10) ; V2
(0.02) ; RUVp
(1) ; RUVa
$OMEGA
0.05 ; iiv CL
0.2 ; iiv V2
$SIGMA
1 FIX
$ERROR
IPRED = F
IRES = DV-IPRED
W = IPRED*THETA(4) + THETA(5)
IF (W.EQ.0) W = 1
IWRES = IRES/W
Y= IPRED+W*ERR(1)
FAKE=1 ; for testing
$EST METHOD=1 INTERACTION MAXEVAL=9999 SIG=3 PRINT=5 NOABORT POSTHOC
```

A Pumas translation of this model could look like

```
= @model begin
mdl @param begin
∈ RealDomain(lower = 0.0, init = 2.0)
θKa ∈ RealDomain(lower = 0.0, init = 3.0)
θCL ∈ RealDomain(lower = 0.0, init = 10.0)
θVc
∈ RealDomain(lower = 0.0, init = √(0.05))
ωCL ∈ RealDomain(lower = 0.0, init = √(0.2))
ωVc
∈ RealDomain(lower = 0.0, init = 1.0)
σₐ ∈ RealDomain(lower = 0.0, init = 0.02)
σₚ end
@covariates wt
@random begin
~ LogNormal(log(θCL), ωCL)
_CL ~ LogNormal(log(θVc), ωVc)
_Vc end
@pre begin
= θKa
Ka = _CL * (wt / 70)^0.75
CL = _Vc
Vc end
@dynamics Depots1Central1
@derived begin
:= @. Central / Vc
μ ~ @. Normal(μ, μ * σₚ + σₐ)
dv end
end
```

```
PumasModel
Parameters: θKa, θCL, θVc, ωCL, ωVc, σₐ, σₚ
Random effects: _CL, _Vc
Covariates: wt
Dynamical system variables: Depot, Central
Dynamical system type: Closed form
Derived: dv
Observed: dv
```

Notice that the error model is non-standard. The usual combined error model is `Normal(μ, √((μ * σₚ)^2 + σₐ^2))`

, but the version here matches the NONMEM version as we will see below.

Next step is to load the dataset

```
= CSV.read(
data_df Downloads.download(
"https://raw.githubusercontent.com/metrumresearchgroup/bbr/main/inst/extdata/acop.csv",
),
DataFrame,
)filter!(t -> t.id != 2, data_df)
"cmt"] .= 1
data_df[!, = read_pumas(data_df; mdv = :mdv, covariates = [:wt]) data_pd
```

```
┌ Warning: 38 row(s) has(ve) non-missing observation(s) with mdv set to one. mdv is taking precedence.
└ @ Pumas ~/.julia/packages/Pumas/Aiu7N/src/data_parsing/io.jl:2026
```

```
39-element Vector{Subject{NamedTuple{(:dv,), Tuple{Vector{Union{Missing, Float64}}}}, Pumas.ConstantCovar{NamedTuple{(:wt,), Tuple{Float64}}}, T3, Vector{Float64}} where T3}:
Subject(id = 1, ...)
Subject(id = 3, ...)
Subject(id = 4, ...)
Subject(id = 5, ...)
Subject(id = 6, ...)
Subject(id = 7, ...)
Subject(id = 8, ...)
Subject(id = 9, ...)
Subject(id = 10, ...)
Subject(id = 11, ...)
⋮
Subject(id = 32, ...)
Subject(id = 33, ...)
Subject(id = 34, ...)
Subject(id = 35, ...)
Subject(id = 36, ...)
Subject(id = 37, ...)
Subject(id = 38, ...)
Subject(id = 39, ...)
Subject(id = 40, ...)
```

The reported OFVs in the `lst`

file are based on the final parameter values, so to compare the two models we need to read these values. Please notice that we evaluate the value of the log-likelihood in Pumas in the final parameters of the NONMEM run instead of fitting the model in Pumas. Not only will this save us time because model fitting might be slow, but more importantly, it focuses the comparison on just the model formulation and leaves out influences of possible estimation differences between Pumas and NONMEM.

Often, the final estimates of a NONMEM run are stored in an accompanying XML file which is easy to load in most programming languages. However, such an XML file is not included in the output available to us. Instead, we can read the parameter values from `lst`

file

```
=
NONMEM_final_v findlast(contains("NPARAMETR:"), nonmem_lst) |>
-> parse.(Float64, split(nonmem_lst[i])[2:end]) i
```

```
7-element Vector{Float64}:
2.3172
54.615
462.51
-0.082008
4.1796
0.098533
0.15682
```

This parameter vector can then be converted into a `NamedTuple`

of the parameters used in the Pumas model

```
= (;
NONMEM_final_nt = NONMEM_final_v[1],
θKa = NONMEM_final_v[2],
θCL = NONMEM_final_v[3],
θVc = NONMEM_final_v[4],
σₚ = NONMEM_final_v[5],
σₐ = sqrt(NONMEM_final_v[6]),
ωCL = sqrt(NONMEM_final_v[7]),
ωVc )
```

```
(θKa = 2.3172,
θCL = 54.615,
θVc = 462.51,
σₚ = -0.082008,
σₐ = 4.1796,
ωCL = 0.3138996654983882,
ωVc = 0.39600505047284434,)
```

and with these values, we can confirm that `-2*loglikelihood`

is the same as NONMEM’s OFV with constant

`-2 * loglikelihood(mdl, data_pd, NONMEM_final_nt, FOCE()), NONMEM_OFV_nt.OFV_with_constant) (`

`(3945.177809918999, 3945.1778071510653)`

As mentioned, the OFV with constant was only introduced in NONMEM version 7.4.0 so it can be useful to compute the version without the constant which all NONMEM versions report. This can easily be done by using the `nobs`

function in Pumas, i.e.

```
(-2 * loglikelihood(mdl, data_pd, NONMEM_final_nt, FOCE()) - nobs(data_pd) * log(2 * π),
NONMEM_OFV_nt.OFV, )
```

`(2583.310903709674, 2583.3109009417403)`

This version can also be useful if you want to compare to the OFV of the NONMEM file in the initial parameters either by reading the value at iteration zero

```
=
NONMEM_init_OFV_f findfirst(contains("ITERATION NO."), nonmem_lst) |>
-> parse(Float64, match(r"[0-9]+\.[0-9]+", nonmem_lst[i]).match) i
```

`14995.7806174588`

or setting `MAXEVAL=0`

. In either case, the OFV with constant is not included in the `lst`

file. Evaluating the model in the initial parameters is by far the fastest way to compare models in the case where the model has not yet been run.

```
(-2 * loglikelihood(mdl, data_pd, init_params(mdl), FOCE()) - nobs(data_pd) * log(2 * π),
NONMEM_init_OFV_f, )
```

`(14995.780626411153, 14995.7806174588)`

Parallel NONMEM has a bug that causes the `N`

value to be wrong in some but not all cases. The bug affects the reported OFV with constant so it is safer to compare with the traditional OFV when the `lst`

was based on a parallel NONMEM run. The bug has been reported but is present in the current version (7.5.1) of NONMEM.

It is safe to assume that the model has been correctly translated from NONMEM to Pumas if the OFV of NONMEM matches the adjusted log-likelihood value, as in the examples above, but the opposite might not be true. In our experience, the more likely reason for the values to differ is still an error in the model translation, but there could be other reason as well. Other reasons for a difference could be:

- The dataset. Sometimes the datasets used for the comparison might not be exactly identical. Alternatively, the dataset might have been read incorrectly if a argument to
`read_pumas`

was not been specified. There might also be a difference in how the dataset is filtered, i.e. between NONMEM’s`IGNORE`

and a`filter`

call in Pumas. - The parameter values. The values might have been incorrectly entered in the Pumas version. Either because the values have been read in the wrong order or because differences in the parametetrization, i.e. a parameter might be a variance in the NONMEM version but a standard deviation in Pumas version and should therefore be converted with the square root.
- The interpolation direction. NONMEM uses
*next observation carried backwards*(NOCB) while Pumas’ default is*last observation carried forward*(LOCB). - The empirical Bayes estimates. For complicated models, there might be multiple local extrema of the joint likelihood function (as a function of the random effects). In these cases, there is a risk that NONMEM and Pumas do not compute the same empirical Bayes estimates and therefore the objective functions values will differ. It can therefore be useful to compare the NONMEM’s
`phi`

file with the output from Pumas’`empirical_bayes`

. - The algorithms. While Pumas has been extensively tested against NONMEM, there might be minor differences in the numerical integration of the differential equations or the calculation of derivatives. There differences can in rare cases cause the results to differ.

## 3 Why are the two objective functions different?

The *constant* term discussed in the previous section does not depend on the model parameters and consequently does not incluence any inference aspects of the model. Some applications choose to leave out the constant for simplicity while other include it order to maintain the link between the probability density function (pdf) of the data and the likelihood function. In the following, we will provide the details behind these two versions of the objective function. To simplify the notation, we will use a simplified model with the same constant term as the Gaussian non-linear mixed effects model.

The likelihood function (e.g., Casella and Berger 2002), for a sample, \(Y_1,\dots,Y_n\), of independent and identically distributed (iid) observations is defined as

\[ L(\theta) = \prod_{i=1}^n p(y_i) \]

where \(p\) is the pdf of \(Y_i\). It is generally much simpler to work with the logarithm of the likelihood function, the log-likelihood, \(\ell(\theta) = \log(L(\theta))\). In

\[ \ell(\theta) = \sum_{i=1}^n \log p(y_i) \]

The value of \(\theta\) that maximizes this function also maximizes \(L(\theta)\).

The pdf of a normally distributed random variable \(Y_i\) is \(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{\frac{(y_i - \mu)^2}{2\sigma^2}\right\}\) so the log-likelihood of an indenpendently normally distributed sample is

\[ \ell(\theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_i^n (y_i - \mu)^2 \]

All terms in the expression have a negative sign and \(2\) in the denominator so it is often simpler to work with

\[ -2\ell(\theta) = n\log(2\pi) + n\log(\sigma^2) + \frac{1}{\sigma^2}\sum_i^n (y_i - \mu)^2 \]

Furthermore, it is clear that the term \(n\log(2\pi)\) does not depend on the model parameters \((\mu,\sigma^2)\) so leaving out the term from the optimization will not change the optimizer. The version without the constant term is the NONMEM OFV

\[ OFV(\theta) = -2\ell(\theta) - n\log(2\pi) = n\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_i^n (y_i - \mu)^2 \]

## 4 References

*Statistical Inference*. Duxbury.