Translation of ACCP Case Study II from NONMEM to Pumas

Author

Patrick Kofod Mogensen

1 From NONMEM to Pumas

In Case Study I through III we re-did the analyses from here as they would have to be done in Pumas. Many new users of Pumas come from a background in NONMEM. A common question is then: “How do I translate my NONMEM model into Pumas?” Hopefully, this tutorial will help with the understand of the connection between parts of a NONMEM control stream and a Pumas model script for Case Study 2.

2 Case Study II

The full control stream for Case Study II looks as follows.

# CS2_ORALESTADDL.ctl
$PROBLEM PROJECT multipledose oral study 
$DATA cs2_oralestaddl.csv IGNORE=C 
$INPUT ID TIME CONC=DV AMT ADDL II MDV 
; II = refers to interdose interval 
; ADDL = refers to additional identical doses given 

$SUBROUTINE ADVAN4 TRANS4
; DATE 6-2-04          PROGRAMMER:XXXX
; UNITS: Time=hour, Concentration=ug/ml
; Dose=500mg, Clearance=L/hr, Volume = L,
; ADVAN4 is for a two compartment extravascular administration
; TRANS4 is parameterization in terms of CL, Vc, Q, Vp and ka
$PK 
    TVCL = THETA(1)
    CL = TVCL*EXP(ETA(1)) ;Clearance in L/hr
    TVV2 = THETA(2)
    V2 = TVV2*EXP(ETA(2)) ;Central Volume of distribution in L
    TVQ = THETA(3) 
    Q = TVQ*EXP(ETA(3)) ;Inter compartmental clearanc
    TVV3 = THETA(4)
    V3 = TVV3*EXP(ETA(4)) ;Peripheral volume of distribution in L
    TVKA = THETA(5)
    KA = TVKA*EXP(ETA(5));Absorption rate constant (1/hr)

    S2=V2 
    S3=V3 
    F1 = 0.5 ; Absolute oral bioavailability is 50%

$ERROR 
    IPRED=F 
    Y=F+F*ERR(1)+ERR(2) 

$THETA (0.2,2) ;POPCL
$THETA (0.1,4) ;POPV2
$THETA (0.1,1.5) ;POPQ
$THETA (1,10) ;POPV3
$THETA (0.01,0.5) ;POPKA


$OMEGA 0.09 ;BSVCL
$OMEGA 0.09 ;BSVV2
$OMEGA 0.09 ;BSVQ
$OMEGA 0.09 ;BSVV3
$OMEGA 0.09 ;BSVKA


$SIGMA 0.09 ;ERRCV
$SIGMA 5    ;ERRSD

$ESTIMATION METHOD=0  MAXEVAL=9990 PRINT=10 POSTHOC
$COVARIANCE 

$TABLE ID TIME DV IPRED AMT ADDL II MDV 
NOPRINT ONEHEADER FILE= 

To understand how to write this in Pumas, we’ll break it down step-by-step, following the order of the control stream. At the end, we will write out the complete code.

2.1 $PROBLEM

The first part of the control stream is the problem block

$PROBLEM PROJECT multipledose oral study 

Pumas does not have a direct $PROBLEM equivalent. In NONMEM, it is used to give a title to the project including model, estimation, inference, and more. In Pumas, each analysis is much more flexible such that one script can contain several calls to inference using infer (including bootstraps). There is the possibility of adding a name to a model. Inside the the @model block it is possible to add a description using the @metadata block.

@model begin
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
end

2.2 $DATA

The next part of the control stream is the data input.

$DATA cs2_oralestaddl.csv IGNORE=C 

Here, we point to the data set and rows to ignore. The specific line tells us to ignore all rows that start with C (for Comment). In Pumas, we also have to read in the data from a source such as xpt, sas7bdat, or csv. In Case Study II, the equivalent line is

pkdata = CSV.read("CS2_oralestaddl.csv", DataFrame; header=5)

CSV.read expects a source first (the path to the file) and a sink (the type of table to construct, here a DataFrame). The header keyword tells us to ignore the first 5 lines (those that start with C),

2.3 $INPUT

The $INPUT statement in NONMEM allows you to name the columns from the file in the $DATA statement.

$INPUT ID TIME CONC=DV AMT ADDL II MDV 

The CONC=DV statements indicates that the third column should use the name CONC and that it is the reserved DV keyword. In other words, the thirds column is the dependent variable and we want all output regarding this variable to have the CONC label in the output.

The dataset used in the tutorial was ready for NONMEM, but not exactly in the format we expect in Pumas. The tutorial has the required data steps. After those, the equivalent in Pumas is the following.

population = read_pumas(
  pkdata;
  id           = :ID,
  time         = :TIME,
  observations = [:CONC],
  evid         = :EVID,
  amt          = :AMT,
  addl         = :ADDL,
  ii           = :II,
  cmt          = :CMT,
)

This looks extra verbose, but this is only because we are using “NONMEM” style column names. If all the column names had been lowercase instead, we can simply write the step as follows.

population = read_pumas(pkdata; observations = [:conc],)

2.4 $SUBROUTINE

The next statement in the control stream is the $SUBROUTINE. This specifies the ADVAN and the parameter format.

$SUBROUTINE ADVAN4 TRANS4

The equivalent code in Pumas is found in the @model. The @dynamics block specifies the equivalent to ADVAN4 TRANS4 below.

@model begin
    
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @dynamics Depots1Central1Periph1
end

2.5 $PK

The $PK statement specifies the model parameters including random effects and covariate models in NONMEM.

$PK 
    TVCL = THETA(1)
    CL = TVCL*EXP(ETA(1)) ;Clearance in L/hr
    TVV2 = THETA(2)
    V2 = TVV2*EXP(ETA(2)) ;Central Volume of distribution in L
    TVQ = THETA(3) 
    Q = TVQ*EXP(ETA(3)) ;Inter compartmental clearanc
    TVV3 = THETA(4)
    V3 = TVV3*EXP(ETA(4)) ;Peripheral volume of distribution in L
    TVKA = THETA(5)
    KA = TVKA*EXP(ETA(5));Absorption rate constant (1/hr)

    S2=V2 
    S3=V3 
    F1 = 0.5 ; Absolute oral bioavailability is 50%

ADVAN4 TRANS4 understands the reserved keywords CL, V2, Q, V3, KA, S2, S3, and F1. There is no scale (S2, S3) equivalent in Pumas. Instead, you can scale the variables when needed. The other parameters have similar reserved parameter names in Pumas when using the Depots1Central1Periph1 model:

  • KA in NONMEM is Ka in Pumas
  • CL in NONMEM is CL in Pumas
  • V2 in NONMEM is Vc in Pumas (Volume of distribution for Central)
  • Q in NONMEM is Q in Pumas
  • V3 in NONMEM is Vp in Pumas (Volume of distribution for Peripheral)

This results in the following @pre block (the Pumas equivalent to $PK)

@model begin
    
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @pre begin
        Ka = θka * exp(η[1])
        CL = θcl * exp(η[2])
        Vc = θvc * exp(η[3])
        Vp = θvp * exp(η[4])
        Q  = θq  * exp(η[5])
    end
    @dosecontrol begin
        bioav = (Depot=0.5,)
    end
    @dynamics Depots1Central1Periph1
end

The bioavailability F1 is written in the @dosecontrol block using the bioav keyword. The @dosecontrol block is different to @pre in that it is only evaluated at “event times” (evid in (1,3,4)) where as @pre is always evaluated at the current model time, see this explanation in the documentation on dose control parameters . To specify dose control parameters, the user must specify the type of parameter and the compartment it applies to. F1 = 0.5 is then equivalent to

@dosecontrol begin
    bioav = (Depot=0.5,)
end

If we had injections where the drug might crystalize in the tubes we could model bioavailability in Central as follows

@dosecontrol begin
    bioav = (Central=0.5,)
end

which would be equivalent to F2 = 0.5 in NONMEM for this ADVAN. So far, the Pumas model looks as follows.

@model begin
    
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @pre begin
        Ka = θka * exp(η[1])
        CL = θcl * exp(η[2])
        Vc = θvc * exp(η[3])
        Vp = θvp * exp(η[4])
        Q  = θq  * exp(η[5])
    end
    @dosecontrol begin
        bioav = (Depot=0.5,)
    end
    @dynamics Depots1Central1Periph1
end

2.6 $ERROR

The statistical model of the modeled concentrations are specified in the $ERROR statement.

$ERROR 
    IPRED=F 
    Y=F+F*ERR(1)+ERR(2) 

Here, F specifies the prediction of the model given the parameters and solution. Since this statement is evaluated with etas during estimation, F is assigned to IPRED here. Then we build the error model with ERR statements. In Pumas, we do not build the distribution of Y like this. Rather we specify the distributional assumption directly.

@model begin
    
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @pre begin
        Ka = θka * exp(η[1])
        CL = θcl * exp(η[2])
        Vc = θvc * exp(η[3])
        Vp = θvp * exp(η[4])
        Q  = θq  * exp(η[5])
    end
    @dosecontrol begin
        bioav = (Depot=0.5,)
    end
    @dynamics Depots1Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model*σ_prop)^2))
    end
end

Here, σ_add is the standard deviation associated with ERR(2) in the NONMEM model and σ_prop is the same for ERR(1).

2.7 $THETA, $OMEGA, $SIGMA

Finally, we get to the last part of the model code: the fixed effects and random effects specification. In Pumas, these are typically placed first instead of last. If we start with the fixed effects, these are specified as follows in the control stream.

$THETA (0.2,2) ;POPCL
$THETA (0.1,4) ;POPV2
$THETA (0.1,1.5) ;POPQ
$THETA (1,10) ;POPV3
$THETA (0.01,0.5) ;POPKA


$OMEGA 0.09 ;BSVCL
$OMEGA 0.09 ;BSVV2
$OMEGA 0.09 ;BSVQ
$OMEGA 0.09 ;BSVV3
$OMEGA 0.09 ;BSVKA


$SIGMA 0.09 ;ERRCV
$SIGMA 5    ;ERRSD

In the Pumas tutorial we excluded the omega on Ka. Here, we’ve included it at well.

@param begin
    θka  RealDomain(lower=0.01, init = 0.5)
    θcl  RealDomain(lower=0.2, init= 2.0)
    θvc  RealDomain(lower=0.1, init = 4.0)
    θvp  RealDomain(lower=1.0, init = 10.0)
    θq   RealDomain(lower=0.1, init = 1.5)
    Ω    PDiagDomain([0.09, 0.09, 0.09, 0.09, 0.09])
    σ_add   RealDomain(lower=0.0, init = 5^2)
    σ_prop  RealDomain(lower=0.0, init = 0.09^2)
end

Since we specify the “SIGMA” parameters as standard deviations in this example in Pumas, we have to square the initial values. We put no restriction on names and where different parameters can be used in other blocks, but we do suggest the best practice of using Ω for variance-covariance matrices for random effects, ω for individual standard deviations used in scalar random effect specifications, and σ for standard deviations used in error models. If you prefer to specify variances over standard deviations, we suggest using ω² and σ² respectively.

In NONMEM, the random effect specification was implicit based on the OMEGAs. In Pumas, we have more flexibility with respect to the specification of named random effects and the distributions of them (say, a Beta distributed random effect for a bioavailability). In the current case study, the random effects specification is simple.

@random begin
    η ~ MvNormal(Ω)
end

Then, the final model looks as follows.

@model begin
    @metadata begin
        desc = "PROJECT multipledose oral study"
        timeu = u"hr" # hour
    end
    @param begin
        θka  RealDomain(lower=0.01, init = 0.5)
        θcl  RealDomain(lower=0.2, init= 2.0)
        θvc  RealDomain(lower=0.1, init = 4.0)
        θvp  RealDomain(lower=1.0, init = 10.0)
        θq   RealDomain(lower=0.1, init = 1.5)
        Ω    PDiagDomain([0.09, 0.09, 0.09, 0.09, 0.09])
        σ_add   RealDomain(lower=0.0, init = 5^2)
        σ_prop  RealDomain(lower=0.0, init = 0.09^2)
    end
    @random begin
        η ~ MvNormal(Ω)
    end
    @pre begin
        Ka = θka * exp(η[1])
        CL = θcl * exp(η[2])
        Vc = θvc * exp(η[3])
        Vp = θvp * exp(η[4])
        Q  = θq  * exp(η[5])
    end
    @dosecontrol begin
        bioav = (Depot=0.5,)
    end
    @dynamics Depots1Central1Periph1
    @derived begin
        conc_model := @. Central / Vc
        CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model*σ_prop)^2))
    end
end

As you may have noticed, we write the parameter, random effects, pk parameters, dynamical system specification and error model in a different order than what was in the original control stream. This is because we find it more clear when reading the code to define things before they are used.

2.8 $ESTIMATION

Just as we had for the $DATA and $INPUT statements, we have a difference between the two systems when it comes to $ESTIMATION. In Pumas, we do not include this information in the “model” because the workflow is typically more interactive.

$ESTIMATION METHOD=0  MAXEVAL=9990 PRINT=10 POSTHOC

The equivalent to the above is something like

param0 = init_params(model)
model_fit = fit(model, population, param0, FO(); optim_options = (iterations = 9900))

Here, METHOD=0 is equivalent to specifying FO() in Pumas, MEXAEVAL is the same as the iterations key in optim_options, PRINT has no direct equivalent in Pumas. POSTHOC can be obtained by called the empirical_bayes function later in the script. The specification in NONMEM is necessary because FO does not need to estimate the empirical bayes estimates (EBEs) during fitting. POSTHOC forces the calculation of the EBEs at the end.

2.9 $COVARIANCE

In Pumas, the $COVARIANCE step is equivalent to the infer function.

$COVARIANCE 

To calculate the asymptotic variance-covariance matrix, use the infer function on the fit output and grab the vcov field.

model_infer = infer(model_fit)
model_vcov = model_infer.vcov

2.10 $TABLE

Since each NONMEM run is invoked based on the control stream it is necessary to specify what to output when the fitting and inference steps have completed. In Pumas, it is possible interactively work with the objects and save what is needed when the users wishes.

$TABLE ID TIME DV IPRED AMT ADDL II MDV 
NOPRINT ONEHEADER FILE= 

The above information can also be saved by computing the inspect quantities, convert them to a DataFame and save them to a file.

model_inspect = inspect(model_fit)
model_inspect_df = DataFrame(model_inspect)
CSV.write("model2_inspect.csv", model_inspect_df)

3 Conclusion

In this tutorial, we saw how to translate the ACCP Case Study II from NONMEM to Pumas. Hopefully, this helps new users connect the dots and understand how one section of a NONMEM control stream relates to a model block or function call in Pumas.