using Pumas
using DataFramesMeta
using PumasUtilities
using CSV # needed for the CSV.read and CSV.write calls
Translation from NONMEM to Pumas of Case Study III - Development of a population PKPD model
1 From NONMEM to Pumas
In Case Studies I through III, we redid the analyses from the original case studies as they would be done in Pumas. Many new users of Pumas come from a background in NONMEM. A common question is then: “How do I translate my NONMEM model into Pumas?” Hopefully, this tutorial will help clarify the connection between the parts of a NONMEM control stream and a Pumas model script for Case Study III.
2 Case Study III: PK Model
The full control stream for Case Study III looks as follows.
$PROBLEM PROJECT singledose i.v infusion ;DATE 6-2-04 PROGRAMMER:XXXX
;UNITS: Time=hour, Concentration=ng/ml
;Dose=100mg, Clearance=L/hr, Volume = L
$DATA cs3_ivinfest.csv IGNORE=C
$INPUT ID TIME CONC=DV AMT RATE MDV
;RATE specifies the infusion rate, for a dose of 100mg, rate of 16.7mg/hr
;means infusion for 6hrs
$SUBROUTINE ADVAN1 TRANS2 ; A one compartment model from PREDPP library
$PK
TVCL= THETA(1)
CL= TVCL*EXP(ETA(1)) ;Clearance in L/hr
TVV= THETA(2)
V= TVV*EXP(ETA(2)) ;Volume of distribution in L
S1=V/1000 ;scaling factor to match concentration in ng/ml
$ERROR
IPRED=F
Y=F+F*ERR(1)+ERR(2) ;Combined residual error model
$THETA (5,25) ;POPCL
$THETA (10,86) ;POPV
$OMEGA 0.09 ;BSVCL;30% BSV for Clearance
$OMEGA 0.09 ;BSVV;30% BSV for volume
$SIGMA 0.02 ;ERRCV;Proportional error = 15%
$SIGMA 10 ;ERRSD;Additive error =3ng/ml
$ESTIMATION METHOD=0 MAXEVAL=9990 PRINT=10 POSTHOC
$COVARIANCE
$TABLE ID TIME DV IPRED AMT CL V ETA1 ETA2
NOPRINT ONEHEADER FILE=
In order to understand how to write this in Pumas, we need to break it down. We will go by the order of the control stream, and in the end we will write out the code in full.
2.1 $PROBLEM
The first part of the control stream is the problem title:
$PROBLEM PROJECT singledose i.v infusion
Pumas does not have a direct $PROBLEM equivalent. In NONMEM, it is used to give a title to the project including model, estimation, inference, and more. In Pumas, each analysis is much more flexible, such that one script can contain several calls to inference using infer (including bootstraps). It is possible, however, to add a name to a model: inside the @model block, you can add a description using the @metadata block, including, for example, the unit of time.
@model begin
@metadata begin
desc = "singledose i.v infusion"
timeu = u"hr" # hour
end
end
2.2 $DATA
The next part of the control stream is the data input.
$DATA cs3_ivinfest.csv IGNORE=C
Here, we point to the data set and rows to ignore. The specific line tells us to ignore all rows that start with C (for Comment). In Pumas, we also have to read in the data from a source such as xpt, sas7bdat, or csv. In Case Study III, the equivalent line is
pkdata = CSV.read("cs3_ivinfest.csv", DataFrame; header=5)
CSV.read expects a source first (the path to the file) and a sink (the type of table to construct, here a DataFrame). The header keyword tells CSV.read that the column names are on line 5, so the lines before it (the comment lines that start with C) are skipped.
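To make the IGNORE=C behavior concrete, here is a minimal sketch in plain Julia of what dropping records that start with C amounts to. The file contents, the values, and the helper keep_record are made up for illustration; they are not taken from the real data set, which is read with CSV.read.

```julia
# Made-up file contents: two comment records followed by two data records.
raw = """
C Case Study III, single dose i.v. infusion
C ID,TIME,CONC,AMT,RATE,MDV
1,0,.,100,16.7,1
1,0.5,155.9,.,.,0
"""

# IGNORE=C: skip every record whose first character is C (and skip blank lines).
keep_record(line) = !isempty(line) && !startswith(line, "C")

records = filter(keep_record, split(raw, '\n'))
println(records)
```

Only the two data records survive the filter, which is the part of the file that NONMEM would actually read.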
2.3 $INPUT
The $INPUT statement in NONMEM allows you to name the columns from the file in the $DATA statement.
$INPUT ID TIME CONC=DV AMT RATE MDV
The CONC=DV statement indicates that the third column should use the name CONC and that it is the reserved DV keyword. In other words, the third column is the dependent variable, and we want all output regarding this variable to carry the CONC label.
The dataset used in the tutorial was ready for NONMEM, but not exactly in the format we expect in Pumas. The tutorial has the required data steps. After those, the equivalent in Pumas is the following.
population_pk = read_pumas(
pkdata;
id = :ID,
time = :TIME,
observations = [:CONC],
evid = :EVID,
amt = :AMT,
rate = :RATE,
cmt = :CMT,
)
This looks extra verbose, but this is only because we are using “NONMEM” style column names. If all the column names had been lowercase instead, matching the defaults that read_pumas expects, we could simply write the step as follows.
population_pk = read_pumas(pkdata; observations = [:conc])
2.4 $SUBROUTINE
The next statement in the control stream is the $SUBROUTINE. This specifies the ADVAN and the parameter format.
$SUBROUTINE ADVAN1 TRANS2 ; A one compartment model from PREDPP library
The equivalent code in Pumas is found in the @model. The @dynamics block below specifies the equivalent to ADVAN1 TRANS2.
@model begin
@metadata begin
desc = "singledose i.v infusion"
timeu = u"hr" # hour
end
@dynamics Central1
end
2.5 $PK
The $PK statement specifies the model parameters including random effects and covariate models in NONMEM.
$PK
TVCL = THETA(1)
CL = TVCL*EXP(ETA(1)) ;Clearance in L/hr
TVV = THETA(2)
V = TVV*EXP(ETA(2)) ;Volume of distribution in L
S1 = V/1000 ;scaling factor to match concentration in ng/ml
ADVAN1 TRANS2 understands the reserved keywords CL, V, and S1. There is no scale (S1) equivalent in Pumas. Instead, you can scale the variables when needed. The other parameters have similar reserved parameter names in Pumas when using the Central1 model:
CL in NONMEM is CL in Pumas
V in NONMEM is Vc in Pumas (Volume of distribution for Central)
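Since Pumas has no S1, the scaling has to be written out wherever the concentration is computed. The following is a sketch of the arithmetic only, assuming amounts in mg and volumes in L as stated in the control stream comments. The function name conc_ng_per_ml is hypothetical, and the volume of 86 L is simply the starting value from $THETA.

```julia
# NONMEM computes F = A(1)/S1 with S1 = V/1000.
# mg/L is the same as µg/mL, so dividing by V/1000 yields ng/mL.
conc_ng_per_ml(amount_mg, volume_l) = amount_mg / (volume_l / 1000)

# 100 mg distributed in 86 L
c = conc_ng_per_ml(100.0, 86.0)
println(c)
```

In a Pumas model, the same factor could be applied where the concentration is derived, or the dose amounts could be converted in the data set instead.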
This results in the following @pre block (the Pumas equivalent to $PK)
@model begin
@metadata begin
desc = "singledose i.v infusion"
timeu = u"hr" # hour
end
@pre begin
CL = θcl * exp(η[1])
Vc = θvc * exp(η[2])
end
@dynamics Central1
end
So far, the Pumas model looks as follows.
@model begin
@metadata begin
desc = "singledose i.v infusion"
timeu = u"hr" # hour
end
@pre begin
CL = θcl * exp(η[1])
Vc = θvc * exp(η[2])
end
@dynamics Central1
end
2.6 $ERROR
The statistical model of the modeled concentrations is specified in the $ERROR statement.
$ERROR
IPRED=F
Y=F+F*ERR(1)+ERR(2) ;Combined residual error model
Here, F specifies the prediction of the model given the parameters and solution. Since this statement is evaluated with etas during estimation, F is assigned to IPRED here. Then we build the error model with ERR statements. In Pumas, we do not build the distribution of Y like this. Rather, we specify the distributional assumption directly.
@model begin
@metadata begin
desc = "singledose i.v infusion"
timeu = u"hr" # hour
end
@pre begin
CL = θcl * exp(η[1])
Vc = θvc * exp(η[2])
end
@dynamics Central1
@derived begin
conc_model := @. Central / Vc
CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model*σ_prop)^2))
end
end
Here, σ_add is the standard deviation associated with ERR(2) in the NONMEM model and σ_prop is the same for ERR(1).
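The standard deviation inside Normal follows from the NONMEM expression: if ERR(1) and ERR(2) are independent with standard deviations σ_prop and σ_add, then Y = F + F*ERR(1) + ERR(2) has variance F^2*σ_prop^2 + σ_add^2. A small sketch of that calculation in plain Julia, where combined_sd is a hypothetical helper and the prediction of 100 ng/ml is an arbitrary example value:

```julia
# Standard deviation of Y = F + F*ERR(1) + ERR(2) for independent errors.
combined_sd(ipred, σ_add, σ_prop) = sqrt(σ_add^2 + (ipred * σ_prop)^2)

σ_prop = sqrt(0.02)   # from $SIGMA 0.02
σ_add = sqrt(10.0)    # from $SIGMA 10

# For a prediction of 100 ng/ml the variance is 10 + 100^2 * 0.02 = 210.
sd_at_100 = combined_sd(100.0, σ_add, σ_prop)
println(sd_at_100)
```

The proportional part dominates at high concentrations and the additive part dominates near zero, which is why both terms are kept.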
2.7 $THETA, $OMEGA, $SIGMA
Finally, we get to the last part of the model code: the fixed effects and random effects specification. In Pumas, these are typically placed first instead of last. If we start with the fixed effects, these are specified as follows in the control stream.
$THETA (5,25) ;POPCL
$THETA (10,86) ;POPV
$OMEGA 0.09 ;BSVCL;30% BSV for Clearance
$OMEGA 0.09 ;BSVV;30% BSV for volume
$SIGMA 0.02 ;ERRCV;Proportional error = 15%
$SIGMA 10 ;ERRSD;Additive error =3ng/ml
In Pumas, this is equivalent to:
@param begin
θcl ∈ RealDomain(lower=0.0)
θvc ∈ RealDomain(lower=0.0)
Ω ∈ PDiagDomain(2)
σ_add ∈ RealDomain(lower=0.0)
σ_prop ∈ RealDomain(lower=0.0)
end
Since we specify the “SIGMA” parameters as standard deviations in this example in Pumas, we have to take the square root of the NONMEM initial values, which are variances. We put no restriction on names and where different parameters can be used in other blocks, but we do suggest the best practice of using Ω for variance-covariance matrices for random effects, ω for individual standard deviations used in scalar random effect specifications, and σ for standard deviations used in error models. If you prefer to specify variances over standard deviations, we suggest using ω² and σ² respectively.
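Moving between the variance scale of $OMEGA and $SIGMA and the standard deviation scale is a square root. As a sketch, using the initial estimates from the control stream (the variable names are only for illustration):

```julia
# Initial estimates from the control stream, all on the variance scale.
omega_cl = 0.09     # $OMEGA for clearance
sigma_prop = 0.02   # $SIGMA for the proportional error
sigma_add = 10.0    # $SIGMA for the additive error

# The same quantities on the standard deviation scale.
sd_cl = sqrt(omega_cl)       # 0.3, the "30% BSV" in the control stream comment
σ_prop = sqrt(sigma_prop)    # roughly 0.14
σ_add = sqrt(sigma_add)      # roughly 3.16 ng/ml

println((sd_cl, σ_prop, σ_add))
```

Note that Ω stays on the variance scale in this model because it is passed to MvNormal as a covariance matrix, so only the σ parameters need the square root.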
In NONMEM, the random effect specification was implicit based on the OMEGAs. In Pumas, we have more flexibility with respect to the specification of named random effects and the distributions of them (say, a Beta distributed random effect for a bioavailability). In the current case study, the random effects specification is simple.
@random begin
η ~ MvNormal(Ω)
end
Then, the final model looks as follows.
inf1cmt = @model begin
@param begin
θcl ∈ RealDomain(lower=0.0)
θvc ∈ RealDomain(lower=0.0)
Ω ∈ PDiagDomain(2)
σ_add ∈ RealDomain(lower=0.0)
σ_prop ∈ RealDomain(lower=0.0)
end
@random begin
η ~ MvNormal(Ω)
end
@pre begin
CL = θcl * exp(η[1])
Vc = θvc * exp(η[2])
end
@dynamics Central1
@derived begin
conc_model := @. Central / Vc
CONC ~ @. Normal(conc_model, sqrt(σ_add^2 + (conc_model*σ_prop)^2))
end
end
As you may have noticed, we write the parameters, random effects, PK parameters, dynamical system specification, and error model in a different order than in the original control stream. This is because we find the code clearer to read when things are defined before they are used.
2.8 $ESTIMATION
Just as we had for the $DATA and $INPUT statements, we have a difference between the two systems when it comes to $ESTIMATION. In Pumas, we do not include this information in the “model” because the workflow is typically more interactive.
$ESTIMATION METHOD=0 MAXEVAL=9990 PRINT=10 POSTHOC
The equivalent to the above is something like
param0 = (
θcl = 1.0,
θvc = 1.0,
Ω = Diagonal([0.09, 0.09]),
σ_add = sqrt(10.0),
σ_prop = sqrt(0.01),
)
inf1cmt_results = fit(inf1cmt, population_pk, param0, Pumas.FOCE(); optim_options=(show_every=10, iterations=9990,))
Here, METHOD=0 corresponds to FO() in Pumas (the call above uses FOCE(), so swap in FO() for a literal translation), MAXEVAL is the same as the iterations key in optim_options, and PRINT is roughly equivalent to show_every. POSTHOC can be obtained by calling the empirical_bayes function later in the script. NONMEM needs the POSTHOC option because FO does not need to estimate the empirical Bayes estimates (EBEs) during fitting; POSTHOC forces the calculation of the EBEs at the end.
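The initial values in param0 do not have to match the control stream, but a literal translation is easy to write down. In $THETA (5,25), the first number is the lower bound and the second is the initial estimate, so the starting values are 25 and 86. The sketch below only builds the named tuple; the name param0_nonmem is hypothetical, and Diagonal is taken from the LinearAlgebra standard library so that the snippet runs on its own.

```julia
using LinearAlgebra: Diagonal

# Starting values copied from $THETA, $OMEGA and $SIGMA.
# Ω stays on the variance scale; the σ parameters are standard deviations.
param0_nonmem = (
    θcl = 25.0,                  # $THETA (5,25)
    θvc = 86.0,                  # $THETA (10,86)
    Ω = Diagonal([0.09, 0.09]),  # $OMEGA 0.09 twice
    σ_add = sqrt(10.0),          # $SIGMA 10
    σ_prop = sqrt(0.02),         # $SIGMA 0.02
)

println(param0_nonmem)
```

The names in the tuple have to match the names declared in the @param block of the model.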
2.9 $COVARIANCE
In Pumas, the $COVARIANCE step is equivalent to the infer function.
$COVARIANCE
To calculate the asymptotic variance-covariance matrix, use the infer function on the fit output and grab the vcov field.
model_infer = infer(inf1cmt_results)
model_vcov = model_infer.vcov
2.10 $TABLE
Since each NONMEM run is invoked based on the control stream, it is necessary to specify what to output when the fitting and inference steps have completed. In Pumas, it is possible to work with the objects interactively and save what is needed when the user wishes.
$TABLE ID TIME DV IPRED AMT CL V ETA1 ETA2
The above information can also be saved by computing the inspect quantities, converting them to a DataFrame, and saving them to a file.
inf1cmt_insp = inspect(inf1cmt_results)
df_inspect = DataFrame(inf1cmt_insp)
CSV.write("inspect_file.csv", df_inspect)
With that, the PK model is translated.
3 Case Study III: PD model
$PROBLEM PROJECT POPULATION PK-PD MODELING ;DATE 6-2-04 PROGRAMMER:XXXX
;UNITS: Time=hour, Concentration=ng/ml
;Dose=100mg/6HR, Clearance=L/hr, Volume = L
;Biomarker = Blood Histamine
;concentration,ng/ml
$DATA cs3_ivinfpdest.csv IGNORE=C
$INPUT ID TIME HIST=DV AMT CMT RATE MDV CLI VI
$SUBROUTINE ADVAN6 TRANS1 TOL=3 ;User defined model written as differential equations
$MODEL ;$MODEL defines the no of compartments in the model
COMP = CENTRAL
COMP = EFFECT
$PK
CL = CLI ;Individual clearance estimates from PK analysis
V = VI ;Individual Volume estimates
S1 = V/1000 ;To get concentration in ng/ml
KIN = THETA(1)*EXP(ETA(1))
KOUT = THETA(2)*EXP(ETA(2))
IC50 = THETA(3)*EXP(ETA(3))
F2 = KIN/KOUT ;Initializing to baseline; R0 = kin/kout
$DES
DADT(1) = -CL/V*A(1) ;Plasma
INH = A(1)/(IC50+A(1)) ;Inhibitory function
DADT(2) = KIN*(1-INH) - KOUT*A(2) ;Inhibition of input (Inhibitory Indirect Response model)
$ERROR
CP1 = A(1)/S1
IPRED = F
Y = F+F*ERR(1)+ERR(2)
$THETA (0.1,7) ;POPKin
$THETA (0.01,0.3) ;POPKout
$THETA (0.1,6) ;POPIC50
$OMEGA 0.09 ;BSV Kin
$OMEGA 0.09 ;BSV Kout
$OMEGA 0.09 ;BSV IC50
$SIGMA 0.01 ;ERRCCV
$SIGMA 1 ;ERRADD
$ESTIMATION METHOD=0 MAXEVAL=9990 PRINT=10 POSTHOC
$COVARIANCE
$TABLE ID TIME HIST IPRED AMT CMT RATE
NOPRINT ONEHEADER FILE=
3.1 $PROBLEM
The first part of the control stream is the problem title:
$PROBLEM PROJECT POPULATION PK-PD MODELING
In Pumas, this is
@model begin
@metadata begin
desc = "POPULATION PK-PD MODELING"
timeu = u"hr" # hour
end
end
3.2 $DATA
The next part of the control stream is the data input. This time we input the data that has been augmented with PK predictions.
$DATA cs3_ivinfpdest.csv IGNORE=C
Here, we point to the data set and rows to ignore. The specific line tells us to ignore all rows that start with C (for Comment). In Pumas, we also have to read in the data from a source such as xpt, sas7bdat, or csv. In Case Study III, the equivalent line is
pddata = CSV.read("cs3_ivinfpdest.csv", DataFrame; header=5)
CSV.read expects a source first (the path to the file) and a sink (the type of table to construct, here a DataFrame). The header keyword tells CSV.read that the column names are on line 5, so the lines before it (the comment lines that start with C) are skipped.
3.3 $INPUT
For this $INPUT statement we also need to input the PD derived variable and the individually predicted PK parameters from the PK model.
$INPUT ID TIME HIST=DV AMT CMT RATE MDV CLI VI
The HIST=DV statement indicates that the third column should use the name HIST and that it is the reserved DV keyword, as it did for the PK model.
In Pumas, this is equivalent to the following after a bit of data wrangling that is shown in the tutorial:
population_pd = read_pumas(
pd_dataframe;
id = :ID,
time = :TIME,
observations = [:HIST],
amt = :AMT,
cmt = :CMT,
rate = :RATE,
covariates = [:CLi, :Vci],
)
This looks extra verbose, but this is only because we are using “NONMEM” style column names. If the standard columns (id, time, amt, cmt, and rate) had used the lowercase names that read_pumas expects by default, we could simply write the step as follows.
population_pd = read_pumas(
pd_dataframe;
observations = [:HIST],
covariates = [:CLi, :Vci],
)
3.4 $SUBROUTINE
The next statement in the control stream is the $SUBROUTINE. This specifies the ADVAN and the parameter format.
$SUBROUTINE ADVAN6 TRANS1 TOL=3 ;User defined model written as differential equations
Note that this time we use ADVAN6 and TRANS1 as well as a tolerance. We are going to write out the model manually and use a numerical integrator to solve the model. For this reason, we also need to define the compartment names, as is done in the $MODEL statement:
$MODEL ;$MODEL defines the no of compartments in the model
COMP = CENTRAL
COMP = EFFECT
and the dynamics in the $DES statement
$DES
DADT(1) = -CL/V*A(1) ;Plasma
INH = A(1)/(IC50+A(1)) ;Inhibitory function
DADT(2) = KIN*(1-INH) - KOUT*A(2) ;Inhibition of input (Inhibitory Indirect Response model)
The equivalent code in Pumas is found in the @model. The @dynamics block below specifies the equivalent to the NONMEM model statements:
@init begin
Response = bsl
end
@dynamics begin
Central' = -CL/Vc*Central
Response' = kin*(1 - imax*(Central/Vc)/(ic50 + Central/Vc)) - kout*Response
end
The baseline parameter bsl is going to be defined below.
In NONMEM, ADVAN6 and TOL are specified in the control stream; in Pumas, we specify the solver and its tolerance in the function calls later on.
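To illustrate what solving the system with a numerical integrator means, here is a small sketch in plain Julia that steps the two equations forward with a fixed-step fourth-order Runge-Kutta scheme. This is a swapped-in textbook method for illustration, not the solver that ADVAN6 or Pumas uses. The function names are hypothetical, and the parameter values are just the $THETA starting values treated as if they were one individual's parameters.

```julia
# Right-hand side of the two-state system from the @dynamics block.
# u = (Central, Response); p holds the individual parameters.
function irm_rhs(u, p)
    central, response = u
    cp = central / p.Vc
    dcentral = -p.CL / p.Vc * central
    dresponse = p.kin * (1 - p.imax * cp / (p.ic50 + cp)) - p.kout * response
    return (dcentral, dresponse)
end

# One classical fourth-order Runge-Kutta step of size h.
function rk4_step(f, u, p, h)
    k1 = f(u, p)
    k2 = f(u .+ (h / 2) .* k1, p)
    k3 = f(u .+ (h / 2) .* k2, p)
    k4 = f(u .+ h .* k3, p)
    return u .+ (h / 6) .* (k1 .+ 2 .* k2 .+ 2 .* k3 .+ k4)
end

# Integrate from t = 0 to t = tend using n equal steps.
function integrate(f, u0, p, tend, n)
    u = u0
    h = tend / n
    for _ in 1:n
        u = rk4_step(f, u, p, h)
    end
    return u
end

# Hypothetical individual parameters (starting values from the control streams).
p = (CL = 25.0, Vc = 86.0, kin = 7.0, kout = 0.3, ic50 = 6.0, imax = 1.0)
u0 = (100.0, p.kin / p.kout)   # 100 units in Central, Response at baseline

central_end, response_end = integrate(irm_rhs, u0, p, 24.0, 2400)
println((central_end, response_end))
```

Because the Central equation is linear, its numerical solution can be checked against 100*exp(-CL/Vc*t). The Response equation has no such closed form, which is why a numerical integrator is needed for this model.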
3.5 $PK
The $PK statement specifies the model parameters including random effects and covariate models in NONMEM.
$PK
CL = CLI ;Individual clearance estimates from PK analysis
V = VI ;Individual Volume estimates
S1 = V/1000 ;To get concentration in ng/ml
KIN = THETA(1)*EXP(ETA(1))
KOUT = THETA(2)*EXP(ETA(2))
IC50 = THETA(3)*EXP(ETA(3))
F2 = KIN/KOUT ;Initializing to baseline; R0 = kin/kout
This time there are no reserved keywords in the Pumas model because we are using a handwritten system, but NONMEM does make use of some reserved keywords. F2 is the bioavailability of drug going into A(2) (the response compartment). Note that we do not actually have a dose affecting the response this way; it is a way to set the initial conditions. In the dataset, there are doses of 1 unit going into compartment 2, and if we multiply this by KIN/KOUT, we get that A(2) starts at KIN/KOUT. Notice that this is a bit of a strange way to do it, as there is the option to use A_0(2) = KIN/KOUT in $PK.
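The arithmetic behind the F2 trick is short enough to check by hand. The sketch below, in plain Julia with hypothetical values (the $THETA starting values for KIN and KOUT), shows that a dose of 1 unit scaled by F2 puts the response at KIN/KOUT, and that this is the steady state of the response equation when no drug is present.

```julia
kin, kout = 7.0, 0.3       # hypothetical values, taken from the $THETA starting values
f2 = kin / kout            # F2 as defined in $PK
a2_start = 1.0 * f2        # dose of 1 unit into compartment 2, scaled by F2

# With no drug present, INH = 0, so the response equation reduces to KIN - KOUT*A(2).
dresponse_dt = kin * (1 - 0.0) - kout * a2_start
println((a2_start, dresponse_dt))
```

The derivative is zero up to floating-point rounding, so the response stays at baseline until the drug starts to inhibit the input.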
This results in the following @pre block (the Pumas equivalent to $PK)
@covariates CLi Vci
@pre begin
kin = tvkin*exp(η[1])
kout = tvkout*exp(η[2])
ic50 = tvic50*exp(η[3])
bsl = kin/kout
imax = tvimax
CL = CLi
Vc = Vci
end
We included the imax parameter because it belongs in the model, but it will be fixed to 1 later. The two covariates we bring in (CLi, Vci) are the predicted individual parameters from the PK model. So far, the Pumas model looks as follows.
irm1 = @model begin
@metadata begin
desc = "POPULATION PK-PD MODELING"
timeu = u"hr" # hour
end
@covariates CLi Vci
@pre begin
kin = tvkin*exp(η[1])
kout = tvkout*exp(η[2])
bsl = kin/kout
ic50 = tvic50*exp(η[3])
imax = tvimax
CL = CLi
Vc = Vci
end
@init begin
Response = bsl
end
@dynamics begin
Central' = -CL/Vc*Central
Response' = kin*(1 - imax*(Central/Vc)/(ic50 + Central/Vc)) - kout*Response
end
end
3.6 $ERROR
The statistical model of the modeled concentrations is specified in the $ERROR statement.
$ERROR
CP1 = A(1)/S1
IPRED = F
Y = F+F*ERR(1)+ERR(2)
Here, F specifies the prediction of the model given the parameters and solution. Since this statement is evaluated with etas during estimation, F is assigned to IPRED here. Then we build the error model with ERR statements. In Pumas, we do not build the distribution of Y like this. Rather, we specify the distributional assumption directly.
@derived begin
HIST ~ @. Normal(Response, sqrt(σ_add_pd^2 + (Response*σ_prop_pd)^2))
end
Here, σ_add_pd is the standard deviation associated with ERR(2) in the NONMEM model and σ_prop_pd is the same for ERR(1).
3.7 $THETA, $OMEGA, $SIGMA
Finally, we get to the last part of the model code: the fixed effects and random effects specification. In Pumas, these are typically placed first instead of last. If we start with the fixed effects, these are specified as follows in the control stream.
$THETA (0.1,7) ;POPKin
$THETA (0.01,0.3) ;POPKout
$THETA (0.1,6) ;POPIC50
$OMEGA 0.09 ;BSV Kin
$OMEGA 0.09 ;BSV Kout
$OMEGA 0.09 ;BSV IC50
$SIGMA 0.01 ;ERRCCV
$SIGMA 1 ;ERRADD
In Pumas, this is equivalent to:
@param begin
tvkin ∈ RealDomain(lower=0)
tvkout ∈ RealDomain(lower=0)
tvic50 ∈ RealDomain(lower=0)
tvimax ∈ RealDomain(lower=0)
Ω ∈ PDiagDomain(3)
σ_add_pd ∈ RealDomain(lower=0)
σ_prop_pd ∈ RealDomain(lower=0)
end
Since we specify the “SIGMA” parameters as standard deviations in this example in Pumas, we have to take the square root of the NONMEM initial values, which are variances. We put no restriction on names and where different parameters can be used in other blocks, but we do suggest the best practice of using Ω for variance-covariance matrices for random effects, ω for individual standard deviations used in scalar random effect specifications, and σ for standard deviations used in error models. If you prefer to specify variances over standard deviations, we suggest using ω² and σ² respectively.
In NONMEM, the random effect specification was implicit based on the OMEGAs. In Pumas, we have more flexibility with respect to the specification of named random effects and the distributions of them (say, a Beta distributed random effect for a bioavailability). In the current case study, the random effects specification is simple.
@random begin
η ~ MvNormal(Ω)
end
Then, the final model looks as follows.
irm1 = @model begin
@metadata begin
desc = "POPULATION PK-PD MODELING"
timeu = u"hr" # hour
end
@param begin
tvkin ∈ RealDomain(lower=0)
tvkout ∈ RealDomain(lower=0)
tvic50 ∈ RealDomain(lower=0)
tvimax ∈ RealDomain(lower=0)
Ω ∈ PDiagDomain(3)
σ_add_pd ∈ RealDomain(lower=0)
σ_prop_pd ∈ RealDomain(lower=0)
end
@random begin
η ~ MvNormal(Ω)
end
@covariates CLi Vci
@pre begin
kin = tvkin*exp(η[1])
kout = tvkout*exp(η[2])
bsl = kin/kout
ic50 = tvic50*exp(η[3])
imax = tvimax
CL = CLi
Vc = Vci
end
@init begin
Response = bsl
end
@dynamics begin
Central' = -CL/Vc*Central
Response' = kin*(1 - imax*(Central/Vc)/(ic50 + Central/Vc)) - kout*Response
end
@derived begin
HIST ~ @. Normal(Response, sqrt(σ_add_pd^2 + (Response*σ_prop_pd)^2))
end
end
As you may have noticed, we write the parameters, random effects, PK parameters, dynamical system specification, and error model in a different order than in the original control stream. This is because we find the code clearer to read when things are defined before they are used.
3.8 $ESTIMATION
Just as we had for the $DATA and $INPUT statements, we have a difference between the two systems when it comes to $ESTIMATION. In Pumas, we do not include this information in the “model” because the workflow is typically more interactive.
$ESTIMATION METHOD=0 MAXEVAL=9990 PRINT=10 POSTHOC
The equivalent to the above is something like
param0 = (
tvkin = 5.4,
tvkout = 0.3,
tvic50=3.9,
tvimax=1.0,
Ω = Diagonal([0.2, 0.2, 0.2]),
σ_add_pd=0.05,
σ_prop_pd=0.05)
inf1cmt_results = fit(irm1, population_pd, param0, Pumas.FOCE(); constantcoef=(tvimax=1.0,), optim_options=(show_every=10, iterations=9990,))
Here, METHOD=0 corresponds to FO() in Pumas (the call above uses FOCE(), so swap in FO() for a literal translation), MAXEVAL is the same as the iterations key in optim_options, and PRINT is roughly equivalent to show_every. The constantcoef argument keeps tvimax fixed at 1. POSTHOC can be obtained by calling the empirical_bayes function later in the script. NONMEM needs the POSTHOC option because FO does not need to estimate the empirical Bayes estimates (EBEs) during fitting; POSTHOC forces the calculation of the EBEs at the end.
3.9 $COVARIANCE
In Pumas, the $COVARIANCE step is equivalent to the infer function.
$COVARIANCE
To calculate the asymptotic variance-covariance matrix, use the infer function on the fit output and grab the vcov field.
model_infer = infer(inf1cmt_results)
model_vcov = model_infer.vcov
3.10 $TABLE
Since each NONMEM run is invoked based on the control stream, it is necessary to specify what to output when the fitting and inference steps have completed. In Pumas, it is possible to work with the objects interactively and save what is needed when the user wishes.
$TABLE ID TIME HIST IPRED AMT CMT RATE
The above information can also be saved by computing the inspect quantities, converting them to a DataFrame, and saving them to a file.
inf1cmt_insp = inspect(inf1cmt_results)
df_inspect = DataFrame(inf1cmt_insp)
CSV.write("inspect_file.csv", df_inspect)
With that, the PD model is translated.
4 Conclusion
In this tutorial, we saw how to translate the ACCP Case Study III from NONMEM to Pumas. Hopefully, this helps new users connect the dots and understand how one section of a NONMEM control stream relates to a model block or function call in Pumas. This case study was a little more complicated because it carried parameters from one model into the other, but using the functionality for subject covariates we could easily do sequential modeling in Pumas.