Statistical Models Without Differential Equations

Author

Patrick Kofod Mogensen

using Pumas
using AlgebraOfGraphics
using CairoMakie
using StableRNGs

1 Introduction

This tutorial represents a slight deviation from most other models presented and performed in Pumas that often includes pharmacokinetic modeling. Here, we will focus on simple Exposure-Response (ER) models. ER models can be models of efficacy, safety, toxicity, or any other response. The response can be continuous in nature, it can be binary, or even multinomial such as ordinal pain score models.

The main point of the tutorial is to introduce the situation where there is data that does not come with an inherent dynamical system component. Instead, we want to study the effect of an exposure measure such as plasma concentration (potentially as a proxy for biophase concentration) or AUC on some kind of response. The response could be almost any response you can think of from the total area of the body affected by eczema, a pain-score, pain relief indicator, body weight, presence of vomiting, or any other efficacy, safety, or toxicity measure. To keep things simple and focus on higher level ideas we go with a sigmoidal Emax model (with a Hill exponent) for some hypothetical response given some level of plasma concentrations.

# Data generating parameters
Emax = 100
C50 = 40
h = 2
# Function to evaluate the Emax model
hill_model(cp, emax, c50, hill) = emax * cp^hill / (c50^hill + cp^hill)
hill_model (generic function with 1 method)

The Hill function is well-known in pharmacodynamics when we have an exposure-response relationship that has an upper effect limit. The Hill parameter allows us to change where in the exposure domain the change in effect strongest. It includes two additional parameters: C50 and Emax. First, is the half maximal effective concentration C50 (sometimes EC50) parameter that is the exposure (here concentration, hence C) that leads to half of the response in excess of a baseline value. Here, the baseline is zero. Second, is the parameter Emax that is simply the maximum effect or response that can occur. If we sample with a technique that truly has additive error the observed maximal effect could in principle be above Emax.

Let us draw such a function with the data generating parameters.

Show plotting code
hill_plot =
    data((exposure = 0:200, effect = hill_model.(0:200, Emax, C50, h))) *
    mapping(:exposure, :effect) *
    visual(Lines) +
    data((exposure = [C50], effect = [0, hill_model(C50, Emax, C50, h)])) *
    mapping(:exposure, :effect) *
    visual(Lines; linestyle = :dot) +
    data((exposure = [0, C50], effect = [hill_model(C50, Emax, C50, h)])) *
    mapping(:exposure, :effect) *
    visual(Lines; linestyle = :dot) +
    data((exposure = [0, 200], effect = [Emax])) *
    mapping(:exposure, :effect) *
    visual(Lines; linestyle = :dot)

axis_spec = (;
    axis = (
        limits = (0, 200, 0, Emax + 10),
        xticks = ([0, C50, 100, 200], ["0", "C50", "100", "200"]),
        yticks = ([0, 25, Emax / 2, 75, Emax], ["0", "25", "Emax/2", "75", "Emax"]),
    )
)
draw(hill_plot; axis_spec...)

Since we have no model to explain the exposure but it is rather just a measured quantity we will simply sample exposures for this example. To ensure proper behavior of the estimator, we will make sure to sample well above C50.

2 Data Without Any Events

One basic property of the data for this kind of analysis we need to consider is that there are no events in the provided data. By events we think of the usual dose events associated with infusion rates and durations, bolus dose amounts, and other information about the specific mechanisms at play. We normally parse this information and use it to explicitly build a model for the trajectories of pharmacokinetic variables and how they might influence pharmacodynamics - but not here! We still require a time column, but often it may only be used to separate individual observations within subjects instead of being used for dynamical modeling.

Instead of using a predefined dataset we will construct a simple one as described in the previous section using sampled concentrations, the Hill model, and an additive error model. We need at least id, time, and observations to define an eventless dataset, but to drive the Emax model we need to include cp_i that are the measured or predicted exposures. The observations will be called resp here for response.

# Define the number of concentrations to sample
N = 40
# Define the random number generator
rng = StableRNG(983)
# Sample concentrations from a log-normal distribution
cp_i = rand(rng, LogNormal(log(C50 + 5), 0.6), N)
# Generate response variables given the exposure, cp_i and parameters for the Hill model
resp = @. rand(rng, Normal(hill_model(cp_i, Emax, C50, h), 3.2))
# Combine results into a DataFrame
response_df = DataFrame(id = 1:N, time = 1, cp_i = cp_i, resp = resp)
40×4 DataFrame
15 rows omitted
Row id time cp_i resp
Int64 Int64 Float64 Float64
1 1 1 29.9617 32.829
2 2 1 67.714 78.3184
3 3 1 58.7976 69.3724
4 4 1 37.54 46.1287
5 5 1 58.5205 67.6474
6 6 1 51.4522 61.8933
7 7 1 106.938 84.1643
8 8 1 17.5912 21.9294
9 9 1 14.974 11.6478
10 10 1 40.0817 47.5374
11 11 1 67.7002 77.037
12 12 1 38.6564 43.8903
13 13 1 31.8099 40.5693
29 29 1 109.159 82.9605
30 30 1 21.5369 22.6964
31 31 1 79.237 82.8221
32 32 1 94.6278 82.5703
33 33 1 53.9975 59.4201
34 34 1 44.5625 52.5194
35 35 1 130.494 92.3875
36 36 1 66.5144 75.9562
37 37 1 18.7343 22.4305
38 38 1 62.8681 71.8646
39 39 1 63.7703 73.3971
40 40 1 48.1412 64.6768
Show plotting code
draw(hill_plot + data(response_df) * mapping(:cp_i, :resp) * visual(Scatter); axis_spec...)

2.1 Defining Pumas Population Without Events

To map from tabular data in response_df to a Population in response_pop we use read_pumas just as we did in the case with event data. The important part is to turn off event_data to disable checks that are not relevant to this eventless example. If event_data is not set to false we would get errors about missing event columns for example.

response_pop = read_pumas(
    response_df,
    id = :id,
    time = :time,
    covariates = [:cp_i],
    observations = [:resp],
    event_data = false,
)
Population
  Subjects: 40
  Covariates: cp_i
  Observations: resp

2.2 A Model Without Dynamics

Above we simply defined the input exposure and output response as simple arrays and function evaluations to defer the PumasModel definition until after seeing how to load eventless datasets. However, to perform a maximum likelihood fit we need to define a proper PumasModel.

response_model = @model begin
    @param begin
        θemax  RealDomain(lower = 0, init = 90)
        θc50  RealDomain(lower = 0, init = 30)
        θhill  RealDomain(lower = 0, init = 3)
        σ  RealDomain(lower = 1e-5, init = 0.1)
    end
    @covariates cp_i
    @pre begin
        emax_i = hill_model(cp_i, θemax, θc50, θhill)
    end
    @derived begin
        resp ~ @. Normal(emax_i, σ)
    end
end
PumasModel
  Parameters: θemax, θc50, θhill, σ
  Random effects:
  Covariates: cp_i
  Dynamical system variables:
  Dynamical system type: No dynamical model
  Derived: resp
  Observed: resp

2.3 Fitting

To fit the model, we simply invoke fit with the model, population, parameters, and likelihood approximation method. Since there are no random effects in this model there is no integral to approximate so we use NaivePooled(). This will perform a maximum likelihood estimation according to the distribution used in @derived. The parameters in this case will match a non-linear least squares fit since we have an additive gaussian error model, but inference may deviate as usual.

emax_fit = fit(response_model, response_pop, init_params(response_model), NaivePooled())
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter     Function value   Gradient norm 
     0     2.191392e+05     2.289229e+06
 * time: 0.03755307197570801
     1     6.818291e+04     2.824317e+05
 * time: 1.5713469982147217
     2     4.291959e+04     9.008422e+04
 * time: 1.571563959121704
     3     2.526194e+04     7.438611e+04
 * time: 1.5716519355773926
     4     1.555090e+04     5.381298e+04
 * time: 1.5717310905456543
     5     9.814559e+03     6.070009e+04
 * time: 1.5718069076538086
     6     7.614303e+03     5.099953e+04
 * time: 1.571882963180542
     7     6.795333e+03     3.666906e+04
 * time: 1.5719599723815918
     8     6.332366e+03     2.139158e+04
 * time: 1.57204008102417
     9     5.860625e+03     1.311493e+04
 * time: 1.572119951248169
    10     5.219282e+03     1.507053e+04
 * time: 1.5721979141235352
    11     4.162225e+03     1.424461e+04
 * time: 1.5722899436950684
    12     2.938417e+03     1.622318e+04
 * time: 1.572371006011963
    13     2.319141e+03     1.430247e+04
 * time: 1.5724480152130127
    14     1.257223e+03     6.948150e+03
 * time: 1.5725231170654297
    15     7.239484e+02     3.073878e+03
 * time: 1.572601079940796
    16     4.221635e+02     1.591054e+03
 * time: 1.5726780891418457
    17     2.681901e+02     8.356172e+02
 * time: 1.5727550983428955
    18     1.904231e+02     4.511418e+02
 * time: 1.5728459358215332
    19     1.552170e+02     2.590796e+02
 * time: 1.5729219913482666
    20     1.412507e+02     1.609948e+02
 * time: 1.5729999542236328
    21     1.368116e+02     1.106316e+02
 * time: 1.5730769634246826
    22     1.357202e+02     8.485850e+01
 * time: 1.5731520652770996
    23     1.353898e+02     7.013620e+01
 * time: 1.5732290744781494
    24     1.350937e+02     5.450614e+01
 * time: 1.5733110904693604
    25     1.347410e+02     3.733061e+01
 * time: 1.5733869075775146
    26     1.345492e+02     6.266036e+01
 * time: 1.5734639167785645
    27     1.345033e+02     6.975752e+01
 * time: 1.5735399723052979
    28     1.344943e+02     6.841408e+01
 * time: 1.573617935180664
    29     1.344938e+02     6.745130e+01
 * time: 1.5736920833587646
    30     1.344931e+02     6.665860e+01
 * time: 1.5737669467926025
    31     1.344910e+02     6.496068e+01
 * time: 1.5738420486450195
    32     1.344858e+02     6.244698e+01
 * time: 1.5739150047302246
    33     1.344718e+02     5.817264e+01
 * time: 1.5739901065826416
    34     1.344356e+02     5.122402e+01
 * time: 1.5740649700164795
    35     1.343407e+02     4.121832e+01
 * time: 1.5741419792175293
    36     1.340960e+02     5.595102e+01
 * time: 1.5742180347442627
    37     1.334812e+02     7.977600e+01
 * time: 1.57430100440979
    38     1.320677e+02     1.141588e+02
 * time: 1.5743789672851562
    39     1.295160e+02     1.330673e+02
 * time: 1.574455976486206
    40     1.270177e+02     7.901632e+01
 * time: 1.5745329856872559
    41     1.264533e+02     2.155415e+01
 * time: 1.5746099948883057
    42     1.262863e+02     8.225580e+00
 * time: 1.5746889114379883
    43     1.262719e+02     6.032004e+00
 * time: 1.5747649669647217
    44     1.262707e+02     4.659021e+00
 * time: 1.5748400688171387
    45     1.262707e+02     4.722942e+00
 * time: 1.5749149322509766
    46     1.262706e+02     4.816900e+00
 * time: 1.57499098777771
    47     1.262705e+02     4.995290e+00
 * time: 1.5750648975372314
    48     1.262702e+02     5.268485e+00
 * time: 1.575139045715332
    49     1.262694e+02     5.722605e+00
 * time: 1.5752160549163818
    50     1.262674e+02     6.456700e+00
 * time: 1.5752949714660645
    51     1.262619e+02     7.663981e+00
 * time: 1.5753920078277588
    52     1.262474e+02     9.659510e+00
 * time: 1.5754780769348145
    53     1.262080e+02     1.301621e+01
 * time: 1.5755629539489746
    54     1.260905e+02     1.869585e+01
 * time: 1.5756521224975586
    55     1.254864e+02     1.996336e+01
 * time: 1.575740098953247
    56     1.242755e+02     2.959843e+01
 * time: 1.5758230686187744
    57     1.218934e+02     3.470690e+01
 * time: 1.575913906097412
    58     1.211416e+02     1.164980e+02
 * time: 1.5760118961334229
    59     1.173281e+02     1.882403e+02
 * time: 1.576111078262329
    60     1.148987e+02     6.454397e+01
 * time: 1.5761959552764893
    61     1.142754e+02     5.925562e+01
 * time: 1.576301097869873
    62     1.139379e+02     4.781775e+01
 * time: 1.5763919353485107
    63     1.134167e+02     4.884944e+01
 * time: 1.576474905014038
    64     1.110742e+02     6.710274e+01
 * time: 1.5765600204467773
    65     1.071014e+02     7.528856e+01
 * time: 1.5766448974609375
    66     1.048398e+02     6.135959e+01
 * time: 1.576740026473999
    67     1.031549e+02     6.415724e+01
 * time: 1.576841115951538
    68     1.023007e+02     1.955391e+01
 * time: 1.576936960220337
    69     1.019906e+02     1.355104e+01
 * time: 1.5770199298858643
    70     1.019243e+02     1.030982e+01
 * time: 1.5771069526672363
    71     1.019192e+02     1.001753e+00
 * time: 1.5771920680999756
    72     1.019190e+02     3.009678e-01
 * time: 1.5772819519042969
    73     1.019189e+02     4.438935e-02
 * time: 1.5773730278015137
    74     1.019189e+02     1.971922e-03
 * time: 1.5774590969085693
    75     1.019189e+02     2.678153e-05
 * time: 1.5775480270385742
FittedPumasModel

Dynamical system type:          No dynamical model

Number of subjects:                             40

Observation records:         Active        Missing
    resp:                        40              0
    Total:                       40              0

Number of parameters:      Constant      Optimized
                                  0              4

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                   -101.91895

-----------------
        Estimate
-----------------
θemax   104.13
θc50     41.558
θhill     1.8147
σ         3.0927
-----------------

and we may use the usual workflow to get estimates of parameter uncertainty

infer(emax_fit)
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
Asymptotic inference results using sandwich estimator

Dynamical system type:          No dynamical model

Number of subjects:                             40

Observation records:         Active        Missing
    resp:                        40              0
    Total:                       40              0

Number of parameters:      Constant      Optimized
                                  0              4

Likelihood approximation:              NaivePooled
Likelihood optimizer:                         BFGS

Termination Reason:                   GradientNorm
Log-likelihood value:                   -101.91895

--------------------------------------------------
        Estimate   SE        95.0% C.I.
--------------------------------------------------
θemax   104.13     4.1798    [ 95.942 ; 112.33  ]
θc50     41.558    2.1123    [ 37.418 ;  45.698 ]
θhill     1.8147   0.1155    [  1.5883;   2.0411]
σ         3.0927   0.27055   [  2.5624;   3.6229]
--------------------------------------------------

as well as inspect. Diagnostics for these models are typically simple enough that they can be constructed from the DataFrame constructed from inspect output.

2.4 Extensions

Since we used a normal PumasModel we can extend the response analysis with:

  • covariate effects including time, dose level, etc
  • random effects if there are multiple observations per subject
  • more complicated response models such as binary response and ordinal response

3 Concluding Remarks

This tutorial introduced a simple Exposure-Response (ER) modeling approach using the Emax model, focusing on cases where no explicit time component or dynamic system is involved. By working with eventless data, we demonstrated how to structure the dataset, define a model without pharmacokinetics, and fit it using maximum likelihood estimation in Pumas.

The methodology presented provides a foundation for analyzing ER relationships in various contexts, from efficacy assessments to safety evaluations. While we used a basic model, the approach can be extended to incorporate covariate effects, random effects, and more complex response types, making it highly adaptable to different modeling needs. By leveraging these capabilities, users can perform robust ER analyses even in the absence of traditional pharmacokinetic modeling constraints.