`using Pumas`

# Introduction to Bayesian Models in Pumas

# 1 Introduction to Bayesian Models in Pumas

This notebook will provide the basics on how to specify and fit Bayesian models in Pumas.

## 1.1 Bayesian Statistics

Bayesian statistics is a data analysis approach based on Bayes’ theorem where available knowledge about the parameters of a statistical model is updated with the information of observed data (Gelman et al., 2013; McElreath, 2020).

Previous knowledge is expressed as a prior distribution and combined with the observed data in the form of a likelihood function to generate a posterior distribution. The posterior can also be used to make predictions about future events.

## 1.2 Advantages of Bayesian analysis

The Bayesian workflow allows analysts to:

- Incorporate domain knowledge and insights from previous studies using prior distributions.
- Quantify the epistemic uncertainty in the model parameters’ values. Parameter values can be uncertain in reality due to model non-identifiability or practical non-identifiability issues because of lack of data.

This epistemic uncertainty can be further propagated forward via simulation to get a distribution of predictions which when used to make decisions, e.g. dosing decisions, can make those decisions more robust to changes in the parameter values. The above are advantages of Bayesian analysis which the traditional frequentist workflow typically doesn’t have a satisfactory answer for. Using bootstrapping or asymptotic estimates of standard errors can be considered somewhat ad-hoc and assumptions methods to quantify uncertainty in the parameter estimates when Bayesian inference uses the established theory of probability to more rigorously quantify said uncertainty with fewer assumptions about the model.

It is important to note that one can still reap the second benefit of Bayesian analysis due to its flexibility even when little to no domain knowledge is imposed on the model. So the Bayesian workflow doesn’t force you to incorporate your domain knowledge but it empowers you to if you want.

Let’s formalize some of the advantages.

First, Bayesian Statistics uses **probabilistic statements** in terms of:

- one or more parameters \(\theta\)
- unobserved data \(\tilde{y}\)

These statements are **conditioned on the observed values of \(y\)**:

- \(P(\theta \mid y)\)
- \(P(\tilde{y} \mid y)\)

We also, implicitly, **condition on the observed data from any covariate \(x\)**

Generally, we are interested in:

**expected response of a new subject to a drug**, e.g. \(\operatorname{E}[\hat{y} \mid y]\)- the
**probability of drug effect is higher than zero**, e.g. \(P(\theta > 0 \mid y) \geq 0.95\)

## 1.3 How to specify Bayesian models in Pumas?

Let’s revisit the 1-compartment model we specified in the Introduction to Pumas tutorial:

```
@model begin
@param begin
∈ RealDomain(lower = 0, init = 3.2)
tvcl ∈ RealDomain(lower = 0, init = 16.4)
tvv ∈ RealDomain(lower = 0, init = 3.8)
tvka ∈ PDiagDomain(init = [0.04, 0.04, 0.04])
Ω ∈ RealDomain(lower = 0.0001, init = 0.2)
σ_p end
@random begin
~ MvNormal(Ω)
η end
@covariates begin
Doseend
@pre begin
= tvcl * exp(η[1])
CL = tvv * exp(η[2])
Vc = tvka * exp(η[3])
Ka end
@dynamics Depots1Central1
@derived begin
:= @. Central / Vc
cp ~ @. Normal(cp, abs(cp) * σ_p)
Conc end
end
```

Bayesian models in Pumas are quite easy: **you just need to specify priors for all parameters in the @param block**.

Instead of having:

```
@param begin
∈ RealDomain(lower = 0, init = 3.2)
tvcl ...
end
```

We would say that `tvcl`

has a prior:

```
@param begin
~ LogNormal(log(1.5), 1)
tvcl ...
end
```

where `tvcl`

(typical value for clearance) is the parameter that we want the model to estimate and `LogNormal(log(1.5), 1)`

is the prior we are giving to `tvcl`

.

**You can use any prior you want**. Here we are giving `tvcl`

a `LogNormal`

prior since it has a positive-only domain and allows us to give more weight to certain values while also making it possible for higher values due to the wide-tail nature.

There is no one prior that fits all cases and generally it might be a good practice to follow a previous similar study’s priors where a good reference can be found.

However, if you have the task of choosing good prior distributions for an all-new model, it will generally be a **multi-step process consisting of**:

**Deciding the support of the prior**. The support of the prior distribution must match the the domain of the parameter. For example, different priors can be used for positive parameters than those for parameters between 0 and 1.**Deciding the center of the prior**, e.g. mean, median or mode.**Deciding the strength (aka informativeness) of the prior**. This is often controlled by a**standard deviation**or**scale parameter**in the distribution constructor. A*small*standard deviation or scale parameter implies low uncertainty in the parameter value which leads to a stronger (aka more informative) prior. A*large*standard deviation or scale parameter implies high uncertainty in the parameter value which leads to a weak (aka less informative) prior. You should study each prior distribution you are considering before using it to make sure the strength of the prior reflects your confidence level in the parameter values.**Deciding the shape of the probability density function (PDF) of the prior**. Some distributions are left skewed, others are right skewed and some are symmetric. Some have heavier tails than others. You should make sure the shape of the PDF reflects your knowledge about the parameter value prior to observing the data.

#### 1.3.0.1 Data preparation for modeling

We’ll use the `PharmaDatasets.jl`

package to load the data:

```
using PharmaDatasets
= dataset("pk_painrelief")
pkpain_df first(pkpain_df, 5)
```

Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose |
---|---|---|---|---|---|---|---|

Int64 | Float64 | Float64 | Int64 | Int64 | Int64 | String7 | |

1 | 1 | 0.0 | 0.0 | 0 | 3 | 1 | 20 mg |

2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg |

3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg |

4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg |

5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg |

Let’s filter out the placebo data as we don’t need that for the PK analysis:

```
using DataFramesMeta
= @rsubset pkpain_df :Dose != "Placebo"; pkpain_noplb_df
```

If you want to learn more about **data wrangling**, don’t forget to check our **Data Wrangling in Julia** tutorials!

Also we need to add the `:amt`

column:

`@rtransform! pkpain_noplb_df :amt = :Time == 0 ? parse(Int, chop(:Dose; tail = 3)) : missing;`

PumasNDF requires the presence of `:evid`

and `:cmt`

columns in the dataset:

```
@rtransform! pkpain_noplb_df begin
:evid = :Time == 0 ? 1 : 0
:cmt = :Time == 0 ? 1 : 2
end;
```

Further, observations at time of dosing, i.e., when `evid = 1`

have to be `missing`

:

`@rtransform! pkpain_noplb_df :Conc = :evid == 1 ? missing : :Conc;`

Here is the final result:

`first(pkpain_noplb_df, 10)`

Row | Subject | Time | Conc | PainRelief | PainScore | RemedStatus | Dose | amt | evid | cmt |
---|---|---|---|---|---|---|---|---|---|---|

Int64 | Float64 | Float64? | Int64 | Int64 | Int64 | String7 | Int64? | Int64 | Int64 | |

1 | 1 | 0.0 | missing | 0 | 3 | 1 | 20 mg | 20 | 1 | 1 |

2 | 1 | 0.5 | 1.15578 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |

3 | 1 | 1.0 | 1.37211 | 1 | 0 | 0 | 20 mg | missing | 0 | 2 |

4 | 1 | 1.5 | 1.30058 | 1 | 0 | 0 | 20 mg | missing | 0 | 2 |

5 | 1 | 2.0 | 1.19195 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |

6 | 1 | 2.5 | 1.13602 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |

7 | 1 | 3.0 | 0.873224 | 1 | 0 | 0 | 20 mg | missing | 0 | 2 |

8 | 1 | 4.0 | 0.739963 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |

9 | 1 | 5.0 | 0.600143 | 0 | 2 | 0 | 20 mg | missing | 0 | 2 |

10 | 1 | 6.0 | 0.425624 | 1 | 1 | 0 | 20 mg | missing | 0 | 2 |

Finally, we’ll import `pkpain_noplb_df`

`DataFrame`

into a `Population`

with `read_pumas`

:

```
= read_pumas(
pkpain_noplb
pkpain_noplb_df,= :Subject,
id = :Time,
time = :amt,
amt = [:Conc],
observations = [:Dose],
covariates = :evid,
evid = :cmt,
cmt )
```

```
Population
Subjects: 120
Covariates: Dose
Observations: Conc
```

### 1.3.1 Pumas Bayesian 1-compartment Model

Let’s then add the priors in the `@param`

block.

Since the original model in the Introduction to Pumas tutorial has `Ω`

as a `PDiagDomain`

, we’ll be using scalars and not a matrix for our between-subject variability parameters. Hence, we’ll define `ω²`

for every diagonal element of the original `Ω`

.

We’ll cover covariance matrices in the Random Effects in Bayesian Models tutorial.

Now we fit our model by using the `fit`

function. Instead of using estimation method as `FOCE`

or `NaivePooled`

, we will be using `BayesMCMC`

which uses No-U-Turn Sampler (NUTS) – a very fast and state-of-the-art Hamiltonian Monte Carlo (HMC) based sampler:

If you want to learn more about Markov chain Monte Carlo (MCMC) methods, Hamiltonian Monte Carlo (HMC), and No-U-Turn Sampler (NUTS); please check the Pumas Bayesian Workflow Documentation at `docs.pumas.ai`

.

```
= @model begin
pk_1cmp @param begin
~ LogNormal(log(3.2), 1)
tvcl ~ LogNormal(log(16.4), 1)
tvv ~ LogNormal(log(3.8), 1)
tvka ~ LogNormal(log(0.04), 0.25)
ω²cl ~ LogNormal(log(0.04), 0.25)
ω²v ~ LogNormal(log(0.04), 0.25)
ω²ka ∈ LogNormal(log(0.2), 0.25)
σ_p end
@random begin
~ Normal(0, sqrt(ω²cl))
ηcl ~ Normal(0, sqrt(ω²v))
ηv ~ Normal(0, sqrt(ω²ka))
ηka end
@covariates begin
Doseend
@pre begin
= tvcl * exp(ηcl)
CL = tvv * exp(ηv)
Vc = tvka * exp(ηka)
Ka end
@dynamics Depots1Central1
@derived begin
:= @. Central / Vc
cp ~ @. Normal(cp, abs(cp) * σ_p)
Conc end
end
```

```
PumasModel
Parameters: tvcl, tvv, tvka, ω²cl, ω²v, ω²ka, σ_p
Random effects: ηcl, ηv, ηka
Covariates: Dose
Dynamical variables: Depot, Central
Derived: Conc
Observed: Conc
```

By default `BayesMCMC`

uses `nsamples = 10_000`

as the number of MCMC samples to generate (including the adaptation samples); and `nadapts = 2_000`

as the number of adaptation steps in the NUTS algorithm which must be less than `nsamples`

. Additionally, `BayesMCMC`

by default uses 4 parallels chains with `nchains = 4`

and `parallel_chains = true`

. Parallelism is also extended to subjects with `parallel_subjects = true`

by default.

Here we’ll use `nsamples = 2_000`

and `nadapts = 1_000`

.

Similar to the original model fit in the Introduction to Pumas tutorial, we’ll fix `tvka`

to 2 as we don’t have a lot of information before `tmax`

.

```
= fit(
pk_1cmp_fit
pk_1cmp,
pkpain_noplb,init_params(pk_1cmp),
BayesMCMC(; nsamples = 2_000, nadapts = 1_000);
= (; tvka = 2),
constantcoef )
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial log probability and its gradient are finite. Check passed.
```

Chains MCMC chain (2000×6×4 Array{Float64, 3}): Iterations = 1:1:2000 Number of chains = 4 Samples per chain = 2000 Wall duration = 467.22 seconds Compute duration = 467.12 seconds parameters = tvcl, tvv, ω²cl, ω²v, ω²ka, σ_p Summary Statistics parameters mean std mcse ess_bulk ess_tail rhat ⋯ Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯ tvcl 3.1915 0.0806 0.0052 242.9638 506.5235 1.0105 ⋯ tvv 13.2615 0.2773 0.0098 784.2330 1450.6731 1.0033 ⋯ ω²cl 0.0729 0.0085 0.0003 825.6482 1413.5741 1.0033 ⋯ ω²v 0.0462 0.0056 0.0001 1765.2645 2873.9651 1.0042 ⋯ ω²ka 1.0922 0.1554 0.0024 5297.6226 4844.7888 1.0007 ⋯ σ_p 0.1044 0.0078 0.0002 6401.3528 4481.2766 1.0000 ⋯ 1 column omitted Quantiles parameters 2.5% 25.0% 50.0% 75.0% 97.5% Symbol Float64 Float64 Float64 Float64 Float64 tvcl 3.0419 3.1343 3.1882 3.2438 3.3561 tvv 12.7337 13.0769 13.2682 13.4410 13.7977 ω²cl 0.0582 0.0671 0.0724 0.0781 0.0907 ω²v 0.0362 0.0423 0.0459 0.0498 0.0577 ω²ka 0.8320 0.9904 1.0846 1.1854 1.4131 σ_p 0.0996 0.1024 0.1040 0.1056 0.1089

Pumas Bayesian models returns a different object that of other Pumas models which can be:

- exported to a
`DataFrame`

containing**all of the MCMC samples**with`DataFrame(Chains(fit_result))`

- summarized to a
`DataFrame`

containing**summary statistics of the MCMC samples**with`DataFrame(summarystats(fit_result))`

- summarized to a
`DataFrame`

containing**quantiles of the MCMC samples**with`DataFrame(quantile(fit_result))`

Here are our estimates both as summary statistics with `summarystats`

and quantiles with `quantile`

:

`DataFrame(summarystats(pk_1cmp_fit))`

Row | parameters | mean | std | mcse | ess_bulk | ess_tail | rhat | ess_per_sec |
---|---|---|---|---|---|---|---|---|

Symbol | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |

1 | tvcl | 3.19153 | 0.0805674 | 0.00521381 | 242.964 | 506.523 | 1.01054 | 0.520128 |

2 | tvv | 13.2615 | 0.277268 | 0.00977303 | 784.233 | 1450.67 | 1.00327 | 1.67886 |

3 | ω²cl | 0.0728857 | 0.00854188 | 0.000296103 | 825.648 | 1413.57 | 1.00327 | 1.76752 |

4 | ω²v | 0.0461588 | 0.00559418 | 0.000133573 | 1765.26 | 2873.97 | 1.00419 | 3.77901 |

5 | ω²ka | 1.09216 | 0.155409 | 0.0024036 | 5297.62 | 4844.79 | 1.00075 | 11.341 |

6 | σ_p | 0.104353 | 0.00780061 | 0.000179481 | 6401.35 | 4481.28 | 0.999995 | 13.7038 |

`DataFrame(quantile(pk_1cmp_fit))`

Row | parameters | 2.5% | 25.0% | 50.0% | 75.0% | 97.5% |
---|---|---|---|---|---|---|

Symbol | Float64 | Float64 | Float64 | Float64 | Float64 | |

1 | tvcl | 3.04191 | 3.13434 | 3.18819 | 3.24376 | 3.35607 |

2 | tvv | 12.7337 | 13.0769 | 13.2682 | 13.441 | 13.7977 |

3 | ω²cl | 0.0581788 | 0.0670766 | 0.0724374 | 0.0781288 | 0.0907436 |

4 | ω²v | 0.0361928 | 0.0423419 | 0.0458682 | 0.0497765 | 0.0576807 |

5 | ω²ka | 0.831977 | 0.990353 | 1.08456 | 1.18539 | 1.41312 |

6 | σ_p | 0.0995631 | 0.102419 | 0.103958 | 0.105616 | 0.108895 |

## 1.4 References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian Data Analysis. Chapman and Hall/CRC.

McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. CRC press.