A Course on Bioequivalence: Unit 5 - The Linear Model

Authors

Yoni Nazarathy

Andreas Noack

This is Unit 5 of a complete bioequivalence analysis course using Pumas. There are 15 units in the course in total.

1 Unit overview

In the previous unit we considered various designs. In this unit we consider analysis of the \(2 \times 2\) crossover design using a linear model. We also consider other designs that use the linear model. In particular we consider:

  • Basics of linear models with categorical variables.
  • The linear model applied to the \(2 \times 2\) crossover design.
  • Interpretation of the variability.
  • Extensions for multiple formulations.
  • Regulatory specifications and examples, also considering older SAS code specifications.
  • A brief exploration of Mixed Models for the \(2 \times 2\) crossover design.

We use the following packages.

using Pumas # available with Pumas products - also reexports Distributions
using NCA # available with Pumas products
using Bioequivalence  # available with Pumas products
using BioequivalencePower  # available with Pumas products
using PharmaDatasets # available with Pumas products
using GLM
using DataFramesMeta
using SummaryTables
using DataFrames
using AlgebraOfGraphics
using CairoMakie
using Random
using LinearAlgebra
using MixedModels

We have used most of these packages in previous units. One new package in this unit is the GLM package, standing for “Generalized Linear Models”. As a Pumas Bioequivalence user, most of the time you do not need to use GLM explicitly, yet for our understanding we will carry out model fits directly with GLM as well. Also, while we have already used the MixedModels package previously, we make more extensive use of it in this unit. We also briefly use the built-in LinearAlgebra standard library, which was not used in previous units.

Note that the unit also presents a few snippets of SAS code for the purpose of understanding how it appears in some regulatory guidelines.

2 Linear Models Overview

Linear models are a fundamental tool in statistics, used to quantify the relationship between a response variable and one or more predictor variables.

In the context of bioequivalence, the response variable is the endpoint on the log scale and the predictor variables are various elements of the design. Before we see how the linear model works for crossover designs, let us consider some basics of linear models.

Basic Linear Regression

The simplest linear model is simple linear regression, where we predict a response variable \(y\) based on one predictor \(x\).

For example (outside the scope of a standard bioequivalence trial), we may consider the relationship between an endpoint value and body weight.

So in this example, \(y\) can be the AUC (for example, in nanogram hours per milliliter) and \(x\) can be the body weight (in kilograms). For this, the model is \[ y = \beta_0 + \beta_1 x + \varepsilon, \] where the parameters of the model are \(\beta_0\), the intercept, and \(\beta_1\), the slope.

Here, \(\varepsilon\) signifies random error around the regression line. We model this error as having a normal distribution with \(0\) mean and a variance \(\sigma^2\). Hence, this \(\sigma\) (or \(\sigma^2\)) is also an unknown parameter.

Now, when we have data of weights \(x_i\) and respective AUC values \(y_i\) for \(i=1,\ldots,n\), we can use these datapoints to estimate \(\beta_0\), \(\beta_1\), and \(\sigma^2\).

The standard way to estimate parameters is via maximum likelihood estimation.

We briefly discussed such an estimation approach in Unit 2 in the context of the log-normal distribution. Now, for basic linear regression, it is carried out by considering the likelihood function.

Here is the expression of the likelihood function (we present it here only as an illustration): \[ L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right). \] It turns out that maximizing this expression is equivalent to:

  • Solving a least squares minimization problem to estimate \(\beta_0\) and \(\beta_1\), where the estimators are denoted \(\hat{\beta}_0\) and \(\hat{\beta}_1\).
  • Estimating \(\sigma^2\) by calculating the average squared residual (or, more precisely, the maximum likelihood estimator of the variance):

\[ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2. \]

Note: In standard regression output, an unbiased estimator divides by \(n-2\) instead of \(n\). (Here \(2\) is the number of regression coefficients estimated).
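For concreteness, here is a minimal sketch (with made-up data, purely for illustration) contrasting the maximum likelihood estimator of \(\sigma^2\) with the unbiased estimator, using the residuals of a fitted simple linear regression:

demo_df = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0], y = [2.1, 3.9, 6.2, 7.8, 10.3])
demo_model = lm(@formula(y ~ 1 + x), demo_df)
r = residuals(demo_model) # the residuals yᵢ - β̂₀ - β̂₁xᵢ
n = length(r)
σ̂²_mle = sum(abs2, r) / n # maximum likelihood estimator (divide by n)
σ̂²_unbiased = sum(abs2, r) / (n - 2) # unbiased estimator (divide by n - 2)
println("MLE estimate: $(round(σ̂²_mle, digits = 4)), unbiased estimate: $(round(σ̂²_unbiased, digits = 4))")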

We do not cover the basic ideas of least squares here, and you may proceed with the material even without an understanding of least squares. Yet if you are interested to learn more about least squares, one possible source is Section 2.3 in Liquet, Moka, and Nazarathy (2024). Another possible source is Section 8.1 in Nazarathy and Klok (2021).

Importantly, given some arbitrary weight value \(x\), we can predict the AUC value we expect for that \(x\) via,

\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x. \]

Let us consider a small example with \(n=5\) datapoints. The GLM package provides the lm function (standing for linear model). A statistical model is specified via @formula(AUC ~ 1 + weight) where AUC and weight are columns in a dataframe. The 1 + in the formula stands for the intercept (note that even without this we fit an intercept unless we write 0 +).

df = DataFrame(
    weight = [60, 65, 70, 75, 80], # x values
    AUC = [125.5, 119.1, 137.0, 128.5, 132.2], # y values
)

# Fit simple linear regression model: AUC ~ weight
model = lm(@formula(AUC ~ 1 + weight), df)

# present the coefficient table neatly
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 96.5 29.4 3.28 0.0464 2.85 190
weight 0.456 0.418 1.09 0.356 -0.876 1.79

The result above is a fitted model. In the output you can see the estimated coefficients in the leftmost numerical column (Coef.). There is also additional statistical information, including p-values for assessing whether the coefficients are meaningful in the model.

Specifically, when a p-value is close to \(0\) (e.g. less than \(5\%\)), it suggests that the corresponding coefficient is meaningful within the model. Similarly, there are confidence intervals for the estimated parameters. More on this follows below.

For more clarity, here are the estimated regression line parameters, \(\hat{\beta}_0\) and \(\hat{\beta}_1\) as well as the estimated standard deviation of \(\varepsilon\) which we denote via \(\hat{\sigma}\).

β̂₀, β̂₁ = round.(coef(model), digits = 2)
println("Estimates of regression line parameters: β̂₀ = $(β̂₀), β̂₁ = $(β̂₁)")

# The estimation of σ̂²
rss = deviance(model) # Residual sum of squares
df_resid = dof_residual(model) # n - 2
σ̂ = round(sqrt(rss / df_resid), digits = 2)
println("σ̂: ", σ̂)
Estimates of regression line parameters: β̂₀ = 96.54, β̂₁ = 0.46
σ̂: 6.62

The estimated value of \(\beta_1\), at about \(0.46\), means that an increase in weight of one kilogram is associated with an expected increase in AUC of \(0.46\) ng·h/mL.

The estimated value of \(\hat{\sigma}\) is an indication of the spread of the AUC values around the regression line. To see this further, let us also observe the data and the plotted line visually.

In this case let us use the predict function applied to model:

x_points = range(50, stop = 100, length = 100)
pred = predict(model, DataFrame(weight = x_points))

plt =
    data(df) * mapping(:weight, :AUC) * visual(Scatter) +
    data(DataFrame(weight = x_points, auc_pred = pred)) *
    mapping(:weight, :auc_pred) *
    visual(Lines)

draw(
    plt;
    axis = (
        xlabel = "X: Weight (kg)",
        ylabel = "Y: AUC (ng·h/mL)",
        title = "Linear Regression: AUC (Y) vs. Weight (X)",
        limits = ((50, 100), (100, 150)),
    ),
)

Multiple Linear Regression

The simple linear regression case is easily extended to the case where, in place of the single predictor variable, there are multiple predictor variables.

For example (still outside the scope of a standard bioequivalence trial), assume that AUC is predicted not only by the subject’s weight, but also by height and age. In this case the model can be denoted as,

\[ y = \beta_0 + \beta_1 x^{(1)} + \beta_2 x^{(2)} + \beta_3 x^{(3)} + \varepsilon, \]

where now \(x^{(1)}\) is the weight (kg), \(x^{(2)}\) is the height (cm), and \(x^{(3)}\) is the age (years). Similarly, there could be more variables.

In this case, for each subject \(i=1,\ldots,n\), the data contains the response \(y_i\) and the predictors \(x^{(1)}_i, x^{(2)}_i, x^{(3)}_i\).

With this, the same ideas of maximum likelihood estimation through least squares can yield estimates, \(\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\), as well as \(\hat{\sigma}^2\).

Now again, for some specific \(x^{(1)}, x^{(2)}, x^{(3)}\) values, we can predict the AUC as,

\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x^{(1)} + \hat{\beta}_2 x^{(2)} + \hat{\beta}_3 x^{(3)}. \]

Let us see this via the GLM package where we provide data points for all three predictors (weight, height, and age). As you can see in the code below we use the @formula(AUC ~ 1 + weight + height + age) inside the lm function:

df = DataFrame(
    weight = [60, 65, 70, 75, 80],   # kg
    height = [165, 170, 172, 168, 175], # cm
    age = [25, 25, 27, 32, 31],   # years
    AUC = [126, 130, 134, 131, 136],
)

model = lm(@formula(AUC ~ 1 + weight + height + age), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) -682.0 390 -1.75 0.331 -5642.0 4279
weight -4.67 2.77 -1.69 0.341 -39.8 30.5
height 5.33 2.65 2.01 0.294 -28.3 39
age 8.33 4.77 1.75 0.331 -52.3 68.9

The output of the fit model above now has four rows, with one row for every fit coefficient. In this particular case, the p-values for all coefficients are not small (e.g. none are less than \(5\%\)). This indicates that for each of those predictors there is no clear evidence of an effect of the predictor on the output (AUC). This does not mean that there is no such effect, but only indicates that we do not have evidence of such an effect.

While in practice we will not use this model due to the non-low p-values, for demonstration, let us still consider the estimated parameters:

β̂₀, β̂₁, β̂₂, β̂₃ = round.(coef(model), digits = 2)

println("Estimates of regression line parameters:")
println("β̂₀ (intercept) = $(β̂₀)")
println("β̂₁ (weight)    = $(β̂₁)")
println("β̂₂ (height)    = $(β̂₂)")
println("β̂₃ (age)       = $(β̂₃)")

rss = deviance(model)
df_resid = dof_residual(model)
σ̂ = round(sqrt(rss / df_resid), digits = 2)
println("σ̂: ", σ̂)
Estimates of regression line parameters:
β̂₀ (intercept) = -681.93
β̂₁ (weight)    = -4.67
β̂₂ (height)    = 5.33
β̂₃ (age)       = 8.33
σ̂: 0.73

Categorical Variables in Linear Models

Often, and certainly in bioequivalence studies, we need to model a categorical predictor such as formulation (Test vs. Reference), period, sequence, and others.

This is done by means of multiple linear regression where each categorical variable is represented by multiple boolean (or indicator) variables.

As an example, assume we want to predict AUC as a function of the predictor formulation, and there are three formulations, R, S, and T. The “trick” for handling such a case is to create a linear model,

\[ y = \beta_0 + \beta_1 x^{(\texttt{S})} + \beta_2 x^{(\texttt{T})} + \varepsilon. \]

Now the variables \(x^{(\texttt{S})}\) and \(x^{(\texttt{T})}\) are associated with the formulation predictor as follows:

  • If the formulation is R then both variables are \(0\).
  • If the formulation is S then \(x^{(\texttt{S})} = 1\) and \(x^{(\texttt{T})} = 0\).
  • If the formulation is T then \(x^{(\texttt{S})} = 0\) and \(x^{(\texttt{T})} = 1\).

Hence with categorical predictors, the \(x\) variables are set as boolean (also sometimes called indicator variables).

Such an approach allows us to incorporate categorical variables in the model. With this:

  • In the case of R formulation, \(\hat{y} = \hat{\beta}_0\).
  • In the case of S formulation, \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1\).
  • In the case of T formulation, \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_2\).

In the same way, if there are more categorical predictors, we can create more categorical variables for this purpose.

Observe that a categorical predictor with \(k\) levels (e.g. \(k=3\) in the example above) requires \(k-1\) indicator variables (e.g. \(2\) in this example).

Here is again an example using the GLM package:

df = DataFrame(
    formulation = ["R", "S", "T", "R", "S", "T"],
    AUC = [120.0, 125.5, 130.7, 119.2, 125.1, 132.4],
)

model = lm(@formula(AUC ~ 1 + formulation), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 120 0.555 216 2.2e-7 118 121
formulation: S 5.7 0.784 7.27 0.00537 3.2 8.2
formulation: T 11.9 0.784 15.2 0.000614 9.45 14.4

Let us again use the coefficients to interpret the effects:

# Show fitted coefficients
β̂₀, β̂₁, β̂₂ = round.(coef(model), digits = 2)
println("Estimates of regression line parameters:")
println("β̂₀ (intercept, R)      = $(β̂₀)")
println("β̂₁ (formulation: S)    = $(β̂₁), formulation S effect (β̂₀ + β̂₁) = $(β̂₀ + β̂₁)")
println("β̂₂ (formulation: T)    = $(β̂₂), formulation T effect (β̂₀ + β̂₂) = $(β̂₀ + β̂₂)")

# Estimate of σ̂² (residual variance)
rss = deviance(model)
df_resid = dof_residual(model)
σ̂ = round(sqrt(rss / df_resid), digits = 2)
println("σ̂: ", σ̂)
Estimates of regression line parameters:
β̂₀ (intercept, R)      = 119.6
β̂₁ (formulation: S)    = 5.7, formulation S effect (β̂₀ + β̂₁) = 125.3
β̂₂ (formulation: T)    = 11.95, formulation T effect (β̂₀ + β̂₂) = 131.54999999999998
σ̂: 0.78

We can also see these values directly with the predict function:

newdata = DataFrame(formulation = ["R", "S", "T"])
auc_pred = predict(model, newdata)
println("Predicted AUC for formulations R, S, T: ", round.(auc_pred, digits = 2))
Predicted AUC for formulations R, S, T: [119.6, 125.3, 131.55]
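If you want to see the indicator (boolean) coding explicitly, you can inspect the design matrix of the fitted model. Here is a small sketch using the modelmatrix function applied to the model object fitted above; each row corresponds to a row of df, with an intercept column followed by one indicator column per non-reference level:

# Column 1 is the intercept, column 2 indicates formulation S, column 3 indicates formulation T.
# Rows for R are (1, 0, 0), rows for S are (1, 1, 0), and rows for T are (1, 0, 1).
modelmatrix(model)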

More than one categorical variable

Here is another example of categorical variables with more than one categorical variable.

Assume there is also a location variable which indicates the physical location where the trial took place.

df = DataFrame(
    formulation = ["R", "S", "T", "R", "S", "T"],
    location = ["loc_A", "loc_A", "loc_A", "loc_B", "loc_B", "loc_B"],
    AUC = [120.0, 125.5, 130.7, 119.2, 125.1, 132.4],
)

model = lm(@formula(AUC ~ 1 + formulation + location), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 120 0.775 154 4.21e-5 116 123
formulation: S 5.7 0.95 6 0.0266 1.61 9.79
formulation: T 11.9 0.95 12.6 0.00625 7.86 16
location: loc_B 0.167 0.775 0.215 0.85 -3.17 3.5

In this case, an associated linear model is:

\[ y = \beta_0 + \beta_1 x^{(\texttt{S})} + \beta_2 x^{(\texttt{T})} + \beta_3 x^{(\texttt{loc\_B})} + \varepsilon. \]

In such a model, we can interpret the estimate for the endpoint as follows:

  • R, loc_A: \(\hat{y} = \hat{\beta}_0\).
  • S, loc_A: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1\).
  • T, loc_A: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_2\).
  • R, loc_B: \(\hat{y} = \hat{\beta}_0 \quad \quad ~ +\,\hat{\beta}_3\).
  • S, loc_B: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 + \,\hat{\beta}_3\).
  • T, loc_B: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_2 + \,\hat{\beta}_3\).

To see this, let’s again use the predict function and compare with a manual computation where we use the coef function to extract the coefficients. As you can see, we obtain the same values:

β̂₀, β̂₁, β̂₂, β̂₃ = coef(model)
simple_table(
    DataFrame(
        formulation = df.formulation,
        location = df.location,
        manual = round.(
            [β̂₀, β̂₀ + β̂₁, β̂₀ + β̂₂, β̂₀ + β̂₃, β̂₀ + β̂₁ + β̂₃, β̂₀ + β̂₂ + β̂₃],
            digits = 2,
        ),
        prediction = round.(predict(model, df), digits = 2),
    ),
    round_mode = nothing,
)
formulation location manual prediction
R loc_A 119.52 119.52
S loc_A 125.22 125.22
T loc_A 131.47 131.47
R loc_B 119.68 119.68
S loc_B 125.38 125.38
T loc_B 131.63 131.63

Relationship to ANOVA

When a linear model has a single predictor and that predictor is categorical, as in the example with the formulation variable, the model is mathematically equivalent to a classical one-way analysis of variance (ANOVA). If there are two categorical predictors it is equivalent to a classical two-way ANOVA. See Section 7.3 in Nazarathy and Klok (2021) for an elementary introduction to ANOVA (with the Julia language).

ANOVA is a statistical procedure traditionally used to test whether the means of several groups are equal. In the linear model formulation, the group assignments are represented by indicator variables, and fitting the model amounts to estimating the mean of each group and testing for differences among them.

We also compute ANOVA-style (Type III) p-values for various quantities in the model. More on this in the sequel.
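To illustrate this connection, here is a brief sketch (reusing the three-formulation dataset from above) showing that the fitted values of a linear model with a single categorical predictor are exactly the per-group sample means, which is what one-way ANOVA estimates:

df = DataFrame(
    formulation = ["R", "S", "T", "R", "S", "T"],
    AUC = [120.0, 125.5, 130.7, 119.2, 125.1, 132.4],
)
model = lm(@formula(AUC ~ 1 + formulation), df)
# Per-group sample means of AUC.
group_means = @combine(@groupby(df, :formulation), :mean_AUC = mean(:AUC))
# Fitted values from the linear model, reduced to one row per formulation.
fitted_means = unique(DataFrame(formulation = df.formulation, fitted = predict(model, df)))
println(group_means)
println(fitted_means)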

Nesting or Multilevel (Hierarchical) Models

Another element of linear models that we may consider is nesting of variables.

This is a case where one variable is interpreted separately within each level of another variable.

Do not confuse this with a different topic (not discussed here) called “nested models”.

To see our basic use of nesting, consider this data, where the formulation is either R or T, and within each formulation we have an identifier of the subject, id_within_formulation, which is either 1, 2, or 3.

df = DataFrame(
    formulation = ["R", "R", "R", "R", "R", "R", "T", "T", "T", "T"],
    id_within_formulation = ["1", "1", "2", "2", "3", "3", "1", "1", "2", "3"],
    AUC = [120.0, 125.5, 130.7, 119.2, 125.1, 132.4, 129.3, 130.5, 132.1, 128.8],
)
simple_table(df, round_mode = nothing)
formulation id_within_formulation AUC
R 1 120.0
R 1 125.5
R 2 130.7
R 2 119.2
R 3 125.1
R 3 132.4
T 1 129.3
T 1 130.5
T 2 132.1
T 3 128.8

Observe that the intended meaning is that the subject with id 1 within formulation R is a different subject than the subject with id 1 within formulation T. Similarly for the other ids.

Now if we were to naively specify a linear model formula such as AUC ~ formulation + id_within_formulation, then we get:

model = lm(@formula(AUC ~ formulation + id_within_formulation), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 124 2.81 44 9.16e-9 117 131
formulation: T 5.12 3.08 1.66 0.147 -2.41 12.7
id_within_formulation: 2 1.86 3.63 0.513 0.626 -7.01 10.7
id_within_formulation: 3 3.3 3.63 0.908 0.399 -5.58 12.2

The problem with this approach is that it did not separate id_within_formulation across formulations.

Instead what we want to do is have id_within_formulation nested within formulation. For this we use & in the statistical formula, or more specifically we introduce a term such as:

formulation & id_within_formulation

This term means that we want to consider id_within_formulation separately for each of the formulations. Note that in other languages such as SAS this would appear as id_within_formulation(formulation).

With such a nesting specification, this is what the estimated model looks like now:

model = lm(@formula(AUC ~ formulation + formulation & id_within_formulation), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 123 3.68 33.3 4.84e-6 113 133
formulation: T 7.15 5.21 1.37 0.242 -7.32 21.6
formulation: R & id_within_formulation: 2 2.2 5.21 0.422 0.695 -12.3 16.7
formulation: T & id_within_formulation: 2 2.2 6.38 0.345 0.748 -15.5 19.9
formulation: R & id_within_formulation: 3 6 5.21 1.15 0.314 -8.47 20.5
formulation: T & id_within_formulation: 3 -1.1 6.38 -0.172 0.872 -18.8 16.6

Let us understand the meaning of the coefficients fit for this linear model. For this consider this representation of the linear model with a total of 6 estimated coefficients:

\[ y = \beta_0 + \beta_1 x^{(\texttt{T})} + \beta_2 x^{(\texttt{R\&2})} + \beta_3 x^{(\texttt{T\&2})} + \beta_4 x^{(\texttt{R\&3})} + \beta_5 x^{(\texttt{T\&3})} + \varepsilon. \]

As before, when working with categorical variables, each of these \(x\) variables is set to either \(0\) or \(1\) depending on the data point. For example \(x^{(\texttt{R\&3})} = 1\) only for data points where formulation is R and id_within_formulation is 3.

With these settings, the interpretation of the coefficients is as follows:

  • R, 1: \(\hat{y} = \hat{\beta}_0\).
  • R, 2: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_2\).
  • R, 3: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_4\).
  • T, 1: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1\).
  • T, 2: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 + \,\hat{\beta}_3\).
  • T, 3: \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 + \,\hat{\beta}_5\).

We can verify this by comparing the predict function with a manual computation. Note that unique simply removes repeated rows from the dataframe.

β̂₀, β̂₁, β̂₂, β̂₃, β̂₄, β̂₅ = coef(model)
df_predict = unique(
    DataFrame(
        formulation = df.formulation,
        id_within_formulation = df.id_within_formulation,
        prediction = round.(predict(model, df), digits = 2),
    ),
)
df_predict.manual =
    round.(
        [β̂₀, β̂₀ + β̂₂, β̂₀ + β̂₄, β̂₀ + β̂₁, β̂₀ + β̂₁ + β̂₃, β̂₀ + β̂₁ + β̂₅],
        digits = 2,
    )
simple_table(df_predict, round_mode = nothing)
formulation id_within_formulation prediction manual
R 1 122.75 122.75
R 2 124.95 124.95
R 3 128.75 128.75
T 1 129.9 129.9
T 2 132.1 132.1
T 3 128.8 128.8

Note that in principle we could have had identifier numbers for all subjects, not repeating them within each formulation. However, when it comes to considering Type III p-values, a discussion of a future unit, our implementation requires separating the ids into id_within_formulation as above.

3 Inference with the linear model

The statistical assumptions of a linear model enable us to carry out statistical inference, including confidence intervals, hypothesis tests, and estimation of certain quantities which we call contrasts.

In general a statistical model makes use of specific statistical assumptions. As such, it differs from a machine learning model which is typically focused on finding an input-output relationship between \(x\) and \(y\) with minimal statistical assumptions. See for example, Chapter 2 of Liquet, Moka, and Nazarathy (2024) where linear models are presented as machine learning models.

A theoretical example

The benefit of making these statistical assumptions is that we know the distributional properties of the estimators \(\hat{\beta}\).

To get a feel for this, let us adapt Listing 8.5 from Nazarathy and Klok (2021). Note that this example and code do not have practical bioequivalence value, but are here for pedagogic purposes.

Here assume the simple linear regression case where the actual population \(\beta_0\) is \(2.0\), the actual population \(\beta_1\) is \(1.5\), and the standard deviation of \(\varepsilon\) is \(\sigma = 2.5\). Assume further that we consider a sample of n=10 points over \(x_1 = 1, x_2 = 2\), and up to \(x_{10} = 10\).

Now to get a feel for the distribution of the estimators, let us simulate this sampling many times, e.g. N=10_000 times. This snippet of code then simulates ten thousand samples and for each it records a pair of estimators \((\hat{\beta}_0, \hat{\beta}_1)\) in the array ests:

β₀, β₁, σ = 2.0, 1.5, 2.5
n, N = 10, 10_000

function coef_est()
    x_vals = 1:n
    y_vals = β₀ .+ β₁ * x_vals + rand(Normal(0, σ), n)
    data = DataFrame(X = x_vals, Y = y_vals)
    model = lm(@formula(Y ~ X), data)
    return coef(model)
end

ests = [coef_est() for _ = 1:N];

Here is a scatter plot of the ten thousand simulation repetitions which can give a hint at the distribution of the estimators:

fig = Figure()
ax = Axis(
    fig[1, 1];
    xlabel = L"\hat{\beta}_0",
    ylabel = L"\hat{\beta}_1",
    xlabelsize = 20,
    ylabelsize = 20,
)
scatter!(ax, first.(ests), last.(ests); color = :blue, markersize = 3)
fig

Part of the beauty of linear models is that we know the exact properties of the distributions of these estimators. For example, while the details are not critical, based on \(\sigma\) and the (design) values of \(x\), we know the covariance matrix of the estimator, \(\Sigma\).

See for example Section 8.2 of Nazarathy and Klok (2021) for more details:

x_bar = mean(1:n)
s_xx, s_x2 = sum((x - x_bar)^2 for x = 1:n), sum(x^2 for x = 1:n)
var_0, var_1 = σ^2 * s_x2 / (n * s_xx), σ^2 / s_xx
cov_01 = -σ^2 * x_bar / s_xx
Σ = [
    var_0 cov_01
    cov_01 var_1
]
2×2 Matrix{Float64}:
  2.91667   -0.416667
 -0.416667   0.0757576

These covariance values (including the variances), together with the fact that the estimators follow a (multivariate/bivariate) normal distribution, can even be translated into probabilistic information.

For example, the following code uses a bit of linear algebra and probability distributions (the details can be omitted) to find an ellipse in the \((\hat{\beta}_0, \hat{\beta}_1)\) space which covers \(100\times(1-\alpha)\) percent of the estimates:

α = 0.05
A = cholesky(Σ).L
r = quantile(Rayleigh(), 1 - α)
is_in_ellipse(β̂) = norm(inv(A) * (β̂ - [β₀, β₁])) ≤ r;

We can then apply the is_in_ellipse function to every estimate to see if it is in the ellipse or not. You can see the proportion is close to \(1-\alpha\).

est_in = is_in_ellipse.(ests)
println("Proportion of points inside ellipse: ", mean(est_in))
Proportion of points inside ellipse: 0.9493

Here is a plot presenting this ellipse, with the points inside the ellipse and those outside it. Also, since the estimators are unbiased, the center of the ellipse is at the actual \((\beta_0, \beta_1)\) point:

fig = Figure()
ax = Axis(
    fig[1, 1];
    xlabel = L"\hat{\beta}_0",
    ylabel = L"\hat{\beta}_1",
    xlabelsize = 20,
    ylabelsize = 20,
)

in_ellipse = is_in_ellipse.(ests)
ellipse = [r * A * [cos(t), sin(t)] .+ [β₀, β₁] for t in range(0, 2π; length = 500)]
xs_in, ys_in = first.(ests[in_ellipse]), last.(ests[in_ellipse])
xs_out, ys_out = first.(ests[.!in_ellipse]), last.(ests[.!in_ellipse])

scatter!(ax, xs_in, ys_in; color = :green, markersize = 3)
scatter!(ax, xs_out, ys_out; color = :blue, markersize = 3)
lines!(ax, first.(ellipse), last.(ellipse); color = :red, linewidth = 2)
scatter!(ax, [β₀], [β₁]; color = :red, markersize = 8)
fig

The point of all this is that we have a lot of distributional information in the linear model. We know the properties of the estimators, and based on that, statistical inference theory is developed.

While this distributional information relies on knowing \(\sigma\), in the realistic case where we do not know \(\sigma\) and estimate it from the data, the resulting (standardized) marginal distribution of each \(\hat{\beta}_i\) is closely related to a T-distribution. This allows us to carry out inference.
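To tie this back to the simulation above, here is a short sketch (reusing β₀, β₁, σ, n, and N from the code above) that studentizes the slope estimate of each simulated sample, namely it computes \((\hat{\beta}_1 - \beta_1)/S_{\hat{\beta}_1}\). The proportion of samples whose absolute studentized value exceeds the \(97.5\%\) quantile of a T-distribution with \(n-2\) degrees of freedom should be close to \(5\%\):

# The studentized slope (β̂₁ - β₁) / S_β̂₁ for one simulated sample.
function studentized_slope()
    x_vals = 1:n
    y_vals = β₀ .+ β₁ * x_vals + rand(Normal(0, σ), n)
    data = DataFrame(X = x_vals, Y = y_vals)
    model = lm(@formula(Y ~ X), data)
    return (coef(model)[2] - β₁) / stderror(model)[2]
end

t_stats = [studentized_slope() for _ = 1:N]
t_crit = quantile(TDist(n - 2), 0.975) # 97.5% quantile of the T-distribution with n - 2 df
println("Proportion with |T| above the critical value: ", mean(abs.(t_stats) .> t_crit))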

Basic parameter inference

Let us return to the simple linear regression example from the top of this unit, and now consider confidence intervals and hypothesis tests. These ideas also extend to more complex linear models that involve multiple estimated coefficients. Here is the data example and model fitting once more:

df = DataFrame(weight = [60, 65, 70, 75, 80], AUC = [125.5, 119.1, 137.0, 128.5, 132.2])
model = lm(@formula(AUC ~ 1 + weight), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 96.5 29.4 3.28 0.0464 2.85 190
weight 0.456 0.418 1.09 0.356 -0.876 1.79

In addition to the estimated coefficients in the first numerical column, we are also presented with additional information, with each row relating to an individual coefficient in isolation.

The second numerical column, Std. Error, presents model based estimates of the standard error associated with each coefficient.

These estimates are computed from \(\hat{\sigma}\) (not appearing here) and from quantities associated with the \(x\) values. The specific formulas used to compute these standard errors are not important here.

The GLM function stderror can retrieve these estimates:

round.(stderror(model), digits = 3)
2-element Vector{Float64}:
 29.439
  0.418

To understand the interpretation of these standard errors, consider the estimate \(\hat{\beta}_i\) and the standard error \(S_{\hat{\beta}_i}\) combined as a ratio,

\[ T_i = \frac{\hat{\beta}_i}{S_{\hat{\beta}_i}}. \]

The key is now to consider a case where the actual unknown coefficient is zero, namely, \(\beta_i = 0\). If this is the case, \(T_i\) follows a T-distribution with \(n-p\) degrees of freedom, where \(p\) is the number of estimated coefficients (e.g. \(p=2\) in simple linear regression).

For our specific data, here are those T-values, also appearing via the third numerical column, t, in the table:

round.(coef(model) ./ stderror(model), digits = 2)
2-element Vector{Float64}:
 3.28
 1.09

Each such value can be viewed as a statistic of a specific hypothesis test about a parameter within the model. For example for the second row, weight, there is the hypothesis test:

\[ {\cal T}_{\beta_1} = \begin{cases} H_0: & \beta_1 = 0 \qquad (\text{No effect of weight on AUC}),\\ H_1: & \beta_1 \neq 0 \qquad (\text{An effect of weight on AUC}). \end{cases} \]

In this hypothesis test, rejecting \(H_0\) implies determining that there is an effect of weight on AUC.

To see a manual calculation of the p-value, we can use the cdf function from the Distributions package together with TDist which stands for a T-distribution.

Since this is a two-sided hypothesis test, based on T_statistic we obtain the p-value via 2*(1-cdf(TDist(n-p), abs(T_statistic))). You can see this value agrees with the table above:

n = 5
p = 2
T_statistic = coef(model)[2] / stderror(model)[2] # [2] is for weight
p_value = 2 * (1 - cdf(TDist(n - p), abs(T_statistic)))
round(p_value, digits = 3)
0.356

Such a large p-value implies that we do not reject \(H_0\), and hence we do not have evidence that weight affects AUC. If the p-value had been low, as it is for the (Intercept) row, then we would reject \(H_0\) and conclude that the coefficient is meaningful within the model.

In a similar way, we can obtain the final columns of the table, which in the weight case present a \(95\%\) confidence interval for the estimated quantity \(\beta_1\). To reproduce these values, note the use of the \(97.5\%\) quantile of the T-distribution:

β̂₁ = coef(model)[2]
ci_radius = stderror(model)[2] * quantile(TDist(n - p), 0.975)
ci = [β̂₁ - ci_radius, β̂₁ + ci_radius]
round.(ci, digits = 3) |> println
[-0.876, 1.788]

You can see that this manually calculated CI is the same as that appearing in the coefficient table.

Note that we can also use the confint function, supported by GLM directly. Observe these values agree with the coefficient table:

confint(model)
2×2 Matrix{Float64}:
  2.85315   190.227
 -0.875607    1.78761

Importantly, if we want a different confidence level we can use the level= keyword argument in confint. For example if we were to use the output of a linear model in the context of a TOST bioequivalence test, the most common way is to use a \(100\times(1-2\alpha)\) confidence interval, i.e. \(90\%\). In this case:

confint(model, level = 0.9)
2×2 Matrix{Float64}:
 27.2602  165.82
 -0.5287    1.4407

Again we stress that the above example is not a typical bioequivalence example. Yet the ideas and principles follow.

The estimated covariance matrix of the coefficient estimates

Just like we computed the theoretical covariance matrix of the coefficient estimates above, we also have an estimated covariance matrix based on the data. This is sometimes also called a variance-covariance matrix of the coefficient estimates.

With the GLM package we can obtain this matrix via the vcov function:

vcov(model)
2×2 Matrix{Float64}:
 866.633   -12.2554
 -12.2554    0.175077

As a verification, if we consider the diagonal entries of the matrix, via diag and take their square roots, we obtain the standard errors again:

round.(sqrt.(diag(vcov(model))), digits = 3)
2-element Vector{Float64}:
 29.439
  0.418

Contrasts

While inference on individual parameters like \(\beta_1\) is useful, we are often interested in linear combinations of several parameters. Such a linear combination is called a contrast.

Consider the parameter \(\beta = (\beta_0, \beta_1, \dots, \beta_p)\). Now take a contrast coefficient vector \(c = (c_0, c_1, \dots, c_p)\). The contrast itself is the scalar value: \[ L = \sum_{i=0}^p c_i \beta_i. \]

This is simply a linear combination of the various coefficients. Note that \(c_i\) values may be positive, negative, or zero.

The estimate of the contrast is naturally \[ \hat{L} = \sum_{i=0}^p c_i \hat{\beta}_i. \]

It turns out, that just like we can do inference for individual coefficients using a T-distribution, we can also do inference for contrasts.

The standard error for a contrast with coefficient vector \(c\) is given by, \[ \text{SE} = \sqrt{c^\top \hat{\Sigma} \, c}, \] where \(\hat{\Sigma}\) is the estimated variance covariance matrix (as you would get from vcov), and \(c^\top\) is a row vector of the coefficients while \(c\) is a column vector. Hence \(c^\top \hat{\Sigma} \, c\) is a matrix multiplication (row times matrix times column) which results in a single number (scalar).

With this, we can form a T-statistic to test the null hypothesis \(H_0: L=0\) vs. the alternative hypothesis \(H_1: L \neq 0\). That T-statistic is: \[ T = \frac{\hat{L}}{\text{SE}} \] Just as it did for a single coefficient, this statistic follows a T-distribution with \(n-p\) degrees of freedom, where \(n\) is the total number of observations and \(p\) is the number of parameters in the model.
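These formulas can be wrapped in a small helper. The following contrast_inference function is a hypothetical convenience sketch (not part of GLM or Bioequivalence) that returns the estimate, standard error, and confidence interval for a contrast coefficient vector c:

# Hypothetical helper: inference for the contrast c'β of a fitted linear model.
function contrast_inference(model, c; level = 0.90)
    L̂ = c' * coef(model) # estimate of the contrast
    SE = sqrt(c' * vcov(model) * c) # standard error of the contrast
    t = quantile(TDist(dof_residual(model)), 1 - (1 - level) / 2)
    return (estimate = L̂, se = SE, ci = (L̂ - t * SE, L̂ + t * SE))
end

In the example below we carry out the same computations manually, step by step.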

As a simple example, let us revisit the example from above where there are three formulations R, S, and T, with some modified values in comparison to above:

df = DataFrame(
    formulation = ["R", "S", "T", "R", "S", "T"],
    AUC = log.([90.0, 127.9, 125.7, 90.2, 122.1, 119.8]), # observe the log transformation
)
simple_table(df)
formulation AUC
R 4.5
S 4.85
T 4.83
R 4.5
S 4.8
T 4.79

As above let us fit a model for it, where we know that formulation is a categorical predictor with \(3\) separate levels (hence we expect \(2\) coefficients for it):

model = lm(@formula(AUC ~ 1 + formulation), df)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 4.5 0.0193 233 1.74e-7 4.44 4.56
formulation: S 0.327 0.0273 12 0.00125 0.24 0.414
formulation: T 0.309 0.0273 11.3 0.00148 0.222 0.396

Now, hypothetically, say we are only interested in carrying out a bioequivalence comparison between S and T. As you recall from above, the effect of S is \(\beta_0 + \beta_1\) while that of T is \(\beta_0 + \beta_2\). Thus our quantity of interest is,

\[ L = (\beta_0 + \beta_2) - (\beta_0 + \beta_1) = \beta_2 - \beta_1. \] Positioned as a contrast we can set \(c = (0, -1, 1)\) and thus,

\[ L = \sum_{i=0}^2 c_i \beta_i = \beta_2 - \beta_1. \]

Now recall the concept of a TOST test from Unit 3. The general idea was to consider a \(100 \times(1-2\alpha)\) percent confidence interval and consider if it lies within \(\log(0.8)\) and \(\log(1.25)\).

For this let us estimate the contrast:

c = [0, -1, 1];
L̂ = c' * coef(model)
-0.01818364421195323

Transforming back from the log scale to the natural scale, the estimate for the GMR is:

exp(L̂)
0.9819806807302752

If we want to obtain a \(90\%\) confidence interval for it, we can use the formula for the standard error of the contrast:

Σ̂ = vcov(model)
SE = sqrt(c' * Σ̂ * c)
0.027294111055944363

Now we can construct the confidence interval:

n = 6
p = 3
ci_radius = SE * quantile(TDist(n - p), 0.95)
log_scale_ci = [L̂ - ci_radius, L̂ + ci_radius]
log_scale_ci |> println
[-0.08241660715643286, 0.04604931873252639]

And presenting this confidence interval back on the natural scale we get:

round.(100exp.(log_scale_ci), digits = 2) |> println
[92.09, 104.71]

Thus if this was part of a BE trial we would conclude the treatments are BE since the CI is a subset of \([80, 125]\).
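This containment check can also be expressed directly in code. Here is a small sketch using the log-scale interval computed above:

# BE (at the standard limits) requires the whole 90% CI to lie inside [log(0.8), log(1.25)].
is_be = log(0.8) <= log_scale_ci[1] && log_scale_ci[2] <= log(1.25)
println("CI within the bioequivalence limits: ", is_be)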

In practice, we do not typically formulate such contrasts ourselves, but rather let the Bioequivalence package do the work for us.

Further, often the contrast in question is quite simple, as it is based on a single coefficient. Details follow. Still, understanding the concept of contrasts is important.

4 Linear model for the \(2 \times 2\) crossover design

The above concepts are now used for our linear model of a \(2 \times 2\) crossover design. Let us explore this on an example dataset and also consider the subtle point of whether or not to drop subjects that do not complete both periods.

Two subtle variants of an example dataset

To explore this let us consider an example dataset appearing in Park, Kim, and Bae (2020):

data_2020_PKB = DataFrame(
    id = [1, 1, 2, 2, 3, 4, 4, 5, 5, 6],
    sequence = ["TR", "TR", "TR", "TR", "TR", "RT", "RT", "RT", "RT", "RT"],
    period = [1, 2, 1, 2, 1, 1, 2, 1, 2, 1],
    cmax = [269.3, 410.4, 120.2, 137.3, 105.2, 90.9, 68.9, 228.3, 301.5, 105.3],
)
simple_table(data_2020_PKB, round_mode = nothing)
id sequence period cmax
1 TR 1 269.3
1 TR 2 410.4
2 TR 1 120.2
2 TR 2 137.3
3 TR 1 105.2
4 RT 1 90.9
4 RT 2 68.9
5 RT 1 228.3
5 RT 2 301.5
6 RT 1 105.3

While this dataset is unrealistically small, it serves as a good example for understanding the analysis.

Observe that there are only \(6\) subjects, half of which were allocated to the TR sequence and the other half allocated to the RT sequence. Observe also that subjects \(3\) and \(6\) dropped out after the first period, so for subject \(3\) we only have the endpoint result for the T formulation and for subject \(6\) we only have the endpoint result for the R formulation.

Let us also consider a variant of this dataset which removes subjects \(3\) and \(6\) since they did not complete both periods. Such filtering of the dataset can be done more generally as follows:

ids_with_both_periods =
    (@rsubset @combine(@groupby(data_2020_PKB, :id), :count = length(:id)) :count == 2).id
data_2020_PKB_complete_case = @rsubset data_2020_PKB :id in ids_with_both_periods
simple_table(data_2020_PKB_complete_case, round_mode = nothing)
id sequence period cmax
1 TR 1 269.3
1 TR 2 410.4
2 TR 1 120.2
2 TR 2 137.3
4 RT 1 90.9
4 RT 2 68.9
5 RT 1 228.3
5 RT 2 301.5

Now let’s apply Pumas on both variants of the dataset. First for data_2020_PKB:

be_result = pumas_be(data_2020_PKB, endpoint = :cmax)
Observation Counts
Sequence ╲ Period 1 2
RT 3 2
TR 3 2
Paradigm: Non replicated crossover design
Model: Linear model
Criteria: Standard ABE
Endpoint: cmax
Formulations: Reference(R), Test(T)
Results(cmax) Assessment Criteria
R Geometric Marginal Mean 160.5
Geometric Naive Mean 165.2
T Geometric Marginal Mean 139.8
Geometric Naive Mean 147.9
Geometric Mean T/R Ratio (%) 87.08
Degrees of Freedom 2
90% Confidence Interval (%) [55.16, 137.5] Fail CI ⊆ [80, 125]
Variability CV (%) | σ̂ 22.39 | 0.2212
ANOVA Formulation (p-value) 0.4698
Sequence (p-value) 0.2088
Period (p-value) 0.4684

And now for data_2020_PKB_complete_case:

be_result_complete_case = pumas_be(data_2020_PKB_complete_case, endpoint = :cmax)
Observation Counts
Sequence ╲ Period 1 2
RT 2 2
TR 2 2
Paradigm: Non replicated crossover design
Model: Linear model
Criteria: Standard ABE
Endpoint: cmax
Formulations: Reference(R), Test(T)
Results(cmax) Assessment Criteria
R Geometric Marginal Mean 184.9
Geometric Naive Mean 184.9
T Geometric Marginal Mean 161
Geometric Naive Mean 161
Geometric Mean T/R Ratio (%) 87.08
Degrees of Freedom 2
90% Confidence Interval (%) [55.16, 137.5] Fail CI ⊆ [80, 125]
Variability CV (%) | σ̂ 22.39 | 0.2212
ANOVA Formulation (p-value) 0.4698
Sequence (p-value) 0.1476
Period (p-value) 0.4684

Observe that the majority of the output values of pumas_be are exactly the same for both of the above cases.

The one notable difference is in the marginal and naive means for the geometric means of R and T. We discuss these differences when we take a deep dive to understand marginal means in a future unit.

The reason that both outputs were very similar is that when using a Linear model (as we do here), we only use subjects that have evaluable data for both the test and comparator products. That is, the linear model inherently implements a complete case analysis.

This practice is in agreement with Section 2.2.3.2 of International council for harmonisation of technical requirements for pharmaceuticals for human use (2024) which states:

The primary statistical analyses should include all data for all subjects who provide evaluable data for both the test and comparator products.

It is also in agreement with the EMA (2010) document, where Section 4.1.8 states:

… subjects in a crossover trial who do not provide evaluable data for both of the test and reference products (or who fail to provide evaluable data for the single period in a parallel group trial) should not be included.

The same is echoed in EMA (2015) in Section 8.1:

The question of whether to use fixed or random effects is not important for the standard two period, two sequence (2×2) crossover trial. In section 4.1.8 of the guideline it is stated that “subjects in a crossover trial who do not provide evaluable data for both of the test and reference products should not be included. Provided this is followed the confidence intervals for the formulation effect will be the same regardless of whether fixed or random effects are used.”

This aspect of using a fixed effects model (a linear model, as we do here) or a mixed effects model (a mixed model, as used in the sequel) is also discussed further below. Note that since some guidances rely on SAS code, there is sometimes a discussion about using SAS’s PROC GLM or alternatively SAS’s PROC MIXED. In fact, the FDA guidance FDA (2022) states in Section II.B.2 (Missing Data and Intercurrent Events):

Statistical methods for handling missing data include complete case analysis, available case analysis, weighting methods, imputation, and model-based approaches. For example, in a two-way crossover study, a complete case analysis could be a general linear model as implemented in SAS PROC GLM, which removes all subjects with any missing observations for any variables included in the GLM model (i.e., removes subjects missing one or both periods). An available case analysis could be done using SAS PROC MIXED, which uses all observed data (e.g., in a two-way crossover study, uses all subjects with one or two complete periods of data).

Exploring the linear model used

Since pumas_be uses a Linear model for the analysis, let us understand the components of this model. For now let us stick with the data data_2020_PKB, and return to the smaller data_2020_PKB_complete_case dataset later.

Note that part of the process that pumas_be uses to analyze the data involves internally activating the preprocess_be function.

This function makes slight modifications to the data, including applying a log transformation, removing missing values, and other steps. To see the modified data used in the analysis, you can use the data field of the result of pumas_be.

In our presentation here, let us consider only a subset of the fields in a certain order, [:endpoint, :sequence, :formulation, :period, :id_within_sequence], and then inspect that data:

data_2020_PKB_processed = select(
    be_result.data,
    [:endpoint, :sequence, :formulation, :period, :id_within_sequence],
)
simple_table(data_2020_PKB_processed)
endpoint sequence formulation period id_within_sequence
5.6 TR 'T' 1 1
6.02 TR 'R' 2 1
4.79 TR 'T' 1 2
4.92 TR 'R' 2 2
4.66 TR 'T' 1 3
4.51 RT 'R' 1 1
4.23 RT 'T' 2 1
5.43 RT 'R' 1 2
5.71 RT 'T' 2 2
4.66 RT 'R' 1 3

The linear model we use represents the endpoint as a function of the factors:

  • sequence - The categorical variable of the sequence used with levels RT and TR.
  • formulation - The categorical variable of the formulation used with levels R and T.
  • period - The categorical variable of the period with levels 1 and 2.
  • id_within_sequence (nested within sequence) - This is a categorical variable associated with the subject. In our analysis we nest this categorical variable within each sequence.

We can now inspect attributes of the linear model by considering the model field of the result of pumas_be.

Namely, be_result.model contains details of the statistical model (this holds both for the case of linear models and other models used by pumas_be).

In particular we can use the formula function from GLM to see the statistical formula used for the model (println helps format cleaner output here):

be_result.model |> formula |> println
endpoint ~ 1 + sequence + formulation + period + sequence & id_within_sequence

The formula above shows us that the linear model creates an estimate of the endpoint based on the categorical variables we described above. The term sequence & id_within_sequence implies nesting of id_within_sequence in sequence. We describe the rationale for this term and the others below.

We can also use the coeftable function from GLM to see a summary of the estimated parameters (coefficients) as well as their associated p-values and other information (simple_table helps format cleaner output here):

simple_table(be_result.model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 4.87 0.128 38.1 0.000688 4.32 5.42
sequence: TR 0.286 0.156 1.83 0.209 -0.387 0.959
formulation: T -0.138 0.156 -0.884 0.47 -0.811 0.535
period: 2 0.139 0.156 0.888 0.468 -0.534 0.812
sequence: RT & id_within_sequence: 2 0.704 0.143 4.93 0.0388 0.0894 1.32
sequence: TR & id_within_sequence: 2 -0.297 0.143 -2.08 0.173 -0.911 0.318
sequence: RT & id_within_sequence: 3 -0.209 0.181 -1.16 0.367 -0.986 0.568
sequence: TR & id_within_sequence: 3 -0.358 0.181 -1.98 0.186 -1.13 0.419

Since sequence, formulation, and period each have two levels, they each have a single coefficient estimated. The rationale for this is based on the same approach for categorical variables as presented above.

As for id_within_sequence nested within sequence, the nesting is just as presented above.

The rationale for the linear model

The key to this linear model is the following:

  • sequence effects - in case there happen to be any - although we hope there aren’t any.
  • formulation effects - this is the most important point. Capturing any effect that the formulation has on the endpoint. A significant effect beyond \(20\%\) (or any other threshold) would imply not being bioequivalent.
  • period effects - in case there happen to be any - although we hope there aren’t any.
  • id_within_sequence (nested within sequence) effects. These effects are expected, as subjects are all different. The purpose of these terms is to be able to separate out the subject effects.

Interpreting the results

Once a linear model is fit, for bioequivalence purposes, we can consider interpretation of the results as composed of two parts:

  • Part I: Ensure the model to experiment fit is sensible. Do this by making sure that the experiment does not seem to be influenced strongly by the sequence, period, or other factors.
  • Part II: Consider the confidence interval for the GMR to see if it is contained within the bounds (typically \(80\%\) to \(125\%\)).

To explore this, let’s reproduce the output for data_2020_PKB_complete_case for Cmax here:

be_result_complete_case = pumas_be(data_2020_PKB_complete_case, endpoint = :cmax)
Observation Counts
Sequence ╲ Period 1 2
RT 2 2
TR 2 2
Paradigm: Non replicated crossover design
Model: Linear model
Criteria: Standard ABE
Endpoint: cmax
Formulations: Reference(R), Test(T)
Results(cmax) Assessment Criteria
R Geometric Marginal Mean 184.9
Geometric Naive Mean 184.9
T Geometric Marginal Mean 161
Geometric Naive Mean 161
Geometric Mean T/R Ratio (%) 87.08
Degrees of Freedom 2
90% Confidence Interval (%) [55.16, 137.5] Fail CI ⊆ [80, 125]
Variability CV (%) | σ̂ 22.39 | 0.2212
ANOVA Formulation (p-value) 0.4698
Sequence (p-value) 0.1476
Period (p-value) 0.4684
  • Part I of the interpretation is to see that sequence and period are not significant. Indeed, the p-values are not small. More on such Type III p-values in a future unit. With this, we gain some confidence that the model is sound for further analysis. However, if we had strong evidence of an effect for one or both of these factors (e.g. the period p-value being less than \(5\%\)), then we would further investigate whether there are reasons to believe that strong period effects are present. As you can see, in this case the p-values are 0.1476 and 0.4684 respectively.

  • Part II of the interpretation is the key part: we consider the confidence interval for the GMR, namely \([55.16, 137.5]\) in our case here. By comparing it to the \([80, 125]\) bounds we are carrying out two one-sided T-tests (TOST). This test is on the contrast for the difference between the test and reference, and since our confidence interval is a \(90\%\) confidence interval, the comparison is equivalent to a TOST with \(\alpha = 0.05\). As you can see, in this case we Fail as we are not able to reject \(H_0\) of the TOST, and hence we may not conclude bioequivalence. (With such an unrealistically small data example this is not unexpected.)

Let us reproduce the [55.16, 137.5] CI result centered around the GMR estimate 87.08, directly from the linear model that is used under the hood. Let’s look at the output of the linear model:

simple_table(be_result_complete_case.model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 4.97 0.156 31.8 0.000989 4.3 5.64
sequence: TR 0.361 0.156 2.31 0.148 -0.312 1.03
formulation: T -0.138 0.156 -0.884 0.47 -0.811 0.535
period: 2 0.139 0.156 0.888 0.468 -0.534 0.812
sequence: RT & id_within_sequence: 2 0.599 0.111 5.42 0.0324 0.123 1.08
sequence: TR & id_within_sequence: 2 -0.475 0.111 -4.3 0.0501 -0.951 0.000429

As you can see, the formulation effect is in the third row, with a coefficient estimate of -0.138, or more precisely:

β_test = coef(be_result_complete_case.model)[3]
-0.13832798631103677

Now, estimating the difference in means between T and R is a simple (degenerate) form of a contrast. This contrast is just the coefficient for the test formulation.

Hence the estimate of the contrast is just this -0.138 value. To consider it as a GMR, we exponentiate and (after rounding) get the GMR:

round(100exp(β_test), digits = 2)
87.08

Observe that this value exactly agrees with the pumas_be output.

Now to get a confidence interval for this coefficient, we can use the confint function with level = 0.9. This differs from the \(95\%\) confidence intervals that appear by default with the coeftable function. Note that we can use level = 0.9 with coeftable as well, but we are only interested in a \(90\%\) CI for a single coefficient, the third one:

log_scale_ci = confint(be_result_complete_case.model, level = 0.9)[3, :]
log_scale_ci |> println
[-0.5950139887763114, 0.31835801615423787]

This is a confidence interval for the contrast \(\mu_T - \mu_R\). To see it also explicitly as a contrast, you can see we get the same value (only with very minor differences due to a different numerical computation):

c = [0, 0, 1, 0, 0, 0]
L̂ = c' * coef(be_result_complete_case.model)
SE = sqrt(c' * vcov(be_result_complete_case.model) * c)
n = 8 # Observations in the model
p = 6 # Parameters in the linear model
ci_radius = SE * quantile(TDist(n - p), 0.95)
log_scale_ci = [L̂ - ci_radius, L̂ + ci_radius]
log_scale_ci |> println
[-0.5950139887763111, 0.31835801615423753]

Finally we can transform this confidence interval on the log scale to the natural scale with exp:

round.(100exp.(log_scale_ci), digits = 2) |> println
[55.16, 137.49]

As you can see, this value exactly agrees with the CI output of pumas_be.
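As a final check, echoing the earlier observation that the linear model performs a complete case analysis, we can extract the same \(90\%\) interval from the model fitted to the full data_2020_PKB dataset, where the formulation coefficient is also the third one. Both computations should give the same values:

# The formulation: T coefficient is the third row in both fitted models (see the coefficient tables above).
ci_full = confint(be_result.model, level = 0.9)[3, :]
ci_complete = confint(be_result_complete_case.model, level = 0.9)[3, :]
println("Full data:     ", round.(100exp.(ci_full), digits = 2))
println("Complete case: ", round.(100exp.(ci_complete), digits = 2))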

5 On fixed and random effects

An alternative to the linear model is a mixed effects model (also known as a random effects model) where we treat each subject’s influence as a random component. We dive into random effects models in greater detail when we study replicate designs and reference scaling in the sequel. At this point let us briefly illustrate this model.

Note that once we start to speak about random effects models, we can also call the linear models above fixed effects models.

The basic random effects model that we consider has this formula:

fml = @formula(endpoint ~ 1 + sequence + formulation + period + (1 | id))
println(fml)
endpoint ~ 1 + sequence + formulation + period + :(1 | id)

Here the expression (1 | id) indicates that id is a random effect. This implies that the model does not only involve residual randomness, as the linear model does, but also involves a random component per subject. We deal with this model and its variants in greater detail in future units. Very loosely, the model is,

\[ y = \beta_0 + \sum_{i=1}^p \beta_i x^{(i)} + \zeta + \varepsilon. \]

Here most of the model elements are like the fixed effects models, but there is also a random effect component \(\zeta\). This means that every observation of \(y\) doesn’t only depend on the fixed effects, \(\beta_0 + \sum_{i=1}^p \beta_i x^{(i)}\) and the residual noise \(\varepsilon\), but also depends on a random effects component \(\zeta\).

In terms of the @formula, the 1 + sequence + formulation + period component matches the \(\beta_0 + \sum_{i=1}^p \beta_i x^{(i)}\) part while the (1 | id) component matches the \(\zeta\) part.

Comparing the random effects and fixed effects models

To analyze the model let us use the data field from the previous result. Namely, be_result_complete_case.data has the endpoint data on the log scale, as well as other needed fields. In particular, our dataset for the mixed model is:

data_2020_PKB_complete_case_processed =
    select(be_result_complete_case.data, [:endpoint, :sequence, :formulation, :period, :id])
simple_table(data_2020_PKB_complete_case_processed)
endpoint sequence formulation period id
5.6 TR 'T' 1 1
6.02 TR 'R' 2 1
4.79 TR 'T' 1 2
4.92 TR 'R' 2 2
4.51 RT 'R' 1 4
4.23 RT 'T' 2 4
5.43 RT 'R' 1 5
5.71 RT 'T' 2 5

Now we can fit this model with the lmm function from the MixedModels package. We get into further details, such as the REML = true option, in future units. The result of the fit, model, is amenable to a coeftable similar to the fixed effects linear models used above:

model = lmm(fml, data_2020_PKB_complete_case_processed; REML = true)
simple_table(model |> coeftable, halign = :left)
Name Coef. Std. Error z Pr(>|z|)
(Intercept) 4.97 0.552 9 2.2e-19
sequence: TR 0.361 0.765 0.471 0.637
formulation: T -0.138 0.156 -0.884 0.376
period: 2 0.139 0.156 0.888 0.375

As this is a mixed model, T-distributions are no longer used, so you can see the p-values are in terms of a normal distribution (Z). The fixed effects are fitted for the (Intercept), the sequence effect, the formulation effect, and the period effect. However, the \(\zeta\) component, (1 | id), is a random effect, so the subject does not appear in the coefficient table.

To compare, look at the output of the fixed effects model we had before:

simple_table(be_result_complete_case.model |> coeftable, halign = :left)
Name Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
(Intercept) 4.97 0.156 31.8 0.000989 4.3 5.64
sequence: TR 0.361 0.156 2.31 0.148 -0.312 1.03
formulation: T -0.138 0.156 -0.884 0.47 -0.811 0.535
period: 2 0.139 0.156 0.888 0.468 -0.534 0.812
sequence: RT & id_within_sequence: 2 0.599 0.111 5.42 0.0324 0.123 1.08
sequence: TR & id_within_sequence: 2 -0.475 0.111 -4.3 0.0501 -0.951 0.000429

You can see that the first four values, 4.97, 0.361, -0.138, 0.139, appear in both the random effects model and the fixed effects model. So in this sense the models are the same. As for the subject effects, the fixed effects model captures these effects via the 0.599 and -0.475 coefficient estimates. In contrast, the random effects model does not estimate such coefficients.

Instead, the random effects model estimates the variance (or standard deviation), not just of \(\varepsilon\), but also of \(\zeta\). For this we can apply the VarCorr function on the model to get the following:

VarCorr(model)
Column Variance Std.Dev
id (Intercept) 0.560657 0.748770
Residual 0.048922 0.221183

You can see that the estimated standard deviation of the residual, 0.221183, agrees with the estimate for the Variability appearing in the pumas_be output above. Here is this value computed directly:

round.(
    sqrt(
        deviance(be_result_complete_case.model) /
        dof_residual(be_result_complete_case.model),
    ),
    digits = 6,
)
0.221183

As for the 0.748770 estimated standard deviation of the subject (id), this is the variability of \(\zeta\), which is not estimated in the fixed effects model.

Note: At the moment pumas_be does not use a random effects model for \(2 \times 2\) cross over designs. So if you must use such a model, you can use the MixedModels package as we did here. However, as you can see, the results for both models are identical when we use the complete case data (as recommended by the EMA and ICH guidelines).

6 Comparing with SAS Code

Since some regulatory documents refer to SAS code, let us also consider SAS code for the \(2 \times 2\) design.

Basic SAS DATA step

This is one way to set up the data in SAS. We create datasets named data_2020_PKB and data_2020_PKB_complete_case, just like in Pumas. You may consult the SAS documentation for more information on exactly how this code works.

DATA data_2020_PKB;
  INPUT SUBJ SEQ $ PRD CMAX;
  IF (SEQ='TR' AND PRD=1) OR (SEQ='RT' AND PRD=2) THEN TRT='T';
  ELSE TRT='R';
  LNCMAX=LOG(CMAX);
DATALINES;
1 TR 1 269.3
1 TR 2 410.4
2 TR 1 120.2
2 TR 2 137.3
3 TR 1 105.2
4 RT 1 90.9
4 RT 2 68.9
5 RT 1 228.3
5 RT 2 301.5
6 RT 1 105.3
;
RUN;

DATA data_2020_PKB_complete_case;
  SET data_2020_PKB;
  WHERE SUBJ NOT IN (3, 6);
RUN;

After you run the code above in SAS, you have the datasets data_2020_PKB and data_2020_PKB_complete_case in your workspace and can then carry out the analysis. However, since SAS is not specifically designed for bioequivalence analysis, its output is presented in a less dedicated format than that of Pumas. Nevertheless, let us compare these outputs.

We use both PROC GLM for the linear model and PROC MIXED for the mixed effects model.

PROC GLM on data_2020_PKB

The linear model is fit via SAS using SAS’s PROC GLM. In our case, here is the code for PROC GLM applied to data_2020_PKB:

PROC GLM DATA=data_2020_PKB;
  CLASS SUBJ SEQ PRD TRT;
  MODEL LNCMAX = SEQ SUBJ(SEQ) PRD TRT;
  RANDOM SUBJ(SEQ) / TEST;
  LSMEANS TRT / DIFF=CONTROL("R") CL ALPHA=0.1;
RUN;

Key parts of the output are as follows. We mark key values that have equivalent pumas_be outputs with light green:

These results are equivalent to the Pumas results which we repeat below for ease of comparison. Here are the key observations:

  • The degrees of freedom next to Error and under DF are 2.
  • The estimate of the variability, appearing as Root MSE, is 0.221183. Observe that this is equivalent to a CV of 22.39 percent, as you can verify with the sigma2geocv function (see the check after this list).
  • The (TYPE-III) p-values for SEQ (sequence), PRD (period), and TRT (formulation) are 0.2088, 0.4684 and 0.4698 respectively, in agreement with the pumas_be output. More on this in a future unit.
  • The marginal means for R and T, under LSMEAN appear on the log-scale. The values, 5.07825945 and 4.93993146, exponentiated, agree with the marginal means of pumas_be:
round.(exp.([5.07825945, 4.93993146]), digits = 1) |> println
[160.5, 139.8]
  • The point estimate for the geometric mean ratio (GMR) on the log scale is -0.138328. When converted back to the natural scale and represented as a percentage, it again agrees with the pumas_be output:
round(100exp(-0.138328), digits = 2)
87.08
  • Finally, and most importantly, the \(90\%\) confidence interval on the log-scale is [-0.595006, 0.318351]. Again, when converted to the natural scale and represented as a percentage, it agrees with pumas_be output (up to rounding):
round.(100exp.([-0.595006, 0.318351]), digits = 2) |> println
[55.16, 137.49]
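
As referenced in the Root MSE bullet above, here is the sigma2geocv check applied to the SAS value 0.221183:

# convert the residual standard deviation on the log scale to a geometric CV (%)
round(100sigma2geocv(0.221183), digits = 2)
22.39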

Here is the pumas_be output for ease of comparison:

be_result = pumas_be(data_2020_PKB, endpoint = :cmax)
Observation Counts
Sequence ╲ Period 1 2
RT 3 2
TR 3 2
Paradigm: Non replicated crossover design
Model: Linear model
Criteria: Standard ABE
Endpoint: cmax
Formulations: Reference(R), Test(T)
Results(cmax) Assessment Criteria
R Geometric Marginal Mean 160.5
Geometric Naive Mean 165.2
T Geometric Marginal Mean 139.8
Geometric Naive Mean 147.9
Geometric Mean T/R Ratio (%) 87.08
Degrees of Freedom 2
90% Confidence Interval (%) [55.16, 137.5] Fail CI ⊆ [80, 125]
Variability CV (%) | σ̂ 22.39 | 0.2212
ANOVA Formulation (p-value) 0.4698
Sequence (p-value) 0.2088
Period (p-value) 0.4684

PROC GLM on data_2020_PKB_complete_case

Now let us also consider the output when we apply PROC GLM to data_2020_PKB_complete_case instead, that is, with subjects 3 and 6 removed. To do this, change PROC GLM DATA=data_2020_PKB to PROC GLM DATA=data_2020_PKB_complete_case in the SAS code above.

Key parts of the output are as follows. We now mark key values that changed in comparison to the previous SAS output with light yellow:

Here are some observations, where we compare both to the SAS output above and the respective pumas_be output which we present below.

  • Now as there are two fewer observations in the model, the Model degrees of freedom dropped from 7 to 5.
  • The SEQ (sequence) Type III, p-value changed and is now at 0.1476. This is in agreement with the pumas_be output for this case.
  • The marginal means are also modified and (on the log scale) are now at 5.21993029 for R and 5.08160230 for T. These agree with the pumas_be marginal means:
round.(exp.([5.21993029, 5.08160230]), digits = 1) |> println
[184.9, 161.0]

Here is the pumas_be output for ease of comparison:

be_result = pumas_be(data_2020_PKB_complete_case, endpoint = :cmax)
Observation Counts
Sequence ╲ Period 1 2
RT 2 2
TR 2 2
Paradigm: Non replicated crossover design
Model: Linear model
Criteria: Standard ABE
Endpoint: cmax
Formulations: Reference(R), Test(T)
Results(cmax) Assessment Criteria
R Geometric Marginal Mean 184.9
Geometric Naive Mean 184.9
T Geometric Marginal Mean 161
Geometric Naive Mean 161
Geometric Mean T/R Ratio (%) 87.08
Degrees of Freedom 2
90% Confidence Interval (%) [55.16, 137.5] Fail CI ⊆ [80, 125]
Variability CV (%) | σ̂ 22.39 | 0.2212
ANOVA Formulation (p-value) 0.4698
Sequence (p-value) 0.1476
Period (p-value) 0.4684

PROC MIXED

Let us also consider SAS’s PROC MIXED. First let us use the data_2020_PKB_complete_case dataset, in which case the linear model (PROC GLM, and pumas_be) and the mixed effects model (PROC MIXED, and the Julia MixedModels fit presented earlier) give identical results.

The code is:

PROC MIXED DATA=data_2020_PKB_complete_case;
  CLASS SUBJ SEQ PRD TRT;
  MODEL LNCMAX = SEQ PRD TRT;
  RANDOM SUBJ(SEQ);
  ESTIMATE 'T VS R' TRT -1 1 / CL ALPHA=0.1;
RUN;

Key parts of the output are as follows. We mark key values that have equivalent pumas_be outputs with light green. We also mark fields that require some discussion with light blue:

Here are some observations:

  • The T VS R Estimate of -0.1383 is equivalent to the above. This is the GMR on the log scale.
  • Similarly, the Lower and Upper bounds of -0.5950 and 0.3184 are equivalent to the above (see the conversion after this list).
  • The Type-III p-values for the period (PRD) and formulation (TRT) are also identical to the above at 0.4684 and 0.4698 respectively.
  • To get the variability we consider the estimated Residual variance; this appears under Covariance Parameter Estimates and has a value of 0.04892. Observe that the square root of this value is the standard deviation appearing above:
round(100sigma2geocv(sqrt(0.04892)), digits = 2), round(sqrt(0.04892), digits = 4)
(22.39, 0.2212)
  • As for the Type-III p-value of the sequence at 0.6838, it is different for the mixed model and does not agree with the pumas_be output (or the PROC GLM output).
  • Finally, observe the estimated variance for SUBJ(SEQ) at 0.5607. Its square root closely agrees with the estimated standard deviation for id in the MixedModels fit above, which is 0.748770:
round(sqrt(0.5607), digits = 5)
0.7488
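
As referenced in the confidence bound bullet above, converting the PROC MIXED bounds back to the natural scale as percentages reproduces the interval seen earlier:

round.(100exp.([-0.5950, 0.3184]), digits = 2) |> println
[55.16, 137.49]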

Finally consider the data_2020_PKB dataset which includes subjects \(3\) and \(6\) having partial observations (only the first period and not the second). Here we see different numerical results because PROC MIXED also uses the data from subjects with partial observations. We highlight those results in light red:

Here are the key values on the natural scale. As you can see, the values differ somewhat from the linear model, due to the additional partial observations of subjects \(3\) and \(6\).

GMR = round(100exp(-0.1429), digits = 2)
CI = round.(100exp.([-0.5860, 0.3002]), digits = 2)
CV = round(100sigma2geocv(sqrt(0.04701)), digits = 2)
@show GMR
@show CI
@show CV;
GMR = 86.68
CI = [55.65, 135.01]
CV = 21.94

To obtain this output with Pumas we cannot use pumas_be at this point, but rather need to construct the model via MixedModels. We discuss this in a future unit.
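
As a preview of that future unit, here is a minimal sketch of how such a fit could look with MixedModels, assuming the full data_2020_PKB dataset has first been processed into the same long format (columns endpoint, sequence, formulation, period, and id) used earlier; the name data_2020_PKB_processed is hypothetical:

# fit the mixed model on all observations, including the partial ones of subjects 3 and 6
fml = @formula(endpoint ~ 1 + sequence + formulation + period + (1 | id))
model_full = lmm(fml, data_2020_PKB_processed; REML = true)
simple_table(model_full |> coeftable, halign = :left)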

7 Conclusion

This unit shifted our focus from experimental design to the statistical engine driving bioequivalence assessment: the linear model. We established how this fundamental tool is used to analyze data from the standard \(2 \times 2\) crossover design. The core of the unit involved breaking down the model’s components, starting with the basics of linear regression and progressing to the essential technique of representing formulation, period, and sequence as categorical variables.

Building on these fundamentals, we constructed the specific linear model that accounts for the fixed effects of sequence, period, and formulation, as well as the nested effect of subject within sequence. We demonstrated how to interpret the model’s output, showing that the geometric mean ratio (GMR) and its \(90\%\) confidence interval are derived directly from the coefficient for the formulation effect. The unit also introduced mixed-effects models as an alternative, clarifying that for a complete-case \(2 \times 2\) study, the key results are identical to the fixed-effects approach. Finally, we drew direct parallels to industry-standard SAS procedures like PROC GLM and PROC MIXED, solidifying the concepts from a practical and regulatory perspective.

With a solid understanding of how linear models are used to estimate the average treatment effect, we are now equipped to delve deeper into another critical aspect of the analysis: variability. The next unit will focus on the intra-subject coefficient of variation (CV), a key metric derived from the linear model’s error term. We will explore how it is calculated, its significance in study design and power, and its crucial role in the analysis of highly variable drugs (HVDs), which often require more advanced statistical approaches.

8 Unit exercises

  1. Understanding Categorical Variables in GLM

    In the section “Categorical Variables in Linear Models”, we used a simple dataset to model AUC based on formulation. Let’s explore the meaning of the coefficients.

    df = DataFrame(
        formulation = ["R", "S", "T", "R", "S", "T"],
        AUC = [120.0, 125.5, 130.7, 119.2, 125.1, 132.4],
    )
    model = lm(@formula(AUC ~ 1 + formulation), df)
    β̂ = coef(model)
    1. Based on the output, what is the reference formulation? What do the coefficients β̂[1], β̂[2], and β̂[3] represent in terms of the formulations?
    2. Using only the β̂ array, manually calculate the predicted AUC for formulation R, formulation S, and formulation T.
    3. Verify your manual calculations by using the predict function on a new DataFrame containing one of each formulation.
    predict(model, DataFrame(formulation = ["R", "S", "T"]))
  2. Reproducing pumas_be Results Manually

    The core of this unit is understanding the linear model behind pumas_be for a \(2 \times 2\) crossover. Let’s use the data_2020_PKB_complete_case dataset to do this from scratch.

    1. Using the processed data from be_result_complete_case.data, fit a linear model using GLM.jl’s lm function. The formula should be the same as the one used by pumas_be, which you can find in the unit.
    2. Extract the coefficient corresponding to the formulation: T effect. Use this coefficient to calculate the Geometric Mean Ratio (GMR) as a percentage. Does it match the pumas_be output?
    3. Use the confint function on your fitted model with level=0.9 to get the 90% confidence interval for the formulation: T coefficient.
    4. Convert this log-scale confidence interval to the natural scale (as percentages) and confirm that it matches the 90% CI reported by pumas_be.
  3. Calculating Intra-Subject CV

    The pumas_be output reports the Variability (CV%), which is the intra-subject coefficient of variation. This value is derived from the residual error of the linear model.

    1. Using the linear model you fit in Exercise 2, find the residual degrees of freedom using dof_residual(model).
    2. Find the residual sum of squares (also called deviance) using deviance(model).
    3. The Mean Squared Error (MSE) is the deviance divided by the degrees of freedom. The residual standard deviation is sqrt(MSE). Calculate this value.
    4. The Bioequivalence package provides the function sigma2geocv(σ) to convert a standard deviation on the log-scale to a CV. Use this function on the residual standard deviation you calculated in part (c). Does it match the Variability (CV%) from the pumas_be output for the data_2020_PKB_complete_case dataset?
  4. Fixed vs. Mixed Models with Missing Data

    This unit highlighted that for a \(2 \times 2\) crossover, the linear model (PROC GLM) uses a “complete case” analysis, while a mixed model (PROC MIXED) can use all available data. Let’s demonstrate this.

    1. Remind yourself of the pumas_be results for data_2020_PKB (with dropouts) and data_2020_PKB_complete_case. Confirm that the GMR and 90% CI are identical. Why is this?
    2. Now, fit a mixed-effects model on both datasets using the lmm function from MixedModels.jl as shown in the unit (@formula(endpoint ~ 1 + sequence + formulation + period + (1 | id))).
    3. For each of the two mixed models, extract the formulation: T coefficient and its standard error from the coeftable. Manually calculate the 90% confidence interval on the log-scale (Hint: use quantile(Normal(), 0.95) for the Z-score).
    4. Compare the GMR and 90% CI from the two mixed models. Are they still identical? Explain why the mixed model gives different results for the two datasets while the fixed model did not.
  5. Working with Contrasts

    In the “Contrasts” section, we created a model for three formulations (R, S, T) and manually calculated the CI for the T vs S comparison. Let’s use that same model to test a different hypothesis.

    df = DataFrame(
        formulation = ["R", "S", "T", "R", "S", "T"],
        AUC = log.([90.0, 127.9, 125.7, 90.2, 122.1, 119.8]),
    )
    model = lm(@formula(AUC ~ 1 + formulation), df)
    1. The estimated effect for S is β̂₀ + β̂₁ and for R is just β̂₀. The difference is β̂₁. Define a contrast vector c that isolates only the β̂₁ coefficient.
    2. Use your contrast vector c, coef(model), and vcov(model) to calculate the 90% CI for the S vs R comparison on the log scale.
    3. Convert the CI to the natural scale (as percentages). Would you conclude S and R are bioequivalent based on this result?
  6. Investigating the vcov Matrix

    The variance-covariance matrix, obtained with vcov(model), is fundamental to all inference.

    1. Get the vcov matrix for the linear model from Exercise 2 (data_2020_PKB_complete_case).
    2. The diagonal elements of this matrix are the variances of the coefficient estimates. Extract the diagonal using diag(vcov(model)).
    3. Take the square root of these diagonal elements. Compare the resulting values to the Std. Error column from the coeftable(model) output. What do you notice?

References

EMA. 2010. “Guideline on the Investigation of Bioequivalence.” 2010. https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-investigation-bioequivalence-rev1_en.pdf.
———. 2015. “Questions & Answers: Positions on Specific Questions Addressed to the Pharmacokinetics Working Party (PKWP). November 2015.” 2015. https://www.ema.europa.eu/en/documents/scientific-guideline/questions-and-answers-positions-specific-questions-addressed-pharmacokinetics-working-party_en.pdf.
FDA. 2022. “Statistical Approaches to Establishing Bioequivalence. Guidance for Industry.” 2022. https://www.fda.gov/media/163638/download.
ICH. 2024. “Bioequivalence for Immediate Release Solid Oral Dosage Forms - M13A - July 2024.” International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. https://database.ich.org/sites/default/files/ICH_M13A_Step4_Final_Guideline_2024_0723.pdf.
Liquet, Benoit, Sarat Moka, and Yoni Nazarathy. 2024. Mathematical Engineering of Deep Learning. Chapman; Hall/CRC. https://deeplearningmath.org/.
Nazarathy, Yoni, and Hayden Klok. 2021. Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence. Springer Nature. https://statisticswithjulia.org/.
Park, Gowooni, Hyungsub Kim, and Kyun-Seop Bae. 2020. “Bioequivalence Data Analysis.” Translational and Clinical Pharmacology 28 (4): 175. https://pmc.ncbi.nlm.nih.gov/articles/PMC7781810/.
