A Course on Bioequivalence: Unit 12 - Reference Scaling Part I

Authors

Yoni Nazarathy

Andreas Noack

This is Unit 12 of a complete bioequivalence analysis course using Pumas. There are 15 units in the course in total.

1 Unit overview

This is the first of two units dealing with reference scaling, sometimes known as reference scaled average bioequivalence (RSABE). A related approach is average bioequivalence with expanding limits (ABEL). These approaches arose from an approach called population bioequivalence and a related approach called individual bioequivalence.

In all of these cases, a unifying overarching idea is that the acceptance criteria for bioequivalence also depend on the variability of the product(s), and not only on their means. See Chapter 7 of Patterson and Jones (2017) for a general overview of these topics.

While population bioequivalence is historically important, it is not our focus here because it has mostly been abandoned for human trials. Note that it is still recommended for some in vitro bioequivalence studies, which are not the focus of this course. For more information about current recommendations for population bioequivalence, a starting point can be the FDA video Choi (2023). We do not discuss population bioequivalence or individual bioequivalence further.

Within the range of topics of reference scaling, our focus in this unit and the next is on reference scaled average bioequivalence (RSABE). A succinct description of this approach is in the FDA video Schuirmann (2023).

The RSABE approach is used for two typical cases:

  • Highly variable drug products (HVDP or HVD) - These are drugs with a high CV for the within subject variability of the reference product, where a standard average bioequivalence (ABE) approach may often fail to show bioequivalence even in cases where the products are bioequivalent. Namely, for HVDs, ABE may have a producer’s risk that is too high unless very large sample sizes are used. In this case, RSABE may essentially widen the effective limits beyond 80-125.
  • Narrow therapeutic index drugs (NTID) - As discussed in the previous unit, these are products that are very sensitive to small changes in dose or exposure and hence require more stringent criteria. In this case, RSABE may essentially narrow the effective limits to be tighter than 80-125.

The first case, HVD, is covered not only by RSABE with the FDA approach, but also by other expanding limits approaches such as ABEL. In all cases, the use of such approaches must be planned in advance, both in the study design and in the trial approval. Our focus in this unit is on the FDA approach for HVD, namely RSABE for HVD. The main FDA document for this approach is FDA (2011).

The second case, NTID, is also handled by the FDA via reference scaling, and this is the focus of the next unit. As seen in the previous unit, NTID can also be handled with a simple approach of narrowing the limits (e.g. a fixed change from 80-125 to 90-111.11), yet the FDA approach outlined in FDA (2012) is more intricate and relies on RSABE.

We use the following packages.

using Bioequivalence  # available with Pumas products
using BioequivalencePower  # available with Pumas products
using PharmaDatasets # available with Pumas products
using SummaryTables

2 An Operational Description of RSABE for HVD

Let us first consider the basics of carrying out RSABE for HVD based on FDA (2011).

The approach is based on a three period, three sequence partial replicate design or a four period, two sequence fully replicate design. Thus the designs supported are:

  • Partial replicate: RRT|RTR|TRR.
  • Fully replicate: RTRT|TRTR, RTTR|TRRT, or RRTT|TTRR.

In both cases, an important ingredient is the estimation of within subject variability for the reference product. This is done via the auxiliary estimator described in Unit 10. The estimated variability is presented and denoted as CVᵣ.

Reference CV below threshold implies standard ABE criteria

If CVᵣ is less than the Minimal CVᵣ for Reference Scaling, which is set by the FDA at \(30\%\), then even if the study protocol specified RSABE, we simply carry out standard ABE.
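
The output tables in this unit report the threshold both as a CV and on the log scale (30.0 | 0.294), and likewise report CVᵣ next to σ̂ᵣ. The following is a minimal sketch of how the two scales relate, assuming the usual log-normal relationship \(\text{CV} = \sqrt{e^{\sigma^2} - 1}\). The helper names `cv_to_sigma` and `sigma_to_cv` are hypothetical and are not part of Pumas.

```julia
# Hypothetical helpers (not part of the Pumas API).
# They assume a log-normal model, where σ is the standard deviation
# of the log-transformed endpoint and CV is given as a fraction.
cv_to_sigma(cv) = sqrt(log(1 + cv^2))
sigma_to_cv(σ) = sqrt(exp(σ^2) - 1)

# The 30% threshold expressed on the log scale (displayed as 0.294 in the output)
σ_min = cv_to_sigma(0.30)
println(round(σ_min, digits = 3))

# A log-scale value such as σ̂ᵣ = 0.1054 converted back to a CV in percent
println(100 * sigma_to_cv(0.1054))
```

With these helpers, a value of σ̂ᵣ = 0.1054 corresponds to a CVᵣ of roughly 10.57%, which is below the 30% threshold.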

As an application of this, consider the SLTGSF2020_DS16 dataset, where we purposefully take only rows 5:30 of the dataset to create modified_fully_replicate_data_1. This subset of the data is just an example where CVᵣ does not meet the threshold.

fully_replicate_data = dataset(joinpath("bioequivalence", "RTTR_TRRT", "SLTGSF2020_DS16"));
modified_fully_replicate_data_1 = fully_replicate_data[5:30, :]
simple_table(modified_fully_replicate_data_1)
id sequence period PK
2 RTTR 1 0.202
2 RTTR 2 0.178
2 RTTR 3 0.208
2 RTTR 4 0.36
3 TRRT 1 0.433
3 TRRT 2 0.313
3 TRRT 3 0.213
3 TRRT 4 0.234
4 RTTR 1 1.4
4 RTTR 2 0.628
4 RTTR 3 2.21
4 RTTR 4 2.45
5 RTTR 1 1.72
5 RTTR 2 1.31
5 RTTR 3 2.25
5 RTTR 4 3.11
6 TRRT 1 0.256
6 TRRT 2 1.1
6 TRRT 3 0.528
6 TRRT 4 1.14
7 RTTR 1 0.436
7 RTTR 2 0.278
7 RTTR 3 0.483
7 RTTR 4 0.642
8 TRRT 1 0.592
8 TRRT 2 0.359

Now if the trial protocol indicates RSABE for HVD, we use the FDA_HighlyVariable in pumas_be:

pumas_be(modified_fully_replicate_data_1, FDA_HighlyVariable, endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3 4
RTTR 4 4 4 4
TRRT 3 3 2 2
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 0.5819
Geometric Naive Mean 0.6678
T Geometric Marginal Mean 0.5235
Geometric Naive Mean 0.5427
Geometric Mean T/R Ratio (%) 89.96
Degrees of Freedom 9.655
90% Confidence Interval (%) [62.37, 129.8] Fail CI ⊆ [80, 125]
Variability CVᵣ (%) | σ̂ᵣ 10.57 | 0.1054 Not OK for RS CVᵣ ≥ Minimal CVᵣ
CVₜ (%) | σ̂ₜ 65.5 | 0.5975
Variability Ratio (%) 566.8
ANOVA Formulation (p-value) 0.6112
Sequence (p-value) 0.4157
Period (p-value) 0.00366
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 80.47
Standard Error (Log Scale) 0.0887
90% Confidence Interval (%) [66.6, 97.23]
Degrees of Freedom 4
Howe's Approx RSABE Stat (95%) 0.1565

In this particular case, with the modified_fully_replicate_data_1 dataset, you can see Not OK for RS in the Assessment column. This is because the required criterion, CVᵣ ≥ Minimal CVᵣ, was not met. Hence in this case, the trial is treated as a standard ABE trial with the CI ⊆ [80, 125] criterion.

Here, as it is a standard ABE trial and the confidence interval is [62.37, 129.8], we Fail. Had the confidence interval been a subset of [80, 125], we would have obtained a Pass, even though we would still be in Not OK for RS.

Note that the Reference Scaling Params and Reference Scaling Analysis sections are still displayed, even though the low CVᵣ implies we carried out standard ABE.

Reference CV above threshold implies RSABE criteria

Now let us slightly modify the dataset by selecting rows 5:35 of the original dataset. We get the following output:

modified_fully_replicate_data_2 = fully_replicate_data[5:35, :]
pumas_be(modified_fully_replicate_data_2, FDA_HighlyVariable, endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3 4
RTTR 4 4 4 4
TRRT 4 4 4 3
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 0.5468
Geometric Naive Mean 0.5468
T Geometric Marginal Mean 0.4757
Geometric Naive Mean 0.5116
Geometric Mean T/R Ratio (%) 87 Pass GMR ∈ [80, 125]
Degrees of Freedom 18.43
90% Confidence Interval (%) [66.74, 113.5]
Variability CVᵣ (%) | σ̂ᵣ 43.86 | 0.4195 OK for RS CVᵣ ≥ Minimal CVᵣ
CVₜ (%) | σ̂ₜ 59.47 | 0.5503
Variability Ratio (%) 131.2
ANOVA Formulation (p-value) 0.3745
Sequence (p-value) 0.3212
Period (p-value) 0.04397
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 87
Standard Error (Log Scale) 0.0957
90% Confidence Interval (%) [71.74, 105.5]
Degrees of Freedom 6
Howe's Approx RSABE Stat (95%) -0.005847 Pass ≤ 0

In this case, the estimated CVᵣ at 43.86 is above the minimal value, thus we get OK for RS. With this indication, the acceptance/rejection of bioequivalence depends on two separate criteria that must both be met:

  1. Point estimate: The point estimate of the GMR needs to fall within the bounds. This is denoted as GMR ∈ [80, 125] in the Criteria column. In our case since the GMR is 87, it falls within the bounds and hence we Pass. (Note that it is just the point estimate, and not the 90% confidence interval that is considered).
  2. Reference scaling statistic: The statistic denoted as Howe's Approx RSABE Stat is a 95% upper confidence bound for the linearized criterion described later in this unit, and it must not be greater than 0. This is denoted as ≤ 0 in the Criteria column. In our case, the statistic is -0.005847, which is negative, and hence we Pass.

Note that the point estimate criterion is less stringent than the standard ABE criterion because the GMR estimate is always contained inside the confidence interval. Also note the use of the mathematical symbol ∈ (“element of”) for that criterion, in contrast to the symbol ⊆ (“subset of”) used for standard ABE. This criterion was introduced by the FDA as an extra safety measure to ensure that we never approve a drug whose point estimate of the GMR falls outside the standard 80-125 bounds.

The second criterion, the reference scaling statistic, lies at the heart of the RSABE approach and we describe it in more detail below. At this point, consider that it essentially implies that bioequivalence passed with adjusted (or implied) bounds, which were adjusted based on the estimate of the within subject variability of the reference product.

In any case, to summarize, since we are in the reference CV above threshold case, to pass RSABE for HVD both the point estimate and the reference scaling criteria need to pass.

Here are some situations that are OK for RS but are considered not bioequivalent because one or both of the criteria fail. Each of these cases is synthetically selected by choosing different subsets of the original dataset:

modified_fully_replicate_data_3 = fully_replicate_data[:, :] # subset is all of data
pumas_be(modified_fully_replicate_data_3, FDA_HighlyVariable, endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3 4
RTTR 20 20 20 20
TRRT 18 18 18 18
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 0.5916
Geometric Naive Mean 0.5957
T Geometric Marginal Mean 0.4663
Geometric Naive Mean 0.4688
Geometric Mean T/R Ratio (%) 78.83 Fail GMR ∈ [80, 125]
Degrees of Freedom 86.56
90% Confidence Interval (%) [69.31, 89.66]
Variability CVᵣ (%) | σ̂ᵣ 49.72 | 0.47 OK for RS CVᵣ ≥ Minimal CVᵣ
CVₜ (%) | σ̂ₜ 51.41 | 0.4843
Variability Ratio (%) 103.1
ANOVA Formulation (p-value) 0.00283
Sequence (p-value) 0.5266
Period (p-value) 0.03876
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 78.83
Standard Error (Log Scale) 0.0528
90% Confidence Interval (%) [72.12, 86.18]
Degrees of Freedom 36
Howe's Approx RSABE Stat (95%) -0.04805 Pass ≤ 0
modified_fully_replicate_data_4 = fully_replicate_data[120:end, :]
pumas_be(modified_fully_replicate_data_4, FDA_HighlyVariable, endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3 4
RTTR 6 6 6 6
TRRT 2 2 2 3
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 0.7263
Geometric Naive Mean 0.644
T Geometric Marginal Mean 0.6139
Geometric Naive Mean 0.5748
Geometric Mean T/R Ratio (%) 84.52 Pass GMR ∈ [80, 125]
Degrees of Freedom 11.45
90% Confidence Interval (%) [57.29, 124.7]
Variability CVᵣ (%) | σ̂ᵣ 32.98 | 0.3213 OK for RS CVᵣ ≥ Minimal CVᵣ
CVₜ (%) | σ̂ₜ 65.97 | 0.6011
Variability Ratio (%) 187.1
ANOVA Formulation (p-value) 0.4546
Sequence (p-value) 0.8717
Period (p-value) 0.6823
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 85.37
Standard Error (Log Scale) 0.1356
90% Confidence Interval (%) [65.59, 111.1]
Degrees of Freedom 6
Howe's Approx RSABE Stat (95%) 0.1009 Fail ≤ 0
modified_fully_replicate_data_5 = fully_replicate_data[5:50, :]
pumas_be(modified_fully_replicate_data_5, FDA_HighlyVariable, endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3 4
RTTR 4 4 4 4
TRRT 8 8 7 7
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 0.7028
Geometric Naive Mean 0.6366
T Geometric Marginal Mean 0.5618
Geometric Naive Mean 0.5268
Geometric Mean T/R Ratio (%) 79.94 Fail GMR ∈ [80, 125]
Degrees of Freedom 29.22
90% Confidence Interval (%) [65.61, 97.4]
Variability CVᵣ (%) | σ̂ᵣ 38.86 | 0.375 OK for RS CVᵣ ≥ Minimal CVᵣ
CVₜ (%) | σ̂ₜ 49.97 | 0.4721
Variability Ratio (%) 125.9
ANOVA Formulation (p-value) 0.0639
Sequence (p-value) 0.5747
Period (p-value) 0.00529
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 79.82
Standard Error (Log Scale) 0.078
90% Confidence Interval (%) [69.19, 92.09]
Degrees of Freedom 9
Howe's Approx RSABE Stat (95%) 0.03767 Fail ≤ 0
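
The decision rule illustrated by the five outputs above can be summarized in a few lines. This is only a sketch of the logic as described in this unit, not the Pumas implementation. The function name `rsabe_hvd_decision` and its keyword arguments are hypothetical, and the inputs are the rounded values copied from the outputs.

```julia
# Hypothetical helper (not a Pumas function) mirroring the decision rule
# described in this unit. All values are on the % scale, except `howe`,
# which is the Howe's Approx RSABE Stat (95%).
function rsabe_hvd_decision(; cv_r, gmr, ci, howe, min_cv = 30.0)
    if cv_r < min_cv
        # Not OK for RS: fall back to standard ABE, CI ⊆ [80, 125]
        return (80.0 <= ci[1] && ci[2] <= 125.0) ? "Pass (ABE)" : "Fail (ABE)"
    end
    # OK for RS: both criteria must hold
    point_ok = 80.0 <= gmr <= 125.0   # GMR ∈ [80, 125]
    howe_ok = howe <= 0               # statistic ≤ 0
    return (point_ok && howe_ok) ? "Pass (RSABE)" : "Fail (RSABE)"
end

# Rounded values copied from the five outputs above
examples = [
    (cv_r = 10.57, gmr = 89.96, ci = (62.37, 129.8), howe = 0.1565),
    (cv_r = 43.86, gmr = 87.0, ci = (66.74, 113.5), howe = -0.005847),
    (cv_r = 49.72, gmr = 78.83, ci = (69.31, 89.66), howe = -0.04805),
    (cv_r = 32.98, gmr = 84.52, ci = (57.29, 124.7), howe = 0.1009),
    (cv_r = 38.86, gmr = 79.94, ci = (65.61, 97.4), howe = 0.03767),
]

for (i, e) in enumerate(examples)
    println("modified_fully_replicate_data_", i, ": ", rsabe_hvd_decision(; e...))
end
```

Only the second dataset passes. The first fails under standard ABE, the third fails the point estimate criterion, the fourth fails the reference scaling statistic, and the fifth fails both.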

A partial replicate example

As you can see, the threshold decisions of RSABE for HVD do not depend on the within subject variability of the test product. They only depend on the reference product. This also allows us to use a partial replicate design.

partial_replicate_data =
    dataset(joinpath("bioequivalence", "RRT_RTR_TRR", "SLTGSF2020_DS07"));

Here are the rows with the smallest endpoint values and the rows with the highest endpoint values. This is just a means to get some sort of feeling for the data:

simple_table(first(sort(partial_replicate_data, :PK), 10))
id sequence period PK
134 RTR 2 11.9
271 TRR 1 14.8
271 TRR 3 15.4
259 TRR 1 15.9
191 RTR 2 16.8
271 TRR 2 19.3
134 RTR 3 19.7
73 RRT 3 19.8
278 TRR 1 19.9
93 RRT 1 21
simple_table(last(sort(partial_replicate_data, :PK), 10))
id sequence period PK
75 RRT 2 348
159 RTR 3 353
273 TRR 3 377
311 TRR 3 379
111 RRT 2 384
170 RTR 1 387
273 TRR 1 432
359 TRR 2 436
49 RRT 2 450
292 TRR 1 542

Now we can use pumas_be with FDA_HighlyVariable:

pumas_be(partial_replicate_data, FDA_HighlyVariable; endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3
RRT 120 120 120
RTR 120 120 120
TRR 120 120 120
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 99.41
Geometric Naive Mean 99.41
T Geometric Marginal Mean 89.05
Geometric Naive Mean 89.05
Geometric Mean T/R Ratio (%) 89.58 Pass GMR ∈ [80, 125]
Degrees of Freedom 357.7
90% Confidence Interval (%) [86.44, 92.83]
Variability CVᵣ (%) | σ̂ᵣ 34.23 | 0.3329 OK for RS CVᵣ ≥ Minimal CVᵣ
ANOVA Formulation (p-value) 0
Sequence (p-value) 0.08425
Period (p-value) 0.7295
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 89.58
Standard Error (Log Scale) 0.0216
90% Confidence Interval (%) [86.44, 92.83]
Degrees of Freedom 357
Howe's Approx RSABE Stat (95%) -0.06284 Pass ≤ 0

Designs for estimating CVᵣ and not supporting RSABE

Note that there are designs where in principle we could have carried out RSABE since CVᵣ is estimated, but RSABE is not supported since it is not in the FDA (2011) guidance. Here is an example (where we display the first few rows of the data with the lowest AUC):

dual_data = dataset(joinpath("bioequivalence", "RTT_TRR", "PJ2017_4_1"))
simple_table(first(sort(dual_data, :AUC), 10))
id sequence period AUC Cmax
167 TRR 2 3.31 0.32
124 TRR 1 5.82 0.347
186 RTT 1 6.06 0.31
157 TRR 2 7.16 0.459
142 RTT 2 8.08 0.65
179 RTT 3 9.82 0.71
179 RTT 1 10.1 1.02
157 TRR 3 10.9 0.756
138 RTT 2 11 0.62
101 TRR 3 11.3 0.533

If we tried to use pumas_be with FDA_HighlyVariable on this data, we would get an error:

pumas_be(dual_data, FDA_HighlyVariable) # throws an error!

Still, we may use this design with standard average bioequivalence. To be explicit about this, we may pass StandardBioequivalenceCriterion as the second argument. This is in fact the default value of that argument, so there is no need to specify it. Still, here it is as an example:

pumas_be(dual_data, StandardBioequivalenceCriterion)
Observation Counts
Sequence ╲ Period 1 2 3
RTT 46 45 43
TRR 47 47 47
Paradigm: Replicated crossover that does not support reference scaling
Model: Mixed model (unequal variance)
Criteria: Standard ABE
Endpoint: AUC
Formulations: Reference(R), Test(T)
Results(AUC) Assessment Criteria
R Geometric Marginal Mean 102.8
Geometric Naive Mean 112
T Geometric Marginal Mean 100.1
Geometric Naive Mean 93.96
Geometric Mean T/R Ratio (%) 97.38
Degrees of Freedom 148.7
90% Confidence Interval (%) [86.86, 109.2] Pass CI ⊆ [80, 125]
Variability CVᵣ (%) | σ̂ᵣ 42.75 | 0.4097
CVₜ (%) | σ̂ₜ 69.65 | 0.6289
ANOVA Formulation (p-value) 0.7013
Sequence (p-value) 0.2506
Period (p-value) 0.01209

Observe that with this design the Paradigm is Replicated crossover that does not support reference scaling. Importantly we can estimate the within subject variability both for the reference (CVᵣ is 42.75) and for the test (CVₜ is 69.65). Using such a design in a pilot study can enable us to get preliminary estimates of the variability so that we can use them in a pivotal study down the road, perhaps with RSABE if we choose to apply that approach.

3 The rationale of RSABE (focused on HVD)

The RSABE approach for highly variable drugs is set out in FDA (2011). Further, as mentioned above, the FDA video Schuirmann (2023) is a short and neat additional description of the approach.

The basic hypothesis formulation

Let us go through the basics for the approach, and see the hypothesis formulation of RSABE.

We start with the geometric mean ratio (\(\text{GMR}\)) and the log transformed means, \(\mu_T\) and \(\mu_R\):

\[ \text{GMR} = \frac{\text{GM}_T}{\text{GM}_R} = \frac{e^{\mu_T}}{e^{\mu_R}}. \]

Now we can set \(\Delta = 1.25\) (and for NTID it is \(\Delta = 1.11111\)), and with standard ABE, say that R and T are bioequivalent if:

\[ \underbrace{\frac{1}{\Delta}}_{0.8} < \frac{e^{\mu_T}}{e^{\mu_R}} < \underbrace{\Delta}_{1.25}. \]

Take logarithms:

\[ \underbrace{-\log \Delta}_{-0.22314} < \mu_T - \mu_R < \underbrace{\log \Delta}_{0.22314}. \]

An equivalent inequality (where \(|x|\) is the absolute value of \(x\)) is:

\[ |\mu_T - \mu_R| < \log \Delta. \]

And further an equivalent inequality is:

\[ (\mu_T - \mu_R)^2 < (\log \Delta)^2. \]

This last (squaring based) representation of ABE is useful as it allows us to see how RSABE is a generalization of ABE.

Scaling

Now consider scaling the right hand side by a value \(\kappa\) that depends on the within subject variability. It is this value which influences the effective bounds:

\[ (\mu_T - \mu_R)^2 < (\log \Delta)^2 ~\kappa, \]

  • If \(\kappa = 1\) there is no change.
  • If \(\kappa < 1\) there are tighter bounds.
  • If \(\kappa > 1\) there are wider bounds.

An equivalent inequality is:

\[ |\mu_T - \mu_R| < (\log \Delta) ~\sqrt{\kappa}, \]

or

\[ -(\log \Delta) ~\sqrt{\kappa} ~<~ \mu_T - \mu_R ~<~ (\log \Delta) ~\sqrt{\kappa}. \]

To get \(\kappa\), denote \(\sigma^2_{WR}\) as the within subject variability of the reference product, represented as a variance of the log transformed endpoint. Also denote \(\sigma_{W_0} = 0.25\) as a comparison constant. We now set,

\[ \kappa = \frac{\sigma^2_{WR}}{\sigma^2_{W_0}}. \]

The implied limits are then:

\[ |\mu_T - \mu_R| ~<~ (\log \Delta) \frac{\sigma_{WR}}{\sigma_{W_0}}. \]

Observe that as \(\sigma_{WR}\) increases the limits get wider (HVD), and as \(\sigma_{WR}\) decreases the limits get narrower (NTID).

Note that if we rearrange, we get:
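
To get a feeling for the size of this effect, here is a minimal sketch that evaluates the implied limits on the ratio scale for a few values of \(\sigma_{WR}\), using the HVD constants \(\Delta = 1.25\) and \(\sigma_{W_0} = 0.25\). The function name `implied_limits` is hypothetical, and the values 0.294 and 0.4195 are simply taken from the outputs earlier in this unit. Keep in mind that for HVD the scaling is only used when CVᵣ is at or above the threshold.

```julia
# Hypothetical helper: implied limits on the ratio scale for a given σ_WR,
# based on |μ_T - μ_R| < log(Δ) * σ_WR / σ_W₀.
function implied_limits(σ_WR; Δ = 1.25, σ_W₀ = 0.25)
    half_width = log(Δ) * σ_WR / σ_W₀   # bound on the log scale
    return (exp(-half_width), exp(half_width))
end

for σ_WR in (0.25, 0.294, 0.4195)
    lo, hi = implied_limits(σ_WR)
    println("σ_WR = ", σ_WR, ": [",
        round(100 * lo, digits = 2), ", ", round(100 * hi, digits = 2), "]")
end
```

At \(\sigma_{WR} = \sigma_{W_0} = 0.25\) the limits coincide with the standard 80-125, while at the threshold value 0.294 they are roughly 76.9 to 130.0.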

\[ \frac{(\mu_T - \mu_R)^2}{\sigma_{WR}^2} ~<~ \frac{(\log \Delta)^2}{\sigma_{W_0}^2}. \]

The regulatory constant, linearized criterion, and hypothesis formulation

We can define the regulatory constant \(\theta\) as:

\[ \theta = \frac{(\log \Delta)^2}{\sigma_{W_0}^2}. \]

Thus the inequality for BE is:

\[ \frac{(\mu_T - \mu_R)^2}{\sigma_{WR}^2} < \theta. \]

Or in one last form, after rearranging, we have what is called the linearized criterion:

\[ (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} < 0. \]

It is really this linearized criterion which we consider in an hypothesis test. Hence finally here is the hypothesis test for reference scaled average bioequivalence:

\[ {\cal T}_{\text{RSABE}} = \begin{cases} H_{0}:& (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} \ge 0, \\[5pt] H_{1}:& (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} < 0.\\ \end{cases} \]

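Here (as usual), as in a TOST, \(H_{0}\) represents non-bioequivalence and \(H_{1}\) represents bioequivalence. Hence rejecting \(H_{0}\) means we conclude bioequivalence.

What is \(\theta\) for HVD? Based on the draft guidance FDA (2011), we have: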

Δ = 1.25
σ_W₀ = 0.25
θ = (log(Δ) / σ_W₀)^2
0.7966887118898779

What is \(\theta\) for NTID? Based on the draft guidance FDA (2012), we have:

Δ = 1 / 0.9 #1.11111
σ_W₀ = 0.1
θ = (log(Δ) / σ_W₀)^2
1.1100838259683068

4 Implementing the hypothesis test

The hypothesis formulation \({\cal T}_{\text{RSABE}}\) can be implemented in different ways. The RSABE approach suggests a test statistic based on a \(95\%\) upper confidence bound for the linearized criterion. That statistic is compared to the threshold level \(0\). If it is not greater than \(0\), we reject \(H_0\) and conclude bioequivalence; otherwise we do not reject.

We use Howe’s Approximation I (Howe (1974)) and a few other subtle computations suggested in FDA (2011).

SAS code from the FDA guidance

Here are snippets of SAS code suggested in that guidance:

x=estimate**2-stderr**2;
boundx=(max(abs(lower),abs(upper)))**2;
theta=((log(1.25))/0.25)**2;
y=-theta*s2wr;
boundy=y*dfd/cinv(0.95,dfd);
sWR=sqrt(s2wr);
critbound=(x+y)+sqrt(((boundx-x)**2)+((boundy-y)**2));

Here the key quantities are:

  • estimate is the log-GMR based on the model used for within subject variability estimation.
  • stderr is the standard error of that parameter from that model.
  • lower and upper are 90% confidence bounds from that model. These are used within Howe’s approximation.
  • s2wr is the estimate of the variance \(\sigma^2_{WR}\) from that model (its square root, sWR, estimates \(\sigma_{WR}\)).
  • dfd are the degrees of freedom of that model.

Further in SAS:

  • ** is exponentiation, so **2 is squaring.
  • abs, max, and sqrt are functions as you may expect.
  • cinv is the quantile of a Chi-square distribution.

This computation then yields critbound which in Pumas we call the Howe's Approx RSABE Stat (95%).
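
For reference, the SAS snippet above can be written as a single formula. This is only a restatement of the code in mathematical notation. The symbols \(E_x, C_x, E_y, C_y\) are notation introduced here (they correspond to x, boundx, y, and boundy in the SAS code), \(\hat{\delta}\) is estimate, \(\text{se}\) is stderr, \(L\) and \(U\) are lower and upper, \(s^2_{WR}\) is s2wr, \(\nu\) is dfd, and \(\chi^2_{0.95, \nu}\) is the value returned by cinv(0.95, dfd):

\[ E_x = \hat{\delta}^2 - \text{se}^2, \qquad C_x = \big(\max(|L|, |U|)\big)^2, \qquad E_y = -\theta \, s^2_{WR}, \qquad C_y = \frac{-\theta \, s^2_{WR} \, \nu}{\chi^2_{0.95, \nu}}, \]

\[ \text{critbound} = (E_x + E_y) + \sqrt{(C_x - E_x)^2 + (C_y - E_y)^2}. \]

Here \(E_x + E_y\) is a point estimate of the linearized criterion \((\mu_T - \mu_R)^2 - \theta \sigma^2_{WR}\), and the square root term pushes it up to an approximate \(95\%\) upper confidence bound.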

A compatible Pumas/Julia computation

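Here, for example, is how the computation is carried out inside Pumas once we are given β (similar to SAS’s estimate), lower_bound (similar to SAS’s lower), upper_bound (similar to SAS’s upper), and se (similar to SAS’s stderr). By comparison with the SAS code, σwᵣ plays the role of sWR, k the role of dfd, and level_y the role of the 0.95 level: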

x = β^2 - se^2
boundx = (max((abs(lower_bound)), (abs(upper_bound))))^2
y = -𝜃 * σwᵣ^2
boundy = y * k / quantile(Chisq(k), level_y)
# the square root matches SAS's critbound
howe_stat = x + y + sqrt(((boundx - x)^2) + ((boundy - y)^2))

We can now return to the last output to understand the related quantities under the Reference Scaling Analysis section:

pumas_be(partial_replicate_data, FDA_HighlyVariable; endpoint = :PK)
Observation Counts
Sequence ╲ Period 1 2 3
RRT 120 120 120
RTR 120 120 120
TRR 120 120 120
Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)
Results(PK) Assessment Criteria
R Geometric Marginal Mean 99.41
Geometric Naive Mean 99.41
T Geometric Marginal Mean 89.05
Geometric Naive Mean 89.05
Geometric Mean T/R Ratio (%) 89.58 Pass GMR ∈ [80, 125]
Degrees of Freedom 357.7
90% Confidence Interval (%) [86.44, 92.83]
Variability CVᵣ (%) | σ̂ᵣ 34.23 | 0.3329 OK for RS CVᵣ ≥ Minimal CVᵣ
ANOVA Formulation (p-value) 0
Sequence (p-value) 0.08425
Period (p-value) 0.7295
Reference Scaling Params Reference Scaling Constant 0.7967
Minimal CVᵣ for Reference Scaling (%) 30.0 | 0.294
Reference Scaling Analysis Geometric Mean T/R Ratio (%) 89.58
Standard Error (Log Scale) 0.0216
90% Confidence Interval (%) [86.44, 92.83]
Degrees of Freedom 357
Howe's Approx RSABE Stat (95%) -0.06284 Pass ≤ 0

You may now work out these computations to verify Howe's Approx RSABE Stat (95%).
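
As a starting point, here is a sketch of such a verification that uses only the rounded values displayed in the output above and follows the SAS formula (including its square root). To keep the snippet free of package dependencies, the chi-square quantile is replaced by the Wilson-Hilferty approximation. This substitution is an assumption made here for illustration and is not what Pumas does. Because of the rounded inputs and the approximation, the result only agrees with the reported statistic to about three decimal places.

```julia
# Inputs read off the Reference Scaling Analysis section (rounded as displayed)
β = log(89.58 / 100)             # log of the GMR
se = 0.0216                      # standard error on the log scale
lower_bound = log(86.44 / 100)   # 90% confidence bounds on the log scale
upper_bound = log(92.83 / 100)
σwr = 0.3329                     # σ̂ᵣ from the Variability section
k = 357                          # degrees of freedom
θ = (log(1.25) / 0.25)^2         # reference scaling constant, about 0.7967

# Wilson-Hilferty approximation to the 95% chi-square quantile with k degrees
# of freedom (an illustrative stand-in for quantile(Chisq(k), 0.95))
z95 = 1.6448536269514722         # 95% quantile of the standard normal
chisq_95 = k * (1 - 2 / (9 * k) + z95 * sqrt(2 / (9 * k)))^3

x = β^2 - se^2
boundx = max(abs(lower_bound), abs(upper_bound))^2
y = -θ * σwr^2
boundy = y * k / chisq_95
howe_stat = x + y + sqrt((boundx - x)^2 + (boundy - y)^2)

println(howe_stat)   # compare with the reported -0.06284
```

The point estimate of the linearized criterion, `x + y`, is negative here, and the upper bound `howe_stat` remains negative as well, which is why the output shows a Pass for the ≤ 0 criterion.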

5 Conclusion

In this unit, we introduced the concepts and operational details of reference scaled average bioequivalence (RSABE), focusing on its application for highly variable drug products (HVDs) according to FDA guidance. We reviewed the motivations for reference scaling, emphasizing that acceptance criteria for bioequivalence can be adjusted based on the within-subject variability of the reference product, rather than fixed limits alone. Through practical examples using replicate study designs and the Pumas bioequivalence tools, we demonstrated how to determine when RSABE should be applied versus standard average bioequivalence, and clarified the two main criteria for RSABE: the point estimate of the geometric mean ratio (GMR) and the specialized reference scaling test statistic. We also explored the mathematical rationale behind RSABE, including its scaling mechanism that widens or narrows BE limits based on variability, and we discussed the hypothesis-testing approach that underpins regulatory decision-making. This unit thus provides a comprehensive foundation for understanding and implementing RSABE in the context of HVDs, setting the stage for the subsequent unit on reference scaling for narrow therapeutic index drugs (NTIDs).

6 Unit exercises

  1. Core Concepts of Reference Scaling

    1. In your own words, explain why reference scaling (RSABE) is used for highly variable drugs.
    2. What is the key statistic from the reference product that determines whether RSABE can be applied?
    3. How do the acceptance limits change for HVDs under RSABE as compared to standard average bioequivalence (ABE)?
  2. RSABE Decision Scenarios

    Suppose you are analyzing a fully replicate dataset for an HVD using the FDA RSABE approach:

    • In Dataset X, the estimated within-subject CV for the reference product (CVᵣ) is 28%.
    • In Dataset Y, CVᵣ is 42%.

    For each dataset, answer:

    1. Should RSABE be applied, or should standard ABE criteria be used?
    2. Briefly describe the criteria that must be met for the trial to pass bioequivalence under each scenario.
  3. Interpreting Pumas Output

    Given the following snippet of results from a Pumas FDA_HighlyVariable RSABE analysis (focus on the Assessment column):

    Criteria type          | Statistic name           | Value | Criteria        | Assessment
    Point Estimate         | GMR                      | 90.0  | ∈ [80, 125]     | Pass
    Reference Scaling Stat | Howe’s Approx RSABE Stat | -0.23 | ≤ 0             | Pass
    Overall                | -                        | -     | Both above pass | Pass

    1. What does the Point Estimate criterion require, and was it met?
    2. What is the meaning of the Reference Scaling Stat, and was it met in this example?
    3. Why do both criteria need to be individually passed for the study to be considered bioequivalent via RSABE?
  4. Mathematical Rationale

    1. Write the linearized criterion for RSABE as described in the unit, naming all variables.
    2. For an HVD, what is the purpose of the regulatory constant \(\theta\) and how is it calculated? (Give the formula in terms of \(\Delta\) and \(\sigma_{W_0}\).)
    3. Explain how changing the estimate of \(\sigma_{WR}\) affects the implied bioequivalence limits.
  5. Study Design and Regulatory Guidance

    1. List the replicate study designs that are eligible for the FDA’s RSABE approach for HVDs, based on this unit.
    2. Why do dual-sequence full replicate or partial replicate designs allow the estimation necessary for RSABE, while some other designs do not?
    3. Briefly explain what you might do with a study dataset that does not support RSABE but still provides within-subject variability estimates for reference and test products.

References

Choi, Sungwoo. 2023. “Video: FDA Draft Guidance on Statistical Approaches to Establishing Bioequivalence. Section: Statistical Test for Population Bioequivalence.” 2023. https://www.youtube.com/watch?v=VGE1KNFpPQQ&t=960s.
FDA. 2011. “FDA, Draft Guidance on Progesterone. Recommended Apr 2010; Revised Feb 2011.” 2011. https://www.accessdata.fda.gov/drugsatfda_docs/psg/Progesterone_caps_19781_RC02-11.pdf.
———. 2012. “FDA, Draft Guidance on Warfarin Sodium. Recommended Dec 2012.” 2012. https://www.accessdata.fda.gov/drugsatfda_docs/psg/Warfarin_Sodium_tab_09218_RC12-12.pdf.
Howe, WG. 1974. “Approximate Confidence Limits on the Mean of x+ y Where x and y Are Two Tabled Independent Random Variables.” Journal of the American Statistical Association.
Patterson, Scott D, and Byron Jones. 2017. Bioequivalence and Statistics in Clinical Pharmacology. Chapman; Hall/CRC. https://www.routledge.com/Bioequivalence-and-Statistics-in-Clinical-Pharmacology/Patterson-Jones/p/book/9780367782443.
Schuirmann, Donald. 2023. “Video: FDA Draft Guidance on Statistical Approaches to Establishing Bioequivalence. Section: Statistical Test for Population Bioequivalence.” 2023. https://www.youtube.com/watch?v=VGE1KNFpPQQ&t=1657s.
