A Course on Bioequivalence: Unit 12 - Reference Scaling Part I

1 Unit overview

This is the first of two units dealing with reference scaling, sometimes known as reference scaled average bioequivalence (RSABE). A related approach is average bioequivalence with expanding limits (ABEL). Both arose from an earlier approach called population bioequivalence and a related one called individual bioequivalence.

In all of these cases, a unifying overarching idea is that the acceptance criteria for bioequivalence also depend on the variability of the product(s), and not only on their means. See Chapter 7 of Patterson and Jones (2017) for a general overview of these topics.

While population bioequivalence is historically important, it is not our focus here because it has mostly been abandoned for human trials. Note that it is still recommended for some in vitro bioequivalence studies, which are not the focus of this course. For more information about current recommendations for population bioequivalence, a starting point is the FDA video Choi (2023). We do not discuss population bioequivalence or individual bioequivalence further.

Within the range of topics of reference scaling, our focus in this unit and the next is on reference scaled average bioequivalence (RSABE). A succinct description of this approach is in the FDA video Schuirmann (2023).

The RSABE approach is used for two typical cases:

Highly variable drug products (HVDP or HVD) - These are drugs with a high CV for the within subject variability of the reference product, where a standard average bioequivalence (ABE) approach may often fail to show bioequivalence even when the products are bioequivalent. Namely, for HVDs, ABE may have a producer’s risk that is too high unless very large sample sizes are used. In this case, RSABE may essentially widen the effective limits beyond 80-125.
Narrow therapeutic index drugs (NTID) - As discussed in the previous unit, these are products which are very sensitive and require more stringent criteria. In this case, RSABE may essentially narrow the effective limits to be tighter than 80-125.

The first case, HVD, is covered not only by RSABE with the FDA approach, but also by other expanding limits approaches such as ABEL. In any case, using such approaches requires planning for them in the study design and trial approval. Our focus in this unit is on the FDA approach for HVD, namely RSABE for HVD. The main FDA document for this approach is FDA (2011).

The second case, NTID, is also handled by the FDA via reference scaling, and this is the focus of the next unit. As seen in the previous unit, NTID can also be handled in a simple way by narrowing the limits by a fixed amount (e.g. going from 80-125 to 90-111.11), yet the FDA approach outlined in FDA (2012) relies on RSABE and is more intricate.

We use the following packages.

using Bioequivalence  # available with Pumas products
using BioequivalencePower  # available with Pumas products
using PharmaDatasets # available with Pumas products
using SummaryTables

2 An Operational Description of RSABE for HVD

Let us first consider the basics of carrying out RSABE for HVD based on FDA (2011).

The approach is based on a three period, three sequence partial replicate design or a four period, two sequence fully replicate design. Thus the designs supported are:

Partial replicate: RRT|RTR|TRR.
Fully replicate: RTRT|TRTR, RTTR|TRRT, or RRTT|TTRR.

In both cases, an important ingredient is the estimation of within subject variability for the reference product. This is done via the auxiliary estimator described in Unit 10. The estimated variability is presented and denoted as CVᵣ.

Reference CV below threshold implies standard ABE criteria

If CVᵣ is less than the Minimal CVᵣ for Reference Scaling, which is set by the FDA at \(30\%\), then even if the study protocol specified RSABE, we simply carry out standard ABE.

As an application of this consider the SLTGSF2020_DS16 dataset example where we purposefully take only rows 5:30 of the dataset to create modified_fully_replicate_data_1. This subset of the data is just an example where CVᵣ does not meet the threshold.

fully_replicate_data = dataset(joinpath("bioequivalence", "RTTR_TRRT", "SLTGSF2020_DS16"));
modified_fully_replicate_data_1 = fully_replicate_data[5:30, :]
simple_table(modified_fully_replicate_data_1)


id	sequence	period	PK

2	RTTR	1	0.202
2	RTTR	2	0.178
2	RTTR	3	0.208
2	RTTR	4	0.36
3	TRRT	1	0.433
3	TRRT	2	0.313
3	TRRT	3	0.213
3	TRRT	4	0.234
4	RTTR	1	1.4
4	RTTR	2	0.628
4	RTTR	3	2.21
4	RTTR	4	2.45
5	RTTR	1	1.72
5	RTTR	2	1.31
5	RTTR	3	2.25
5	RTTR	4	3.11
6	TRRT	1	0.256
6	TRRT	2	1.1
6	TRRT	3	0.528
6	TRRT	4	1.14
7	RTTR	1	0.436
7	RTTR	2	0.278
7	RTTR	3	0.483
7	RTTR	4	0.642
8	TRRT	1	0.592
8	TRRT	2	0.359

Now if the trial protocol indicates RSABE for HVD, we use the FDA_HighlyVariable in pumas_be:

pumas_be(modified_fully_replicate_data_1, FDA_HighlyVariable, endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3	4

RTTR	4	4	4	4
TRRT	3	3	2	2


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	0.5819
	Geometric Naive Mean	0.6678
T	Geometric Marginal Mean	0.5235
	Geometric Naive Mean	0.5427
	Geometric Mean T/R Ratio (%)	89.96
	Degrees of Freedom	9.655
	90% Confidence Interval (%)	[62.37, 129.8]	Fail	CI ⊆ [80, 125]
Variability	CVᵣ (%) \| σ̂ᵣ	10.57 \| 0.1054	Not OK for RS	CVᵣ ≥ Minimal CVᵣ
	CVₜ (%) \| σ̂ₜ	65.5 \| 0.5975
	Variability Ratio (%)	566.8
ANOVA	Formulation (p-value)	0.6112
	Sequence (p-value)	0.4157
	Period (p-value)	0.00366
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	80.47
	Standard Error (Log Scale)	0.0887
	90% Confidence Interval (%)	[66.6, 97.23]
	Degrees of Freedom	4
	Howe's Approx RSABE Stat (95%)	0.1565

In this particular case, with the modified_fully_replicate_data_1 dataset, you can see the Not OK for RS in the Assessment column and this is because the required criteria, CVᵣ ≥ Minimal CVᵣ was not met. Hence in this case, the trial is treated as a standard ABE trial with the CI ⊆ [80, 125] criteria.

Here, as it is a standard ABE trial and the confidence interval is [62.37, 129.8], we Fail. If the confidence interval had been a subset of [80, 125] we would have obtained a Pass, even though we are still in Not OK for RS.

Note that the Reference Scaling Params and Reference Scaling Analysis sections are still displayed, even though the lower CVᵣ implies we carried out standard ABE.
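The output above reports each variability both as a CV and as a standard deviation on the log scale (for example `30.0 | 0.294` for the threshold). As a minimal sketch, assuming the usual log-normal relation between the two, the conversion can be reproduced as follows. The helper names `cv_to_sigma` and `sigma_to_cv` are hypothetical and are not part of Pumas.

```julia
# Convert between a within subject CV (as a fraction) and the standard
# deviation σ on the log scale, assuming a log-normal endpoint.
cv_to_sigma(cv) = sqrt(log(1 + cv^2))
sigma_to_cv(σ) = sqrt(exp(σ^2) - 1)

# Hypothetical helper names; the CV values are read from the output above.
σ_threshold = cv_to_sigma(0.30)    # the 30% threshold, about 0.294
σ_observed = cv_to_sigma(0.1057)   # the estimated CVᵣ of 10.57%, about 0.1054

println(round(σ_threshold, digits = 3))
println(round(σ_observed, digits = 4))
```

Under this assumption, comparing CVᵣ with 30% is the same as comparing σ̂ᵣ with roughly 0.294.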

Reference CV above threshold implies RSABE criteria

Now let us slightly modify the dataset by selecting rows 5:35 of the original dataset. We get the following output:

modified_fully_replicate_data_2 = fully_replicate_data[5:35, :]
pumas_be(modified_fully_replicate_data_2, FDA_HighlyVariable, endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3	4

RTTR	4	4	4	4
TRRT	4	4	4	3


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	0.5468
	Geometric Naive Mean	0.5468
T	Geometric Marginal Mean	0.4757
	Geometric Naive Mean	0.5116
	Geometric Mean T/R Ratio (%)	87	Pass	GMR ∈ [80, 125]
	Degrees of Freedom	18.43
	90% Confidence Interval (%)	[66.74, 113.5]
Variability	CVᵣ (%) \| σ̂ᵣ	43.86 \| 0.4195	OK for RS	CVᵣ ≥ Minimal CVᵣ
	CVₜ (%) \| σ̂ₜ	59.47 \| 0.5503
	Variability Ratio (%)	131.2
ANOVA	Formulation (p-value)	0.3745
	Sequence (p-value)	0.3212
	Period (p-value)	0.04397
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	87
	Standard Error (Log Scale)	0.0957
	90% Confidence Interval (%)	[71.74, 105.5]
	Degrees of Freedom	6
	Howe's Approx RSABE Stat (95%)	-0.005847	Pass	≤ 0

In this case, the estimated CVᵣ of 43.86 is above the minimal value, thus we get OK for RS. With this indication, the acceptance/rejection of bioequivalence depends on two separate criteria that must both be met:

Point estimate: The point estimate of the GMR needs to fall within the bounds. This is denoted as GMR ∈ [80, 125] in the Criteria column. In our case since the GMR is 87, it falls within the bounds and hence we Pass. (Note that it is just the point estimate, and not the 90% confidence interval that is considered).
Reference scaling statistic: The statistic denoted as Howe's Approx RSABE Stat is a 95% upper confidence bound and needs to be at most 0. In practice this is done by making sure the statistic is not greater than 0. This is denoted as ≤ 0 in the Criteria column. In our case, the statistic is -0.005847, which is negative, and hence we Pass.

Note that the point estimate criterion is less stringent than the standard ABE criterion because the GMR estimate is always contained inside the confidence interval. Also note the use of the mathematical symbol ∈ (an “element of”) for this criterion, in contrast to the mathematical symbol ⊆ (a “subset of”). This criterion was introduced by the FDA as an extra safety measure to ensure that we never approve a drug whose point estimate of the GMR falls outside the standard 80-125 bounds.

The second criterion, the reference scaling statistic, lies at the heart of the RSABE approach and we describe it in more detail below. At this point, consider that it essentially implies that bioequivalence passed with implied bounds that were adjusted based on the estimate of the within subject variability of the reference product.

In any case, to summarize, since we are in the reference CV above threshold case, to pass RSABE for HVD both the point estimate and the reference scaling criteria need to pass.

Here are some situations that are OK for RS but are considered not bioequivalent because one or both of the criteria fail. Each of these cases is synthetically created by choosing a different subset of the original dataset:

modified_fully_replicate_data_3 = fully_replicate_data[:, :] # subset is all of data
pumas_be(modified_fully_replicate_data_3, FDA_HighlyVariable, endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3	4

RTTR	20	20	20	20
TRRT	18	18	18	18


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	0.5916
	Geometric Naive Mean	0.5957
T	Geometric Marginal Mean	0.4663
	Geometric Naive Mean	0.4688
	Geometric Mean T/R Ratio (%)	78.83	Fail	GMR ∈ [80, 125]
	Degrees of Freedom	86.56
	90% Confidence Interval (%)	[69.31, 89.66]
Variability	CVᵣ (%) \| σ̂ᵣ	49.72 \| 0.47	OK for RS	CVᵣ ≥ Minimal CVᵣ
	CVₜ (%) \| σ̂ₜ	51.41 \| 0.4843
	Variability Ratio (%)	103.1
ANOVA	Formulation (p-value)	0.00283
	Sequence (p-value)	0.5266
	Period (p-value)	0.03876
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	78.83
	Standard Error (Log Scale)	0.0528
	90% Confidence Interval (%)	[72.12, 86.18]
	Degrees of Freedom	36
	Howe's Approx RSABE Stat (95%)	-0.04805	Pass	≤ 0

modified_fully_replicate_data_4 = fully_replicate_data[120:end, :]
pumas_be(modified_fully_replicate_data_4, FDA_HighlyVariable, endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3	4

RTTR	6	6	6	6
TRRT	2	2	2	3


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	0.7263
	Geometric Naive Mean	0.644
T	Geometric Marginal Mean	0.6139
	Geometric Naive Mean	0.5748
	Geometric Mean T/R Ratio (%)	84.52	Pass	GMR ∈ [80, 125]
	Degrees of Freedom	11.45
	90% Confidence Interval (%)	[57.29, 124.7]
Variability	CVᵣ (%) \| σ̂ᵣ	32.98 \| 0.3213	OK for RS	CVᵣ ≥ Minimal CVᵣ
	CVₜ (%) \| σ̂ₜ	65.97 \| 0.6011
	Variability Ratio (%)	187.1
ANOVA	Formulation (p-value)	0.4546
	Sequence (p-value)	0.8717
	Period (p-value)	0.6823
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	85.37
	Standard Error (Log Scale)	0.1356
	90% Confidence Interval (%)	[65.59, 111.1]
	Degrees of Freedom	6
	Howe's Approx RSABE Stat (95%)	0.1009	Fail	≤ 0

modified_fully_replicate_data_5 = fully_replicate_data[5:50, :]
pumas_be(modified_fully_replicate_data_5, FDA_HighlyVariable, endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3	4

RTTR	4	4	4	4
TRRT	8	8	7	7


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	0.7028
	Geometric Naive Mean	0.6366
T	Geometric Marginal Mean	0.5618
	Geometric Naive Mean	0.5268
	Geometric Mean T/R Ratio (%)	79.94	Fail	GMR ∈ [80, 125]
	Degrees of Freedom	29.22
	90% Confidence Interval (%)	[65.61, 97.4]
Variability	CVᵣ (%) \| σ̂ᵣ	38.86 \| 0.375	OK for RS	CVᵣ ≥ Minimal CVᵣ
	CVₜ (%) \| σ̂ₜ	49.97 \| 0.4721
	Variability Ratio (%)	125.9
ANOVA	Formulation (p-value)	0.0639
	Sequence (p-value)	0.5747
	Period (p-value)	0.00529
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	79.82
	Standard Error (Log Scale)	0.078
	90% Confidence Interval (%)	[69.19, 92.09]
	Degrees of Freedom	9
	Howe's Approx RSABE Stat (95%)	0.03767	Fail	≤ 0
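The decision logic illustrated by the five subsets above can be sketched as a small function. This is only an illustration of the rules described in this section, not the Pumas implementation: the function name `rsabe_hvd_decision` and its keyword arguments are hypothetical, and the inputs are simply copied from the printed outputs.

```julia
# Hypothetical helper (not part of Pumas): the decision rule of this section,
# applied to quantities read from the pumas_be output tables.
function rsabe_hvd_decision(; cvr, gmr, ci, howe_stat, min_cvr = 30.0)
    if cvr < min_cvr
        # Below the threshold: standard ABE, the 90% CI must lie in [80, 125].
        return (80 <= ci[1] && ci[2] <= 125) ? "Pass (ABE)" : "Fail (ABE)"
    end
    # At or above the threshold: both RSABE criteria must hold.
    point_ok = 80 <= gmr <= 125     # point estimate only, not the CI
    scaled_ok = howe_stat <= 0      # Howe's Approx RSABE Stat (95%)
    return (point_ok && scaled_ok) ? "Pass (RSABE)" : "Fail (RSABE)"
end

# Values copied from the five outputs above (subsets 1 to 5).
results = [
    rsabe_hvd_decision(cvr = 10.57, gmr = 89.96, ci = (62.37, 129.8), howe_stat = 0.1565),
    rsabe_hvd_decision(cvr = 43.86, gmr = 87.0, ci = (66.74, 113.5), howe_stat = -0.005847),
    rsabe_hvd_decision(cvr = 49.72, gmr = 78.83, ci = (69.31, 89.66), howe_stat = -0.04805),
    rsabe_hvd_decision(cvr = 32.98, gmr = 84.52, ci = (57.29, 124.7), howe_stat = 0.1009),
    rsabe_hvd_decision(cvr = 38.86, gmr = 79.94, ci = (65.61, 97.4), howe_stat = 0.03767),
]
foreach(println, results)
```

Only the second subset passes. The third fails on the point estimate alone, the fourth on the reference scaling statistic alone, and the fifth on both.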

A partial replicate example

As you can see, the threshold decision of RSABE for HVD does not depend on the within subject variability of the test product. It depends only on the reference product. This also allows us to use a partial replicate design, in which only the reference is replicated.

partial_replicate_data =
    dataset(joinpath("bioequivalence", "RRT_RTR_TRR", "SLTGSF2020_DS07"));

Here are the rows with the smallest endpoint values and the rows with the highest endpoint values. This is just a means to get some sort of feeling for the data:

simple_table(first(sort(partial_replicate_data, :PK), 10))


id	sequence	period	PK

134	RTR	2	11.9
271	TRR	1	14.8
271	TRR	3	15.4
259	TRR	1	15.9
191	RTR	2	16.8
271	TRR	2	19.3
134	RTR	3	19.7
73	RRT	3	19.8
278	TRR	1	19.9
93	RRT	1	21

simple_table(last(sort(partial_replicate_data, :PK), 10))


id	sequence	period	PK

75	RRT	2	348
159	RTR	3	353
273	TRR	3	377
311	TRR	3	379
111	RRT	2	384
170	RTR	1	387
273	TRR	1	432
359	TRR	2	436
49	RRT	2	450
292	TRR	1	542

Now we can use pumas_be with FDA_HighlyVariable:

pumas_be(partial_replicate_data, FDA_HighlyVariable; endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3

RRT	120	120	120
RTR	120	120	120
TRR	120	120	120


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	99.41
	Geometric Naive Mean	99.41
T	Geometric Marginal Mean	89.05
	Geometric Naive Mean	89.05
	Geometric Mean T/R Ratio (%)	89.58	Pass	GMR ∈ [80, 125]
	Degrees of Freedom	357.7
	90% Confidence Interval (%)	[86.44, 92.83]
Variability	CVᵣ (%) \| σ̂ᵣ	34.23 \| 0.3329	OK for RS	CVᵣ ≥ Minimal CVᵣ
ANOVA	Formulation (p-value)	0
	Sequence (p-value)	0.08425
	Period (p-value)	0.7295
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	89.58
	Standard Error (Log Scale)	0.0216
	90% Confidence Interval (%)	[86.44, 92.83]
	Degrees of Freedom	357
	Howe's Approx RSABE Stat (95%)	-0.06284	Pass	≤ 0

Designs for estimating CVᵣ and not supporting RSABE

Note that there are designs where in principle we could have carried out RSABE since CVᵣ is estimated, but RSABE is not supported since it is not in the FDA (2011) guidance. Here is an example (where we display the first few rows of the data with the lowest AUC):

dual_data = dataset(joinpath("bioequivalence", "RTT_TRR", "PJ2017_4_1"))
simple_table(first(sort(dual_data, :AUC), 10))


id	sequence	period	AUC	Cmax

167	TRR	2	3.31	0.32
124	TRR	1	5.82	0.347
186	RTT	1	6.06	0.31
157	TRR	2	7.16	0.459
142	RTT	2	8.08	0.65
179	RTT	3	9.82	0.71
179	RTT	1	10.1	1.02
157	TRR	3	10.9	0.756
138	RTT	2	11	0.62
101	TRR	3	11.3	0.533

If we tried to use pumas_be with FDA_HighlyVariable on this data, we would get an error:

pumas_be(dual_data, FDA_HighlyVariable) # throws an error!

Still, we may use this design with standard average bioequivalence. To be explicit about this we may pass StandardBioequivalenceCriterion as the second argument. This is in fact the default value of that argument, so there is no need to specify it. Still, here it is as an example:

pumas_be(dual_data, StandardBioequivalenceCriterion)


Observation Counts
Sequence ╲ Period	1	2	3

RTT	46	45	43
TRR	47	47	47


Paradigm: Replicated crossover that does not support reference scaling
Model: Mixed model (unequal variance)
Criteria: Standard ABE
Endpoint: AUC
Formulations: Reference(R), Test(T)

		Results(AUC)	Assessment	Criteria
R	Geometric Marginal Mean	102.8
	Geometric Naive Mean	112
T	Geometric Marginal Mean	100.1
	Geometric Naive Mean	93.96
	Geometric Mean T/R Ratio (%)	97.38
	Degrees of Freedom	148.7
	90% Confidence Interval (%)	[86.86, 109.2]	Pass	CI ⊆ [80, 125]
Variability	CVᵣ (%) \| σ̂ᵣ	42.75 \| 0.4097
	CVₜ (%) \| σ̂ₜ	69.65 \| 0.6289
ANOVA	Formulation (p-value)	0.7013
	Sequence (p-value)	0.2506
	Period (p-value)	0.01209

Observe that with this design the Paradigm is Replicated crossover that does not support reference scaling. Importantly we can estimate the within subject variability both for the reference (CVᵣ is 42.75) and for the test (CVₜ is 69.65). Using such a design in a pilot study can enable us to get preliminary estimates of the variability so that we can use them in a pivotal study down the road, perhaps with RSABE if we choose to apply that approach.

3 The rationale of RSABE (focused on HVD)

The RSABE approach for highly variable drugs is set out in FDA (2011). Further, as mentioned above, the FDA video Schuirmann (2023) is a short and neat additional description of the approach.

The basic hypothesis formulation

Let us go through the basics for the approach, and see the hypothesis formulation of RSABE.

We start with the geometric mean ratio (\(\text{GMR}\)) and the log transformed means, \(\mu_T\) and \(\mu_R\):

\[ \text{GMR} = \frac{\text{GM}_T}{\text{GM}_R} = \frac{e^{\mu_T}}{e^{\mu_R}}. \]

Now we can set \(\Delta = 1.25\) (and for NTID it is \(\Delta = 1.11111\)), and with standard ABE, say that R and T are bioequivalent if:

\[ \underbrace{\frac{1}{\Delta}}_{0.8} < \frac{e^{\mu_T}}{e^{\mu_R}} < \underbrace{\Delta}_{1.25}. \]

Take logarithms:

\[ \underbrace{-\log \Delta}_{-0.22314} < \mu_T - \mu_R < \underbrace{\log \Delta}_{0.22314}. \]

An equivalent inequality (where \(|x|\) is the absolute value of \(x\)) is:

\[ |\mu_T - \mu_R| < \log \Delta. \]

And further an equivalent inequality is:

\[ (\mu_T - \mu_R)^2 < (\log \Delta)^2. \]

This last (squaring based) representation of ABE is useful as it allows us to see how RSABE is a generalization of ABE.

Scaling

Now consider scaling the right hand side by a value \(\kappa\) that depends on the within subject variability. It is this value which influences the effective bounds:

\[ (\mu_T - \mu_R)^2 < (\log \Delta)^2 ~\kappa, \]

If \(\kappa = 1\) there is no change.
If \(\kappa < 1\) there are tighter bounds.
If \(\kappa > 1\) there are wider bounds.

An equivalent inequality is:

\[ |\mu_T - \mu_R| < (\log \Delta) ~\sqrt{\kappa}, \]

or

\[ -(\log \Delta) ~\sqrt{\kappa} ~<~ \mu_T - \mu_R ~<~ (\log \Delta) ~\sqrt{\kappa}. \]

To get \(\kappa\), denote \(\sigma^2_{WR}\) as the within subject variability of the reference product, represented as a variance of the log transformed endpoint. Also denote \(\sigma_{W_0} = 0.25\) as a comparison constant. We now set,

\[ \kappa = \frac{\sigma^2_{WR}}{\sigma^2_{W_0}}. \]

The implied limits are then:

\[ |\mu_T - \mu_R| ~<~ (\log \Delta) \frac{\sigma_{WR}}{\sigma_{W_0}}. \]

Observe that as \(\sigma_{WR}\) increases the limits get wider (HVD), and as \(\sigma_{WR}\) decreases the limits get narrower (NTID).
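To get a feeling for the implied limits, here is a minimal sketch that evaluates them on the ratio scale for a few values of \(\sigma_{WR}\), using the HVD constants \(\Delta = 1.25\) and \(\sigma_{W_0} = 0.25\). The function name `implied_limits` is hypothetical, and the sketch evaluates only the formula above; it ignores the threshold rule and the point estimate constraint described earlier.

```julia
# Implied limits on the ratio scale for a given within subject standard
# deviation σ_WR of the reference (hypothetical helper name).
function implied_limits(σ_WR; Δ = 1.25, σ_W₀ = 0.25)
    half_width = log(Δ) * σ_WR / σ_W₀   # (log Δ) σ_WR / σ_W₀ on the log scale
    return (exp(-half_width), exp(half_width))
end

# σ_WR = 0.25 recovers the standard limits; the other two values are the
# threshold (0.294) and the σ̂ᵣ of the second subset (0.4195).
for σ_WR in (0.25, 0.294, 0.4195)
    lower, upper = implied_limits(σ_WR)
    println("σ_WR = ", σ_WR, ": [", round(100 * lower, digits = 2), ", ",
        round(100 * upper, digits = 2), "]")
end
```

At \(\sigma_{WR} = \sigma_{W_0}\) the scaling factor is one and the limits are exactly 80-125; larger values of \(\sigma_{WR}\) widen them.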

Note that if we square both sides and rearrange, we get:

\[ \frac{(\mu_T - \mu_R)^2}{\sigma_{WR}^2} ~<~ \frac{(\log \Delta)^2}{\sigma_{W_0}^2}. \]

The regulatory constant, linearized criterion, and hypothesis formulation

We can define the regulatory constant \(\theta\) as:

\[ \theta = \frac{(\log \Delta)^2}{\sigma_{W_0}^2}. \]

Thus the inequality for BE is:

\[ \frac{(\mu_T - \mu_R)^2}{\sigma_{WR}^2} < \theta. \]

Or in one last form, after rearranging, we have what is called the linearized criterion:

\[ (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} < 0. \]

It is really this linearized criterion which we consider in an hypothesis test. Hence finally here is the hypothesis test for reference scaled average bioequivalence:

\[ {\cal T}_{\text{RSABE}} = \begin{cases} H_{0}:& (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} \ge 0, \\[5pt] H_{1}:& (\mu_T - \mu_R)^2 - \theta \sigma^2_{WR} < 0.\\ \end{cases} \]

Here, as with a TOST, \(H_{1}\) corresponds to bioequivalence.

What is \(\theta\) for HVD? Using the constants from the draft guidance FDA (2011):

Δ = 1.25
σ_W₀ = 0.25
θ = (log(Δ) / σ_W₀)^2

0.7966887118898779

What is \(\theta\) for NTID? Using the constants from the draft guidance FDA (2012):

Δ = 1 / 0.9 #1.11111
σ_W₀ = 0.1
θ = (log(Δ) / σ_W₀)^2

1.1100838259683068
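To connect the linearized criterion with the outputs of Section 2, the sketch below evaluates it with point estimates plugged in, using the values printed for modified_fully_replicate_data_4 (a reference scaling GMR of 85.37% and σ̂ᵣ of 0.3213). The function name `linearized_plugin` is hypothetical. This plug-in value is not the test statistic; it is shown only to illustrate why an upper confidence bound is needed.

```julia
θ_hvd = (log(1.25) / 0.25)^2   # regulatory constant for HVD, as computed above

# Plug-in value of the linearized criterion (hypothetical helper name):
# point estimates only, with no allowance for estimation error.
linearized_plugin(gmr, σ_WR, θ) = log(gmr)^2 - θ * σ_WR^2

# Values read from the output for modified_fully_replicate_data_4.
plugin_value = linearized_plugin(0.8537, 0.3213, θ_hvd)
println(round(plugin_value, digits = 4))
```

The plug-in value is negative, yet the reported Howe's Approx RSABE Stat (95%) for that subset is 0.1009 and the assessment is Fail. With so few subjects the 95% upper bound sits well above the point value, so the hypothesis test cannot reject \(H_0\).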

4 Implementing the hypothesis test

The hypothesis formulation \({\cal T}_{\text{RSABE}}\) can be implemented in different ways. The RSABE approach suggests a test statistic based on a \(95\%\) upper confidence bound for the linearized criterion. That statistic is compared to the threshold level \(0\). If it is not greater than \(0\) we reject \(H_0\) and conclude bioequivalence; otherwise we do not.

We use Howe’s Approximation I (Howe (1974)) and a few other subtle computations suggested in FDA (2011).

SAS code from the FDA guidance

Here are snippets of SAS code suggested in that guidance:

x=estimate**2-stderr**2;
boundx=(max(abs(lower),abs(upper)))**2;
theta=((log(1.25))/0.25)**2;
y=-theta*s2wr;
boundy=y*dfd/cinv(0.95,dfd);
sWR=sqrt(s2wr);
critbound=(x+y)+sqrt(((boundx-x)**2)+((boundy-y)**2));

Here the key quantities are:

estimate is the log-GMR based on the model used for within subject variability estimation.
stderr is the standard error of that parameter from that model.
lower and upper are 90% confidence bounds from that model. These are used within Howe’s approximation.
s2wr is the estimate of \(\sigma^2_{WR}\) (the variance) from that model.
dfd are the degrees of freedom of that model.

Further in SAS:

** is exponentiation, so **2 is squaring.
abs, max, and sqrt are functions as you may expect.
cinv is the quantile of a Chi-square distribution.

This computation then yields critbound which in Pumas we call the Howe's Approx RSABE Stat (95%).

A compatible Pumas/Julia computation

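Here, for example, is how the computation is carried out inside Pumas once we are given β (similar to SAS’s estimate), lower_bound (similar to SAS’s lower), upper_bound (similar to SAS’s upper), and se (similar to SAS’s stderr). In addition, 𝜃 is the regulatory constant, σwᵣ is the estimate of \(\sigma_{WR}\), k plays the role of SAS’s dfd, and level_y corresponds to the 0.95 in SAS’s cinv call: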

x = β^2 - se^2
boundx = (max((abs(lower_bound)), (abs(upper_bound))))^2
y = -𝜃 * σwᵣ^2
boundy = y * k / quantile(Chisq(k), level_y)
howe_stat = x + y + √(((boundx - x)^2) + ((boundy - y)^2))

We can now return to the last output to understand the related quantities under the Reference Scaling Analysis section:

pumas_be(partial_replicate_data, FDA_HighlyVariable; endpoint = :PK)


Observation Counts
Sequence ╲ Period	1	2	3

RRT	120	120	120
RTR	120	120	120
TRR	120	120	120


Paradigm: Replicated crossover that supports reference scaling
Model: Mixed model (unequal variance)
Criteria: FDA RSABE for HV
Endpoint: PK
Formulations: Reference(R), Test(T)

		Results(PK)	Assessment	Criteria
R	Geometric Marginal Mean	99.41
	Geometric Naive Mean	99.41
T	Geometric Marginal Mean	89.05
	Geometric Naive Mean	89.05
	Geometric Mean T/R Ratio (%)	89.58	Pass	GMR ∈ [80, 125]
	Degrees of Freedom	357.7
	90% Confidence Interval (%)	[86.44, 92.83]
Variability	CVᵣ (%) \| σ̂ᵣ	34.23 \| 0.3329	OK for RS	CVᵣ ≥ Minimal CVᵣ
ANOVA	Formulation (p-value)	0
	Sequence (p-value)	0.08425
	Period (p-value)	0.7295
Reference Scaling Params	Reference Scaling Constant	0.7967
	Minimal CVᵣ for Reference Scaling (%)	30.0 \| 0.294
Reference Scaling Analysis	Geometric Mean T/R Ratio (%)	89.58
	Standard Error (Log Scale)	0.0216
	90% Confidence Interval (%)	[86.44, 92.83]
	Degrees of Freedom	357
	Howe's Approx RSABE Stat (95%)	-0.06284	Pass	≤ 0

You may now work out these computations to verify Howe's Approx RSABE Stat (95%).
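One way to carry out this check is sketched below. It assumes the Distributions package is available for the Chi-square quantile, and it takes its inputs from the rounded values printed in the output above, so the result should agree with the reported statistic only approximately.

```julia
using Distributions  # provides Chisq and quantile

# Values read (rounded) from the output above.
β = log(0.8958)             # log of the T/R ratio of 89.58%
se = 0.0216                 # standard error on the log scale
lower_bound = log(0.8644)   # 90% confidence bounds, moved to the log scale
upper_bound = log(0.9283)
σwᵣ = 0.3329                # σ̂ᵣ from the Variability section
k = 357                     # degrees of freedom of the reference scaling analysis
θ = (log(1.25) / 0.25)^2    # regulatory constant for HVD
level_y = 0.95              # assumed to match the 0.95 in SAS's cinv call

# The same steps as in the SAS and Pumas snippets above.
x = β^2 - se^2
boundx = max(abs(lower_bound), abs(upper_bound))^2
y = -θ * σwᵣ^2
boundy = y * k / quantile(Chisq(k), level_y)
howe_stat = x + y + √((boundx - x)^2 + (boundy - y)^2)

println(round(howe_stat, digits = 4))
```

The result should land close to the reported -0.06284, with small differences due to rounding of the inputs. It is negative, in line with the Pass assessment.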

5 Conclusion

In this unit, we introduced the concepts and operational details of reference scaled average bioequivalence (RSABE), focusing on its application for highly variable drug products (HVDs) according to FDA guidance. We reviewed the motivations for reference scaling, emphasizing that acceptance criteria for bioequivalence can be adjusted based on the within-subject variability of the reference product, rather than fixed limits alone. Through practical examples using replicate study designs and the Pumas bioequivalence tools, we demonstrated how to determine when RSABE should be applied versus standard average bioequivalence, and clarified the two main criteria for RSABE: the point estimate of the geometric mean ratio (GMR) and the specialized reference scaling test statistic. We also explored the mathematical rationale behind RSABE, including its scaling mechanism that widens or narrows BE limits based on variability, and we discussed the hypothesis-testing approach that underpins regulatory decision-making. This unit thus provides a comprehensive foundation for understanding and implementing RSABE in the context of HVDs, setting the stage for the subsequent unit on reference scaling for narrow therapeutic index drugs (NTIDs).

6 Unit exercises

Core Concepts of Reference Scaling
1. In your own words, explain why reference scaling (RSABE) is used for highly variable drugs.
2. What is the key statistic from the reference product that determines whether RSABE can be applied?
3. How do the acceptance limits change for HVDs under RSABE as compared to standard average bioequivalence (ABE)?

RSABE Decision Scenarios

Suppose you are analyzing a fully replicate dataset for an HVD using the FDA RSABE approach:
- In Dataset X, the estimated within-subject CV for the reference product (CVᵣ) is 28%.
- In Dataset Y, CVᵣ is 42%.

For each dataset, answer:
1. Should RSABE be applied, or should standard ABE criteria be used?
2. Briefly describe the criteria that must be met for the trial to pass bioequivalence under each scenario.

Interpreting Pumas Output

Given the following snippet of results from a Pumas FDA_HighlyVariable RSABE analysis (focus on the Assessment column):

Criteria type	Statistic name	Value	Criteria	Assessment
Point Estimate	GMR	90.0	∈ [80, 125]	Pass
Reference Scaling Stat	Howe’s Approx RSABE Stat	-0.23	≤ 0	Pass
Overall	-	-	Both above pass	Pass

What does the Point Estimate criterion require, and was it met?
What is the meaning of the Reference Scaling Stat, and was it met in this example?
Why do both criteria need to be individually passed for the study to be considered bioequivalent via RSABE?

Mathematical Rationale
1. Write the linearized criterion for RSABE as described in the unit, naming all variables.
2. For an HVD, what is the purpose of the regulatory constant \(\theta\) and how is it calculated? (Give the formula in terms of \(\Delta\) and \(\sigma_{W0}\).)
3. Explain how changing the estimate of \(\sigma_{WR}\) affects the implied bioequivalence limits.

Study Design and Regulatory Guidance
1. List the replicate study designs that are eligible for the FDA’s RSABE approach for HVDs, based on this unit.
2. Why do dual-sequence full replicate or partial replicate designs allow the estimation necessary for RSABE, while some other designs do not?
3. Briefly explain what you might do with a study dataset that does not support RSABE but still provides within-subject variability estimates for reference and test products.
