An equivalence test is a method of hypothesis testing that is a variation of the more commonly used method of significance testing. In significance testing, as in the familiar two-sample t test, the idea is to test a null hypothesis that two means are equal. Rejecting the null hypothesis leads to the conclusion that the population means are significantly different from each other. Equivalence testing reverses these roles: it tests a null hypothesis that the two means are not equal, in the sense that they differ by at least some specified amount. Rejection of the null hypothesis in an equivalence test leads to the conclusion that the population means are equivalent.

Equivalence testing originated in the fields of biostatistics and pharmacology, where one often wishes to show that two means are “equivalent” within a certain bound. Researchers often incorrectly conclude that the failure to reject the null hypothesis in a standard hypothesis test (such as a t test) is “proof” that the null hypothesis is true and hence that the populations are “equivalent.” This erroneous inference neglects the possibility that the failure to reject the null may simply reflect a Type II error, particularly when the sample sizes are small and the power is low.

We will consider a common equivalence test known as the two one-sided tests procedure, or TOST. It is a variation of the standard independent-samples t test. With a TOST, the researcher will conclude that the two population means are equivalent if it can be shown that they differ by less than some constant τ, the equivalence bound, in both directions. This bound is often chosen to be the smallest difference between the means that is practically significant. Biostatisticians often have the choice for τ made for them by government regulation.

The null hypothesis for a TOST is H0:|μ1−μ2| ≥ τ. The alternative hypothesis is H1:|μ1−μ2| < τ.
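
The name "two one-sided tests" follows from splitting this composite null hypothesis into two one-sided pieces. The following is a sketch of that decomposition using the symbols above; the labels H01, H11, H02, and H12 are introduced here only for illustration and do not appear in the original entry.

```latex
\begin{align*}
H_{01}&: \mu_1 - \mu_2 \le -\tau
  &&\text{versus}& H_{11}&: \mu_1 - \mu_2 > -\tau \\
H_{02}&: \mu_1 - \mu_2 \ge +\tau
  &&\text{versus}& H_{12}&: \mu_1 - \mu_2 < +\tau
\end{align*}
```

Only when both one-sided nulls are rejected does the difference lie strictly inside the interval (−τ, τ), which is the alternative H1.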

The first one-sided test seeks to reject the null hypothesis that the difference between the two means is less than or equal to −τ. To do so, compute the test statistic

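$$t_1 = \frac{(\bar{x}_1 - \bar{x}_2) + \tau}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where x̄1 and x̄2 are the sample means, n1 and n2 are the sample sizes, and sp is the pooled standard deviation of the two samples. Then, compute the p value as p1 = P(t1 < tν), the upper-tail probability beyond t1, where tν has a t distribution with ν = n1 + n2 − 2 degrees of freedom.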

Similarly, the second one-sided test seeks to reject the null hypothesis that the difference between the two means is greater than or equal to +τ. To do so, compute the test statistic

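$$t_2 = \frac{(\bar{x}_1 - \bar{x}_2) - \tau}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Compute the p value as p2 = P(t2 > tν), the lower-tail probability below t2. Then let p = max(p1, p2) and reject the null hypothesis of nonequivalence if p < α.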
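
The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `t_cdf` and `tost`, the sample data, and the choice τ = 0.5 are all hypothetical. To stay free of dependencies, the t-distribution CDF is approximated here by Simpson's rule; in practice one would more likely call a library routine such as `scipy.stats.t.cdf`.

```python
import math


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t with df degrees of freedom.

    Integrates the t density from 0 to t with Simpson's rule
    (steps must be even) and adds the result to 0.5.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return min(1.0, max(0.0, 0.5 + total * h / 3))


def tost(x1, x2, tau, alpha=0.05):
    """Two one-sided tests for equivalence of two independent means."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    df = n1 + n2 - 2
    sp = math.sqrt((ss1 + ss2) / df)         # pooled standard deviation
    se = sp * math.sqrt(1 / n1 + 1 / n2)     # standard error of the difference
    t1 = (m1 - m2 + tau) / se                # tests H0: mu1 - mu2 <= -tau
    t2 = (m1 - m2 - tau) / se                # tests H0: mu1 - mu2 >= +tau
    p1 = 1.0 - t_cdf(t1, df)                 # upper-tail probability
    p2 = t_cdf(t2, df)                       # lower-tail probability
    p = max(p1, p2)
    return {"t1": t1, "t2": t2, "p1": p1, "p2": p2, "p": p,
            "equivalent": p < alpha}


# Hypothetical data: two small samples and an equivalence bound of 0.5.
group_a = [10.1, 9.9, 10.0, 10.2, 9.8]
group_b = [10.0, 10.1, 9.9, 10.0, 10.0]
result = tost(group_a, group_b, tau=0.5)
print(result["p"], result["equivalent"])
```

Taking the larger of the two p values means that equivalence is declared only when both one-sided tests reject at level α.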

Establishing equivalence between two treatments or groups has applications not just in biostatistical and pharmacological settings but also in many situations in the social sciences. Many hypotheses currently tested and interpreted with standard significance testing should be approached with equivalence testing.

Christopher J. Mecklin
