## A Primer on the Use of Equivalence Testing for Evaluating Measurement Agreement

Authors: Dixon, Philip; Saint-Maurice, Pedro; Kim, Youngwon; Hibbing, Paul; Bai, Yang; Welk, Gregory

Departments: Statistics; Kinesiology

Date issued: 2018-04-01 (copyright 2017)

##### Abstract

Purpose: Statistical equivalence testing is more appropriate than conventional tests of difference for assessing the validity of physical activity (PA) measures. This article presents the underlying principles of equivalence testing and gives three examples from PA and fitness assessment research.

Methods: The three examples illustrate different uses of equivalence tests. Example 1 uses PA data to evaluate an activity monitor’s equivalence to a known criterion. Example 2 illustrates the equivalence of two field-based measures of physical fitness with no known reference method. Example 3 uses regression to evaluate an activity monitor’s equivalence across a suite of 23 activities.

Results: The examples illustrate the appropriate reporting and interpretation of results from equivalence tests. In the first example, the mean criterion measure is significantly within ±15% of the mean PA monitor measure. The mean difference is 0.18 METs, and the 90% confidence interval of −0.15 to 0.52 lies inside the equivalence region of −0.65 to 0.65. In the second example, equivalence of the two measures was defined as a ratio of mean values between 0.98 and 1.02. The estimated ratio of mean V̇O₂ values is 0.99, which is significantly (P = 0.007) inside the equivalence region. In the third example, the PA monitor is not equivalent to the criterion across the suite of activities. The estimated regression intercept and slope are −1.23 and 1.06, and neither confidence interval lies within its suggested regression equivalence region.
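The decision rule behind the first example can be sketched in a few lines of Python: equivalence is concluded when the whole 90% confidence interval for the mean difference falls inside the equivalence region. The function name `ci_within_region` is hypothetical and chosen for illustration. The numbers are the ones reported above. The "implied mean" is only a back-calculation from the ±15% margin and is not a value stated in the abstract.

```python
def ci_within_region(ci_low, ci_high, region_low, region_high):
    """Return True when the confidence interval lies strictly inside the region."""
    return region_low < ci_low and ci_high < region_high


# Values reported for Example 1 (METs).
ci_90 = (-0.15, 0.52)
region = (-0.65, 0.65)

example1_equivalent = ci_within_region(ci_90[0], ci_90[1], region[0], region[1])
print("Example 1 equivalent:", example1_equivalent)

# Back-calculation (an assumption, not reported in the abstract):
# if +/-0.65 METs corresponds to +/-15% of a mean, that mean is about 0.65 / 0.15.
implied_mean = 0.65 / 0.15
print("Implied reference mean (METs):", round(implied_mean, 2))
```

A 90% interval is paired with a 5% significance level here because the equivalence test consists of two one-sided tests, each run at 5%.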

Conclusions: When the study goal is to show similarity between methods, equivalence testing is more appropriate than traditional statistical tests of differences (e.g., ANOVA and t-tests).
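To make the contrast with a conventional test of difference concrete, the following is a minimal sketch of the two one-sided tests (TOST) procedure for paired differences. It is not the authors' code. It substitutes a normal approximation for the t distribution so that it runs on the Python standard library alone, which makes it slightly too liberal for small samples. The function name `tost_paired` and the list `diffs` are hypothetical, and the data are made up for illustration rather than taken from the article.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_paired(diffs, low, high, alpha=0.05):
    """Two one-sided tests for paired differences (normal approximation).

    H0: the true mean difference is <= low or >= high (not equivalent).
    H1: low < true mean difference < high (equivalent).
    """
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)
    z = NormalDist()

    # One-sided p-value against the lower bound (H0: mean difference <= low).
    p_lower = 1 - z.cdf((d_bar - low) / se)
    # One-sided p-value against the upper bound (H0: mean difference >= high).
    p_upper = z.cdf((d_bar - high) / se)
    p_value = max(p_lower, p_upper)

    # The matching (1 - 2 * alpha) confidence interval, i.e. 90% for alpha = 0.05.
    crit = z.inv_cdf(1 - alpha)
    ci = (d_bar - crit * se, d_bar + crit * se)

    return {"mean_diff": d_bar, "ci": ci, "p_value": p_value,
            "equivalent": p_value < alpha}


# Hypothetical monitor-minus-criterion differences in METs (illustrative only).
diffs = [0.3, -0.2, 0.5, 0.1, -0.4, 0.6, 0.2, -0.1, 0.4, 0.0, 0.3, -0.3]

result = tost_paired(diffs, low=-0.65, high=0.65)
print(result)
```

The TOST p-value falls below `alpha` exactly when the (1 − 2·`alpha`) confidence interval lies inside the equivalence region, so the two ways of reporting the result always agree.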

This article is published as Dixon, P.M., Saint-Maurice, P.F., Kim, Y., Hibbing, P., Bai, Y. and Welk, G.J. 2018. A primer on the use of equivalence testing for evaluating measurement agreement. Medicine & Science in Sports & Exercise 50:837-845. doi: 10.1249/MSS.0000000000001481.

Repository URI: https://dr.lib.iastate.edu/handle/20.500.12876/90466

Language: en

Type: article

Disciplines: Exercise Science; Kinesiology; Statistical Methodology

Keywords: calibration; validation; criterion validity; convergent validity