An exploration of decision consistency indices for one form tests

Hagen, Randi
Two studies compared the Huynh (1976) and Subkoviak (1976) procedures for estimating two indices of classification consistency, rho and kappa, for use when only one form of a test is administered. Rho indicates the degree of consistency, and kappa indicates the degree of consistency beyond chance. Both studies manipulated test length (25, 50, 75 items), distribution shape (normal and two left-skewed), and passing standard (70%, 80%, 90% of items correct). Study I used simulated data; Study II used data from actual test administrations.

Both the Huynh and Subkoviak procedures yielded rho estimates that were similar in magnitude and acceptably high as reliability coefficients. Subkoviak kappa estimates were consistently higher than Huynh kappa estimates. No differences in the behavior of the coefficients were seen between Studies I and II. Test length affected all estimates, which increased in magnitude as test length increased. Magnitudes of rho coefficients decreased as skewness increased; kappa was at its lowest with less skewed distributions, but no consistent pattern of magnitude and skewness interaction was apparent. The effect of distribution shape was a function of the proximity between standard and distribution mode: rho was at its lowest value and kappa at its highest value when the standard neared the mode.

Difficulties with interpretation of kappa coefficients are discussed. Suggestions regarding the practical use of and future research directions for one-form rho and kappa estimates are made.

References

Huynh, H. On the reliability of decisions in domain-referenced testing. Journal of Educational Measurement, 1976, 13, 253-264.
Subkoviak, M. J. Estimating reliability from a single administration of a criterion-referenced test. Journal of Educational Measurement, 1976, 13, 265-276.