Determining an inter-rater agreement metric for researchers evaluating student pathways in problem solving
A new inter-rater agreement metric (IRRA) was developed to measure agreement among multiple research coders who code activities while observing student problem-solving sessions. The student data are complex: the coded activities 1) depend on prior activities, 2) are ordinal data, 3) can occur at any point in time, and 4) can recur. The assumptions underlying traditional inter-rater agreement metrics are violated in this context and may lead to erroneous conclusions for particular datasets. In this study, coded activities are treated as a sequence of codes that can be analyzed with a string-matching algorithm. We evaluated the metric's performance by simulating coder variability in a controlled fashion. The results show that the algorithm performed well as an inter-rater agreement metric over a wide range of conditions.
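To illustrate the sequence view described above, the following minimal sketch compares two coders' activity sequences with a string-matching similarity from Python's standard library. This is an assumption-laden illustration, not the paper's own algorithm: the activity codes and the use of `difflib.SequenceMatcher` are hypothetical choices made here for demonstration.

```python
from difflib import SequenceMatcher

def sequence_agreement(coder_a, coder_b):
    """Similarity between two coders' activity sequences, computed as a
    string-matching ratio (1.0 = identical sequences, 0.0 = no overlap).
    Illustrative only; not the IRRA metric from the study."""
    return SequenceMatcher(None, coder_a, coder_b).ratio()

# Hypothetical activity codes from two coders observing the same session
coder_a = ["read", "plan", "compute", "check", "compute"]
coder_b = ["read", "compute", "check", "compute"]

print(sequence_agreement(coder_a, coder_b))
```

A ratio-based score like this respects order and repetition (properties 1 and 4 above), which is the motivation for treating coded observations as sequences rather than as independent categorical judgments.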