A Study in Reproducibility: The Congruent Matching Cells Algorithm and cmcR Package

Date
2022-12
Authors
Zemmels, Joseph
VanderPlas, Susan
Hofmann, Heike
Publisher
© The R Foundation 2022
Organizational Units
Organizational Unit
Center for Statistics and Applications in Forensic Evidence
The Center for Statistics and Applications in Forensic Evidence (CSAFE) carries out research on the scientific foundations of forensic methods, develops novel statistical methods, and transfers knowledge and technological innovations to the forensic science community. We collaborate with more than 80 researchers across six universities to develop solutions that support our forensic community partners through accessible tools, open-source databases, and educational opportunities.
Organizational Unit
Statistics

The Department of Statistics seeks to teach students the theory and methodology of statistics and statistical analysis, preparing them for entry-level work in business, industry, commerce, government, or academia.

History
The Department of Statistics was formed in 1948, emerging from the functions performed at the Statistics Laboratory. Originally included in the College of Sciences and Humanities, in 1971 it became co-directed with the College of Agriculture.

Dates of Existence
1948-present

Abstract
Scientific research is driven by our ability to use methods, procedures, and materials from previous studies and to further research by building on them. As the need for computationally intensive methods to analyze large amounts of data grows, the criteria needed to achieve reproducibility, specifically computational reproducibility, have become more sophisticated. In general, prose descriptions of algorithms are not detailed or precise enough to ensure complete reproducibility of a method. Results may be sensitive to conditions not commonly specified in written descriptions, such as implicit parameter settings or the programming language used. To achieve true computational reproducibility, it is necessary to provide all intermediate data and code used to produce published results. In this paper, we consider a class of algorithms developed to perform firearm evidence identification on cartridge case evidence, known as the Congruent Matching Cells (CMC) methods. To date, these algorithms have been published as textual descriptions only. We introduce the first open-source implementation of the Congruent Matching Cells methods in the R package cmcR. We have structured the cmcR package as a set of sequential, modularized functions intended to ease the process of parameter experimentation. We use cmcR and a novel variance ratio statistic to explore the CMC methodology and demonstrate how to fill in the gaps when provided with computationally ambiguous descriptions of algorithms.
Comments
This article is published as Zemmels, et al., "The R Journal: A Study in Reproducibility: The Congruent Matching Cells Algorithm and cmcR Package", The R Journal 14 (2023): 79-102. doi:10.32614/RJ-2023-014. Posted with permission of CSAFE.

This article is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0, http://creativecommons.org/licenses/by/4.0/).