Quantifying Bayes Factors for Forensic Handwriting Evidence

Ray, Anyesha
Ommen, Danica
Copyright 2023, The Authors
Organizational Units
Organizational Unit
Center for Statistics and Applications in Forensic Evidence
The Center for Statistics and Applications in Forensic Evidence (CSAFE) carries out research on the scientific foundations of forensic methods, develops novel statistical methods, and transfers knowledge and technological innovations to the forensic science community. We collaborate with more than 80 researchers across six universities to support our forensic community partners with accessible tools, open-source databases, and educational opportunities.
Organizational Unit

The Department of Statistics seeks to teach students in the theory and methodology of statistics and statistical analysis, preparing its students for entry-level work in business, industry, commerce, government, or academia.

The Department of Statistics was formed in 1948, emerging from the functions performed at the Statistics Laboratory. Originally included in the College of Sciences and Humanities, in 1971 it became co-directed with the College of Agriculture.

Questioned Document Examiners (QDEs) are tasked with analyzing handwriting evidence to make source (or writership) determinations. The Center for Statistics and Applications in Forensic Evidence (CSAFE) has previously developed computational methods to automatically extract quantifiable handwriting features, along with statistical methods to analyze handwriting evidence, to aid QDEs [1-3]. The method developed by Crawford et al. uses a K-means clustering algorithm and a Bayesian hierarchical model to perform closed-set writer identification [2]. That is, a questioned document is assigned to its most likely writer from a set of known writers, with no allowance for the possibility that it was written by someone outside the set. Another method, developed by Johnson and Ommen, uses machine learning techniques and score-based likelihood ratios (SLRs) [3]. SLRs have been criticized for a variety of shortcomings, including a lack of coherence and an inability to incorporate the rarity of the features. Our goal is to develop a method that supports feature-based open-set writer identification while avoiding these issues. We implement an approach to quantify the value of forensic handwriting evidence using Bayes factors and Markov chain Monte Carlo (MCMC) computational techniques like those described in Collins and Ommen [4]. There are two paths to consider, depending on the forensic question: the common-source and the specific-source identification problems.
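To make the idea of a common-source Bayes factor concrete, the following is a minimal sketch and not the model used in the poster. It assumes each document is summarized as a vector of counts over K-means clusters, and it swaps the Bayesian hierarchical model and MCMC for a conjugate Dirichlet-multinomial toy model so that the marginal likelihoods have a closed form. The function names and the prior `alpha` are hypothetical.

```python
from math import lgamma


def log_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_bf_common_source(x1, x2, alpha):
    """Log Bayes factor for a toy common-source problem.

    x1, x2 : cluster-count vectors for two documents (same length as alpha).
    alpha  : Dirichlet prior on a writer's cluster probabilities (assumed).

    Numerator: both documents share one unknown writer.
    Denominator: the documents come from two independent writers.
    The multinomial coefficients cancel in the ratio, so they are omitted.
    """
    pooled = [a + c1 + c2 for a, c1, c2 in zip(alpha, x1, x2)]
    post1 = [a + c for a, c in zip(alpha, x1)]
    post2 = [a + c for a, c in zip(alpha, x2)]
    return log_beta(pooled) + log_beta(alpha) - log_beta(post1) - log_beta(post2)


if __name__ == "__main__":
    prior = [1.0, 1.0, 1.0]  # flat prior over three clusters (illustrative)
    print(log_bf_common_source([10, 0, 0], [10, 0, 0], prior))  # similar profiles
    print(log_bf_common_source([10, 0, 0], [0, 10, 0], prior))  # dissimilar profiles
```

In this toy model, a positive log Bayes factor supports the common-source proposition and a negative one supports the different-source proposition. A specific-source version would replace the prior in the numerator with a posterior informed by known writing from the suspected writer.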
We demonstrate the approach for each identification problem using documents from the CSAFE Handwriting Database, which consists of documents of various lengths from over 240 writers: the London Letter is the longest, followed by an excerpt from the book The Wonderful Wizard of Oz, and the phrase “The early bird may get the worm, but the second mouse gets the cheese” is the shortest [5]. Handwriting features are extracted using the “handwriter” system, clustered using K-means, and then used to compute the Bayes factor. The performance of the methods is assessed using cross-validation and rates of misleading evidence, among other measures.
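As an illustration of the last performance measure, here is a minimal sketch of how rates of misleading evidence could be computed from held-out comparisons with known ground truth. It assumes the usual convention that a Bayes factor below 1 is misleading when the sources truly match and a Bayes factor above 1 is misleading when they do not. The function name and input layout are hypothetical, not taken from the poster.

```python
def rates_of_misleading_evidence(log_bfs, same_source):
    """Return the two rates of misleading evidence.

    log_bfs     : log Bayes factors, one per comparison.
    same_source : booleans, True when the comparison is truly same-source.
    """
    if len(log_bfs) != len(same_source):
        raise ValueError("log_bfs and same_source must have the same length")

    same = [lb for lb, s in zip(log_bfs, same_source) if s]
    diff = [lb for lb, s in zip(log_bfs, same_source) if not s]
    if not same or not diff:
        raise ValueError("need at least one comparison of each kind")

    return {
        # Same-source pairs whose evidence points toward different sources.
        "same_source": sum(lb < 0 for lb in same) / len(same),
        # Different-source pairs whose evidence points toward a common source.
        "different_source": sum(lb > 0 for lb in diff) / len(diff),
    }


if __name__ == "__main__":
    example = rates_of_misleading_evidence(
        [2.0, -0.5, 1.0, -3.0, 0.4],
        [True, True, True, False, False],
    )
    print(example)
```

Under cross-validation, the same function would be applied to the Bayes factors computed on each held-out fold.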
The following poster was presented at the 75th Anniversary Conference of the American Academy of Forensic Sciences, Orlando, Florida, February 13-18, 2023. Posted with permission of CSAFE.