A Novel, Data-Driven Approach to the Classification of Bloodstain Patterns
Bloodstain pattern analysis is the study of bloodstains at a crime scene with the purpose of drawing inferences about the crime. A typical objective is to identify the mechanism that produced the pattern. Examples of blood-producing mechanisms or events include blunt impact, cast-off, and gunshot. Analysts frequently rely on measurements of the size, direction, shape, and spatial distribution of the stains to classify the patterns. In recent years, bloodstain pattern analysis and other forensic disciplines have drawn considerable scrutiny. A comprehensive 2009 National Academy of Sciences report criticized the opinions of bloodstain pattern experts as more subjective than scientific, and also questioned the reliability of the traditional methods used for analyses [1].
One result of this scrutiny is growing interest in the forensic science research community in building models to classify bloodstain patterns in a way that addresses these concerns. Recent studies have achieved notable classification accuracy by using machine learning models based on researcher-defined features of the stains and patterns [2,3]. However, these methods have two limitations: 1) they rely on an ellipse representation of each bloodstain and thus may not work well for all categories of bloodstain patterns; 2) although pattern-level features can be defined that differentiate certain categories of bloodstain pattern, there is no guarantee that researcher-defined features are sufficient for all possible categories. In this presentation we report preliminary results on a novel method that addresses both limitations.
First, as noted above, the traditional ellipse representation requires that each stain be approximated well by a single ellipse and omits stains that are not elliptical, so it may fail to characterize some types of patterns. We address this concern by generalizing the representation: an ellipse segmentation algorithm allows each stain to be approximated by multiple ellipses. Consequently, overlapping stains and stains with irregular shapes can be included in the analysis.
The second limitation is the reliance on expert-defined pattern-level features that are believed to serve as a suitable summary of the distribution of the ellipses in the pattern. A concern is that some important information about that distribution is lost when relying only on these features. Instead of designing features based on prior knowledge, we apply a flexible approach that summarizes the information in the pattern, i.e., the distribution of the stains, using a statistical concept known as the characteristic function. Then, by defining a distance metric between any two characteristic functions, we are able to apply flexible machine-learning classification methods (e.g., support vector machines) to bloodstain patterns. Because a characteristic function uniquely determines its distribution, this approach passes complete information about the distribution of stains in a pattern to the classifier without defining any explicit summary features. To illustrate the potential of our method, we conducted a pilot experiment with two sets of bloodstain patterns collected from different laboratory apparatuses. In preliminary results, our method outperformed previous methods in classifying bloodstain patterns produced by different mechanisms.
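The idea of comparing patterns through their characteristic functions can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional ellipse features, the Gaussian sampling of frequencies, the root-mean-square distance, and the names `empirical_cf` and `cf_distance` are all assumptions made for illustration, and the data are synthetic stand-ins rather than bloodstain measurements.

```python
import numpy as np

def empirical_cf(points, freqs):
    """Empirical characteristic function of a point cloud.

    points: (n, d) array, one row per fitted ellipse (hypothetical features).
    freqs:  (m, d) array of frequency vectors t.
    Returns a complex (m,) array with entries mean_j exp(i * <t, x_j>).
    """
    phase = freqs @ points.T                 # (m, n) inner products <t, x_j>
    return np.exp(1j * phase).mean(axis=1)

def cf_distance(a, b, freqs):
    """Root-mean-square gap between two empirical characteristic functions,
    averaged over the sampled frequencies (an assumed choice of metric)."""
    diff = empirical_cf(a, freqs) - empirical_cf(b, freqs)
    return float(np.sqrt(np.mean(np.abs(diff) ** 2)))

rng = np.random.default_rng(0)

# Frequencies drawn from a standard normal act as a Gaussian weight function.
freqs = rng.normal(size=(256, 2))

# Synthetic "patterns": two from a tightly spread source, one widely spread.
tight_1 = rng.normal(scale=0.5, size=(200, 2))
tight_2 = rng.normal(scale=0.5, size=(200, 2))
wide = rng.normal(scale=2.0, size=(200, 2))

d_same = cf_distance(tight_1, tight_2, freqs)   # same generating mechanism
d_diff = cf_distance(tight_1, wide, freqs)      # different mechanisms
print(f"same mechanism: {d_same:.3f}, different mechanisms: {d_diff:.3f}")
```

Patterns drawn from the same distribution should give a smaller distance than patterns drawn from different ones. One way to pass such distances to a kernel classifier, again as an assumption, is to convert a pairwise distance matrix into a kernel such as exp(-gamma * d^2) and supply it to a support vector machine that accepts a precomputed kernel (for example, scikit-learn's `SVC(kernel="precomputed")`).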
Posted with permission of CSAFE.