A Likelihood Ratio Approach for Detecting Behavioral Changes in Device Usage Over Time

Longjohn, Rachel
Smyth, Padhraic
Copyright 2023, The Authors
Center for Statistics and Applications in Forensic Evidence
This work focuses on the situation in which investigators have obtained as evidence logs of user-generated activities on a device, such as sending text messages or emails, opening or interacting with mobile apps, or making calls from particular locations. Quantitative methodologies for analyzing this kind of behavioral data from devices could be useful to investigators in a number of situations. For example, if a device is suspected not to have been with its owner during a time period of forensic interest, one could analyze the pattern of events on the device to assess whether they are consistent with the device owner’s behavior, or whether there is evidence of a change in behavior. Inconsistency could, for example, indicate that another person was using the device during this time. A time at which the pattern of events on the device changes is referred to as a changepoint. For this analysis, two different source hypotheses are considered for a given set of user-generated event data: the same-source hypothesis and the different-source hypothesis. The same-source hypothesis assumes that all of the events in the evidence were generated by a single source. Alternatively, the different-source hypothesis posits that the data were generated by two different sources, i.e., a changepoint occurred at some point during the time period over which the device’s event data were obtained. The strength of the evidence in support of these hypotheses is reported through a likelihood ratio, a statistical measure that quantifies the weight of the evidence and has been used in a variety of forensic applications. To arrive at a likelihood ratio, the data are modeled using a Bayesian statistical framework, in which the sequence of events generated on the device is the observed data and the underlying model parameters and the potential time of the changepoint are considered unobserved.
It is shown that the proposed model leads to a straightforward formula for the likelihood ratio. This formula is flexible in that it can incorporate pre-existing knowledge about where a changepoint may have taken place, e.g., investigators may suspect a changepoint in a particular time window or believe that a changepoint is more probable in one time window than in another. This work generalizes prior work to the practical situation in which the time of change (for the different-source hypothesis) is unknown. The potential usefulness of the proposed method is evaluated through experiments on both simulated data and real-world datasets that are relevant to digital forensics.
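To make the structure of such a likelihood ratio concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the event model, so everything below is an assumption for illustration: events are treated as categorical types drawn independently given the source, each source has a symmetric Dirichlet prior, the two segments under the different-source hypothesis are independent, and the ratio is written as P(data | same source) / P(data | different source). The function names `log_marginal` and `log_likelihood_ratio` and the parameters `alpha` and `tau_prior` are hypothetical. The unknown changepoint is handled by summing over candidate positions weighted by a prior, which is where knowledge about likely time windows could enter.

```python
import math


def log_marginal(counts, alphas):
    """Log marginal likelihood of an ordered categorical sequence with the
    given per-type counts, under a Dirichlet(alphas) prior (assumed model)."""
    n = sum(counts)
    a0 = sum(alphas)
    out = math.lgamma(a0) - math.lgamma(a0 + n)
    for c, a in zip(counts, alphas):
        out += math.lgamma(a + c) - math.lgamma(a)
    return out


def log_likelihood_ratio(events, num_types, alpha=1.0, tau_prior=None):
    """Log of P(events | same source) / P(events | different sources).

    events    : list of integer event types in range(num_types), in time order
    tau_prior : prior over changepoint positions tau = 1..n-1 (the change
                occurs after the tau-th event); uniform if None (assumption)
    """
    n = len(events)
    if n < 2:
        raise ValueError("need at least two events")
    if tau_prior is None:
        tau_prior = [1.0 / (n - 1)] * (n - 1)
    if len(tau_prior) != n - 1 or abs(sum(tau_prior) - 1.0) > 1e-9:
        raise ValueError("tau_prior must have n-1 entries summing to 1")

    alphas = [alpha] * num_types
    total = [0] * num_types
    for e in events:
        total[e] += 1

    # Same-source hypothesis: one set of parameters generates every event.
    log_same = log_marginal(total, alphas)

    # Different-source hypothesis: marginalize over the unknown changepoint.
    left = [0] * num_types
    right = list(total)
    terms = []
    for tau, p in zip(range(1, n), tau_prior):
        e = events[tau - 1]
        left[e] += 1   # event tau moves into the first segment
        right[e] -= 1
        if p > 0:
            terms.append(math.log(p)
                         + log_marginal(left, alphas)
                         + log_marginal(right, alphas))

    # Log-sum-exp for numerical stability.
    m = max(terms)
    log_diff = m + math.log(sum(math.exp(t - m) for t in terms))
    return log_same - log_diff
```

Under these assumptions, a log-ratio above zero favors the same-source hypothesis and a value below zero favors the different-source hypothesis. For example, `log_likelihood_ratio([0] * 20 + [1] * 20, num_types=2)` is negative, since the sequence switches abruptly from one event type to another. Passing a non-uniform `tau_prior` concentrates the different-source hypothesis on the time window investigators consider most plausible.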
The following poster was presented at the 75th Anniversary Conference of the American Academy of Forensic Sciences, Orlando, Florida, February 13-18, 2023. Posted with permission of CSAFE.