Improving the precision of estimates of the frequency of rare events
Abstract
The probability of a rare event is usually estimated directly as the number of times the event occurs divided by the total sample size. Unfortunately, the precision of this estimate is low. For typical sample sizes of N < 100 in ecological studies, the coefficient of variation (cv) of this estimate of the probability of a rare event can exceed 300%. Sample sizes on the order of 10³–10⁴ observations are needed to reduce the cv to below 10%. If it is impractical or impossible to increase the sample size, auxiliary data can be used to improve the precision of the estimate. We describe four approaches for using auxiliary data to improve the precision of estimates of the probability of a rare event: (1) Bayesian analysis that includes prior information about the probability; (2) stratification that incorporates information on the heterogeneity in the population; (3) regression models that account for information correlated with the probability; and (4) inclusion of aggregated data collected at larger spatial or temporal scales. These approaches are illustrated using data on the probability of capture of vespulid wasps by the insectivorous plant Darlingtonia californica. All four methods increase the precision of the estimate relative to the simple frequency-based estimate (absolute precision = 1.26, relative precision [cv] = 70%): stratification (absolute precision = 1.10, cv = 62%); regression models (absolute precision = 1.59, cv = 55%); Bayesian analysis with an informative prior probability distribution (absolute precision = 4.28, cv = 47%); and using temporally aggregated data (absolute precision = 6.75, cv = 36%). When informative auxiliary data are available, we recommend including them when estimating the probability of rare events.
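The sample-size figures in the abstract can be checked with a minimal Python sketch. It assumes simple binomial sampling, under which the cv of the frequency-based estimate k/N is sqrt((1 - p) / (N p)). It also illustrates the first approach with a textbook conjugate Beta prior, which is not necessarily the formulation used in the article. The function names, the event probability, and the prior parameters are hypothetical and chosen only for illustration; none are taken from the article's data.

```python
import math


def cv_of_frequency_estimate(p: float, n: int) -> float:
    """cv of the estimate k/n under binomial sampling (an assumption here):
    sd / mean = sqrt(p * (1 - p) / n) / p = sqrt((1 - p) / (n * p))."""
    return math.sqrt((1.0 - p) / (n * p))


def sample_size_for_cv(p: float, target_cv: float) -> int:
    """Approximate sample size at which the cv falls to target_cv,
    obtained by solving (1 - p) / (n * p) = target_cv**2 for n."""
    return math.ceil((1.0 - p) / (p * target_cv ** 2))


def beta_posterior_cv(k: int, n: int, a: float, b: float) -> float:
    """cv of the posterior Beta(a + k, b + n - k) for the event probability,
    given k events in n trials and a Beta(a, b) prior (conjugate update)."""
    alpha = a + k
    beta = b + (n - k)
    # For Beta(alpha, beta): mean = alpha / (alpha + beta) and
    # var = alpha * beta / ((alpha + beta)**2 * (alpha + beta + 1)),
    # so cv = sqrt(beta / (alpha * (alpha + beta + 1))).
    return math.sqrt(beta / (alpha * (alpha + beta + 1.0)))


if __name__ == "__main__":
    # Hypothetical rare event with true probability 0.001 and N = 100.
    print(f"cv at p=0.001, N=100: {cv_of_frequency_estimate(0.001, 100):.0%}")
    # Hypothetical event with probability 0.01 and a 10% target cv.
    print(f"N for cv=10% at p=0.01: {sample_size_for_cv(0.01, 0.10)}")
    # 2 events in 100 trials: uniform prior versus a hypothetical informative prior.
    print(f"posterior cv, Beta(1, 1) prior:   {beta_posterior_cv(2, 100, 1, 1):.0%}")
    print(f"posterior cv, Beta(5, 495) prior: {beta_posterior_cv(2, 100, 5, 495):.0%}")
```

Under these assumptions, an event with probability near 0.001 observed in 100 trials has a cv above 300%, and an event with probability 0.01 needs roughly 10⁴ observations to reach a cv of 10%, in line with the orders of magnitude quoted in the abstract. The informative prior lowers the posterior cv only because it contributes information beyond the sample, so its value depends entirely on how well that prior is justified.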
Comments
This is an article from Ecology 86 (2005): 1114, doi:10.1890/04-0601. Posted with permission.