Bayesian hierarchical modeling for disease outbreaks
Influenza is a common illness that affects many people every year. In the past few years, we have seen the great impact influenza can have on the population and the health care system. For most people, influenza is a minor inconvenience, but it can lead to serious health problems, including death, especially among the young, the elderly, and pregnant women. The Centers for Disease Control and Prevention (CDC) has created the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet), a network of outpatient healthcare providers throughout the United States of America and its territories who have agreed to report, each week, the number of patients seen with influenza-like illness (ILI) and the total number of patients seen for any reason. Though ILINet is viewed as the gold standard for estimating influenza activity, its data are often reported with a one- or two-week lag. Internet search data can provide a more timely view of influenza activity, though they can be biased.
In this thesis, we develop multiple models using only ILINet data, then develop a method for these models to incorporate a second data source through data fusion. The first model employs a Bayesian hierarchical structure with the mean modeled by an asymmetric Gaussian functional form. Multiple hierarchical structures are compared to see which fits the data best. When forecasting, all hierarchical structures perform better than the independent model. The second model takes a functional data approach and uses functional principal component analysis to model and forecast the influenza season. Shrinkage distributions are used to choose the number of principal components, and a hierarchical structure is placed on the shrinkage distributions. Again, we find that the hierarchical structures help provide better forecasts. The forecasts are able to predict the peak week and peak percentage with little data from the forecasted season. Lastly, we perform a simulation study to see how adding a second data source, such as Google search data, can benefit the forecasting abilities of both models. We find that both models can benefit from a second source of data even if it is biased. The benefit is most noticeable when forecasting around the peak of the influenza season.
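As a rough illustration of what an asymmetric Gaussian mean for a seasonal ILI curve might look like, the sketch below uses one common parametrization: a baseline plus a Gaussian bump whose spread differs before and after the peak. The abstract does not state the thesis's exact parametrization, so the function name, the parameter names (baseline, height, peak_week, sd_rise, sd_fall), and the numeric values are all assumptions chosen for illustration, not the author's model or fitted results.

```python
import math


def asymmetric_gaussian(week, baseline, height, peak_week, sd_rise, sd_fall):
    """Hypothetical asymmetric Gaussian mean curve (assumed parametrization).

    Uses one spread (sd_rise) before the peak week and another (sd_fall)
    from the peak week onward, so the curve can rise and fall at different rates.
    """
    sd = sd_rise if week < peak_week else sd_fall
    return baseline + height * math.exp(-0.5 * ((week - peak_week) / sd) ** 2)


# Illustrative values only: a 33-week season peaking at week 20.
weeks = list(range(1, 34))
curve = [
    asymmetric_gaussian(w, baseline=1.0, height=4.0, peak_week=20,
                        sd_rise=4.0, sd_fall=6.0)
    for w in weeks
]

# The two forecasting targets named in the abstract: peak week and peak percentage.
peak_percentage = max(curve)
peak_week_hat = weeks[curve.index(peak_percentage)]
```

In a hierarchical version of such a model, the curve parameters for each season or region would be drawn from shared population-level distributions rather than estimated independently; that layer is omitted here.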