Analysis of Environmental Data with Censored Observations
The potential threats to humans and to terrestrial and aquatic ecosystems from environmental contamination may depend on the sum of the concentrations of different chemicals. However, direct summation of environmental data is generally not feasible, because some chemical concentrations are commonly recorded only as being below the analytical reporting limit. Such censored observations create special problems in the analysis of the data. A new model selection procedure, named forward censored regression, is introduced for selecting an appropriate model for environmental data with censored observations. The procedure is demonstrated using concentrations of atrazine (2-chloro-4-ethylamino-6-isopropylamino-s-triazine), deethylatrazine (DEA, 2-amino-4-chloro-6-isopropylamino-s-triazine), and deisopropylatrazine (DIA, 2-amino-4-chloro-6-ethylamino-s-triazine) in groundwater in the midwestern United States, with data from a previous study conducted by the U.S. Geological Survey. More than 80% of the observations for each compound in this study were left censored at 0.05 μg/L. The censored observations of atrazine, DEA, and DIA are imputed using the selected models. The atrazine residue (atrazine + DEA + DIA) can then be calculated from the combination of observed and imputed values, which together form a pseudo-complete data set. The all-subsets regression procedure is applied to the pseudo-complete data to select the final model for atrazine residue. The methodology presented can be used to analyze similar cases of environmental contamination involving censored data.
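The impute-then-sum step described above can be illustrated with a minimal Python sketch. It is not the procedure of the article: it assumes an intercept-only lognormal model for each compound (no covariates and no forward selection), fits that model with a simple EM iteration, and fills each censored value with the conditional mean concentration given that the value lies below the reporting limit. The article's own imputation rule may differ. The concentrations are hypothetical illustrative numbers, not the U.S. Geological Survey data, and all function and variable names are invented for this example.

```python
import math

LIMIT = 0.05  # reporting limit in ug/L, as stated in the abstract


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fit_censored_lognormal(detected, n_censored, limit=LIMIT, n_iter=500):
    """EM estimate of (mu, sigma) of log-concentration under left censoring.

    Assumption: intercept-only model; the article uses regression models.
    """
    z = [math.log(v) for v in detected]
    c = math.log(limit)
    n = len(z) + n_censored
    # Start from the detected values only (biased, but a usable start).
    mu = sum(z) / len(z)
    sigma = max(math.sqrt(sum((v - mu) ** 2 for v in z) / len(z)), 0.1)
    for _ in range(n_iter):
        # E-step: moments of a normal variable truncated above at c.
        alpha = (c - mu) / sigma
        lam = norm_pdf(alpha) / max(norm_cdf(alpha), 1e-300)
        e_z = mu - sigma * lam
        v_z = sigma ** 2 * (1.0 - alpha * lam - lam ** 2)
        # M-step: update mean and variance from the completed moments.
        mu_new = (sum(z) + n_censored * e_z) / n
        ss = sum((v - mu_new) ** 2 for v in z)
        ss += n_censored * (v_z + (e_z - mu_new) ** 2)
        mu, sigma = mu_new, math.sqrt(ss / n)
    return mu, sigma


def impute_below_limit(mu, sigma, limit=LIMIT):
    """Conditional mean concentration given that the value is below the limit."""
    c = math.log(limit)
    num = math.exp(mu + 0.5 * sigma ** 2) * norm_cdf((c - mu - sigma ** 2) / sigma)
    return num / norm_cdf((c - mu) / sigma)


# Hypothetical concentrations in ug/L (NOT the USGS data); None = censored.
samples = {
    "atrazine": [0.06, 0.09, 0.15, 0.40] + [None] * 16,
    "DEA": [0.07, 0.12, 0.25, 0.08] + [None] * 16,
    "DIA": [0.06, 0.10, 0.22] + [None] * 17,
}

fits = {}
fill = {}
for name, values in samples.items():
    detected = [v for v in values if v is not None]
    n_censored = len(values) - len(detected)
    fits[name] = fit_censored_lognormal(detected, n_censored)
    fill[name] = impute_below_limit(*fits[name])

# Pseudo-complete data: observed values where available, imputed otherwise.
complete = {
    name: [v if v is not None else fill[name] for v in values]
    for name, values in samples.items()
}

# Atrazine residue = atrazine + DEA + DIA for each sample.
residue = [sum(row) for row in zip(*complete.values())]

for name in samples:
    mu, sigma = fits[name]
    print(f"{name}: mu={mu:.3f}, sigma={sigma:.3f}, imputed={fill[name]:.4f} ug/L")
```

Every censored observation of a compound receives the same imputed value here because the sketch has no covariates. With a regression model, as in the article, the imputed value would vary from sample to sample with the explanatory variables.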
This article is published as Liu, Shiping, Jye-Chyi Lu, Dana W. Kolpin, and William Q. Meeker. "Analysis of environmental data with censored observations." Environmental Science & Technology 31, no. 12 (1997): 3358-3362. doi:10.1021/es960695x.