Generalized estimating equations for clustered survival data
Although time-to-event data are traditionally analyzed assuming independent responses, it is common to encounter correlated time-to-event data in the form of repeated measurements on subjects or clusters of subjects formed by genetic or social relationships. The objective of this research is to develop estimation procedures for clustered survival data that improve efficiency in estimating regression coefficients in the Cox proportional hazards model without imposing overwhelming computational burdens.
A commonly used method for clustered survival data obtains parameter estimates from the partial likelihood score equations based on a model that incorrectly assumes independent observations. This independent working model (IWM) approach provides consistent estimators that are asymptotically Gaussian, and a robust covariance estimator consistently estimates the covariance matrix of the parameter estimates. The availability of this methodology in most statistical packages has led to its wide use for correlated survival data. Because of the potential loss of efficiency when within-cluster correlation is strong, we examine two alternative methods to improve efficiency.
We first consider a simplified approach to estimating the weights in the weighted estimating equations proposed by Cai and Prentice (1997). This approach reduces the computational burden of the Cai and Prentice methodology. We also consider a new set of weighted estimating equations obtained by inserting weight matrices into the IWM score equations in a different manner. Another set of estimating equations is developed by applying a generalized estimating equation (GEE) approach using approximate Poisson distributions for counting process differentials. A bootstrap procedure is used to estimate the covariance matrix of the parameter estimates.
Simulation studies are used to assess the bias, variance, and relative efficiency of the proposed estimators.
Results show that the biases of all of the estimators are small and comparable, but there may be substantial gains in efficiency from incorporating weight matrices into the estimating equations when the within-cluster correlation is strong and the censoring rate is low. Simulation studies confirm that the bootstrap procedure provides accurate standard errors for estimates of regression coefficients and confidence intervals with appropriate coverage probabilities.
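As a rough illustration of the bootstrap step, the sketch below resamples whole clusters with replacement and refits a single-covariate Cox model under the independence working model on each resample. It is a minimal sketch under stated assumptions, not the procedure used in this research: the function names (`cox_score_info`, `fit_cox`, `cluster_bootstrap_se`, `simulate_clusters`), the shared gamma frailty data generator, the Breslow treatment of ties, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random


def cox_score_info(beta, data):
    """Partial likelihood score and information for one covariate.

    data is a list of (time, event, x) tuples; ties use the Breslow form.
    Observations are treated as independent (independence working model).
    """
    rows = sorted(data, key=lambda r: -r[0])  # latest times first
    s0 = s1 = s2 = 0.0  # risk-set sums of w, w*x, w*x^2
    score = info = 0.0
    i, n = 0, len(rows)
    while i < n:
        t = rows[i][0]
        j = i
        # Add everyone observed at time t to the risk set before using it.
        while j < n and rows[j][0] == t:
            w = math.exp(beta * rows[j][2])
            s0 += w
            s1 += w * rows[j][2]
            s2 += w * rows[j][2] ** 2
            j += 1
        for k in range(i, j):
            if rows[k][1]:  # event, not censored
                mean = s1 / s0
                score += rows[k][2] - mean
                info += s2 / s0 - mean ** 2
        i = j
    return score, info


def fit_cox(data, iters=25, tol=1e-8):
    """Newton-Raphson solution of the IWM score equation (one covariate)."""
    beta = 0.0
    for _ in range(iters):
        score, info = cox_score_info(beta, data)
        if info <= 0.0:
            break
        step = max(-1.0, min(1.0, score / info))  # cap the step for stability
        beta += step
        if abs(step) < tol:
            break
    return beta


def cluster_bootstrap_se(clusters, n_boot=200, seed=0):
    """Standard error of beta from resampling whole clusters with replacement."""
    rng = random.Random(seed)
    ids = list(clusters)
    estimates = []
    for _ in range(n_boot):
        sample = []
        for cid in rng.choices(ids, k=len(ids)):
            sample.extend(clusters[cid])  # keep each cluster intact
        estimates.append(fit_cox(sample))
    mean = sum(estimates) / n_boot
    var = sum((b - mean) ** 2 for b in estimates) / (n_boot - 1)
    return math.sqrt(var)


def simulate_clusters(n_clusters=100, size=2, beta=0.5, theta=1.0, seed=1):
    """Toy clustered data: shared gamma frailty (mean 1, variance theta)."""
    rng = random.Random(seed)
    clusters = {}
    for cid in range(n_clusters):
        frailty = rng.gammavariate(1.0 / theta, theta)
        members = []
        for _ in range(size):
            x = float(rng.randint(0, 1))
            t = rng.expovariate(frailty * math.exp(beta * x))  # event time
            c = rng.expovariate(0.2)  # independent censoring time
            members.append((min(t, c), t <= c, x))
        clusters[cid] = members
    return clusters


clusters = simulate_clusters()
pooled = [row for members in clusters.values() for row in members]
beta_hat = fit_cox(pooled)
se_boot = cluster_bootstrap_se(clusters)
print(f"IWM estimate: {beta_hat:.3f}, cluster bootstrap SE: {se_boot:.3f}")
```

Resampling clusters rather than individual observations keeps the within-cluster dependence intact in every resample, which is why the spread of the refitted coefficients can serve as a standard error when responses within a cluster are correlated.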