Can the strengths of AIC and BIC be shared?
A traditional approach to statistical inference is to identify the true or best model first, with little or no consideration of the specific goal of inference at the model identification stage. Can the pursuit of the true model also lead to optimal regression estimation? In model selection, it is well known that BIC is consistent in selecting the true model, and that AIC is minimax-rate optimal for estimating the regression function. A recent promising direction is adaptive model selection, in which, in contrast to AIC and BIC, the penalty term is data-dependent. Some theoretical and empirical results have been obtained in support of adaptive model selection, but it is still not clear whether it can really share the strengths of AIC and BIC. Model combining or averaging has attracted increasing attention as a means to overcome model selection uncertainty. Can Bayesian model averaging be optimal for estimating the regression function in a minimax sense? We show that the answers to these questions are basically in the negative: for any model selection criterion to be consistent, it must behave suboptimally for estimating the regression function in terms of minimax rate of convergence; and Bayesian model averaging cannot be minimax-rate optimal for regression estimation.
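The difference between the two criteria lies entirely in the penalty term, which a small numerical sketch can make concrete. The example below is an illustration only and is not taken from the paper: it assumes Gaussian linear regression, uses the standard forms AIC = n log(RSS/n) + 2k and BIC = n log(RSS/n) + k log(n) (up to additive constants common to all models), and compares nested polynomial models on simulated data. The function name `information_criteria`, the polynomial candidate set, and the simulation settings are all hypothetical choices made for this sketch.

```python
import numpy as np


def information_criteria(x, y, max_degree):
    """Return (aic, bic) arrays for polynomial fits of degree 0..max_degree.

    Assumes Gaussian errors with unknown variance; constants shared by
    all candidate models are dropped because they do not affect selection.
    """
    n = len(y)
    aic, bic = [], []
    for degree in range(max_degree + 1):
        # Design matrix with columns 1, x, x^2, ..., x^degree.
        X = np.vander(x, degree + 1, increasing=True)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        k = degree + 1                      # number of regression coefficients
        fit = n * np.log(rss / n)           # goodness-of-fit term
        aic.append(fit + 2 * k)             # AIC: fixed penalty of 2 per parameter
        bic.append(fit + k * np.log(n))     # BIC: penalty grows with sample size
    return np.array(aic), np.array(bic)


# Simulated data from a linear truth (an assumption of this illustration).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

aic, bic = information_criteria(x, y, max_degree=5)
print("degree chosen by AIC:", int(np.argmin(aic)))
print("degree chosen by BIC:", int(np.argmin(bic)))
```

Because log(n) exceeds 2 once n is at least 8, BIC penalizes each extra parameter more heavily than AIC and never selects a larger model than AIC from the same nested candidates. The sketch only shows how the penalties differ; it does not demonstrate the consistency or minimax-rate results discussed above.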
This preprint was published as Yuhong Yang, "Can the Strengths of AIC and BIC be Shared? A Conflict Between Model Identification and Regression Estimation", Biometrika (2005): 937-950, doi: 10.1093/biomet/92.4.937.