Adaptive estimation in pattern recognition by combining different procedures
We study the problem of adaptive estimation of a conditional probability function in a pattern recognition setting. In many applications, for more flexibility, one may want to consider various estimation procedures targeted at different scenarios and/or operating under different assumptions. For example, when the feature dimension is high, to overcome the familiar curse of dimensionality one may seek a good parsimonious model among a number of candidates such as CART, neural nets and additive models. In such a situation, one wishes to have an automated final procedure that performs as well as the best candidate. In this work, we propose a method to combine a countable collection of procedures for estimating the conditional probability. We show that the combined procedure has the property that its statistical risk is bounded above by that of any of the procedures being considered plus a small penalty. Thus, asymptotically, the strengths of the different estimation procedures are shared by the combined procedure. A simulation study shows the potential advantage of combining models compared with model selection.
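To make the idea of combining procedures concrete, here is a minimal sketch. It is not the algorithm of the paper. It assumes binary labels, a finite list of candidate estimates of P(Y=1 | X=x) that were already fitted on separate data, and weights proportional to each candidate's prior weight times its likelihood on held-out observations. The names `likelihood_weights` and `mixture`, the clipping constant `eps`, and the toy candidates are all hypothetical.

```python
import math


def likelihood_weights(predictors, prior, xs, ys, eps=1e-12):
    """Weights proportional to prior[j] * held-out likelihood of predictor j.

    Each predictor maps a feature x to an estimate of P(Y=1 | X=x).
    Labels in ys are assumed to be 0 or 1 (an assumption of this sketch).
    """
    log_w = []
    for p, pi in zip(predictors, prior):
        log_lik = 0.0
        for x, y in zip(xs, ys):
            # Clip the estimate away from 0 and 1 so the log is finite.
            q = min(max(p(x), eps), 1.0 - eps)
            log_lik += math.log(q) if y == 1 else math.log(1.0 - q)
        log_w.append(math.log(pi) + log_lik)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def mixture(predictors, weights):
    """Return the convex combination of the candidate estimates."""
    def combined(x):
        return sum(w * p(x) for w, p in zip(weights, predictors))
    return combined


# Toy candidates (hypothetical): an uninformative rule and a threshold rule.
candidates = [
    lambda x: 0.5,
    lambda x: 0.9 if x > 0 else 0.1,
]
held_out_x = [-2.0, -1.0, 1.0, 2.0, 3.0]
held_out_y = [0, 0, 1, 1, 1]

weights = likelihood_weights(candidates, [0.5, 0.5], held_out_x, held_out_y)
combined = mixture(candidates, weights)
print(weights, combined(1.0))
```

In this toy run the threshold rule fits the held-out labels far better, so it receives most of the weight and the mixture stays close to it. This mirrors, in a very simplified way, the goal that the combined procedure should perform nearly as well as the best candidate.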
This preprint was published as Yuhong Yang, "Adaptive Estimation in Pattern Recognition by Combining Different Procedures", Statistica Sinica (2000): 1069-1089.