Stability of Random Forests and Coverage of Random-Forest Prediction Intervals

dc.contributor.author Wang, Yan
dc.contributor.author Wu, Huaiqing
dc.contributor.author Nettleton, Dan
dc.contributor.department Statistics
dc.date.accessioned 2023-11-30T20:58:20Z
dc.date.available 2023-11-30T20:58:20Z
dc.date.issued 2023-10-28
dc.description.abstract We establish stability of random forests under the mild condition that the squared response ($Y^2$) does not have a heavy tail. In particular, our analysis holds for the practical version of random forests that is implemented in popular packages like \texttt{randomForest} in \texttt{R}. Empirical results show that stability may persist even beyond our assumption and hold for heavy-tailed $Y^2$. Using the stability property, we prove a non-asymptotic lower bound for the coverage probability of prediction intervals constructed from the out-of-bag error of random forests. With another mild condition that is typically satisfied when $Y$ is continuous, we also establish a complementary upper bound, which can be similarly established for the jackknife prediction interval constructed from an arbitrary stable algorithm. We also discuss the asymptotic coverage probability under assumptions weaker than those considered in previous literature. Our work implies that random forests, with their stability property, are an effective machine learning method that can provide not only satisfactory point prediction but also justified interval prediction at almost no extra computational cost.
dc.description.comments This preprint is made available through arXiv at doi: https://doi.org/10.48550/arXiv.2310.18814. Copyright 2023, The Authors. Posted with permission.
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/ywAbKO9v
dc.language.iso en
dc.source.uri https://doi.org/10.48550/arXiv.2310.18814
dc.subject.disciplines DegreeDisciplines::Physical Sciences and Mathematics::Statistics and Probability::Statistical Methodology
dc.subject.disciplines DegreeDisciplines::Physical Sciences and Mathematics::Computer Sciences::Theory and Algorithms
dc.title Stability of Random Forests and Coverage of Random-Forest Prediction Intervals
dc.type Preprint
dspace.entity.type Publication
relation.isAuthorOfPublication 7d86677d-f28f-4ab1-8cf7-70378992f75b
relation.isOrgUnitOfPublication 5a1eba07-b15d-466a-a333-65bd63a4001a
File
Original bundle
Name: 2023-Nettleton-StabilityRandomPreprint.pdf
Size: 633.83 KB
Format: Adobe Portable Document Format