Selection using a dichotomized auxiliary variate
Abstract
Consider a random sample of $n$ objects from which we wish to select the $k$ best. Quality is defined in terms of measurements $Y_i$ ($i = 1, \ldots, n$); the larger the value of $Y_i$, the more valuable the object. The obvious solution, measuring $Y_1, \ldots, Y_n$ and selecting the objects with the $k$ largest $Y_i$ values, may be impractical if the $Y_i$ are difficult to observe. An alternative is to select objects on the basis of an auxiliary variate $X_i$ ($i = 1, \ldots, n$) that is related to $Y_i$ but less elusive.

Under the assumption that $X_i$ and $Y_i$ are positively associated, a reasonable procedure is to choose those objects with $X_i$ values exceeding some fixed point $x_c$. Denote by ${}_n\Pi_k$ the probability that the subset of items thus selected does indeed include the $k$ best. Then $x_c$ may be chosen so that ${}_n\Pi_k \geq P^*$, where $0 < P^* < 1$. The behavior of ${}_n\Pi_k$ as $n \to \infty$ is examined when $k$ is either constant or a function of $n$ ($k = k(n) = fn$ or $n^{f}$, with $0 < f < 1$). When $k$ is constant, ${}_n\Pi_k \to 1$; when $k$ is proportional to $n$, ${}_n\Pi_k \to 0$; and when $k = n^{f}$, the limit is distribution dependent (see text for details).

Yeo and David (1984) consider a fixed-size subset selection procedure, choosing those objects with the $s$ largest $X_i$ values, and calculate ${}_n\Pi_{s:k}$, the probability that this subset includes the $k$ best objects. Our approach is contrasted with Yeo and David's, and an alternative derivation of ${}_n\Pi_{s:k}$ is provided.

Reference
Yeo, W. B., and David, H. A. 1984. Selection through an associated characteristic, with applications to the random effects model. JASA 79, 399-405.
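The cutoff procedure described in the abstract can be illustrated numerically. The following is a minimal Monte Carlo sketch, not the analytical derivation used in the thesis. It assumes, purely for illustration, that each pair $(X_i, Y_i)$ is standard bivariate normal with correlation `rho`; the function name `estimate_inclusion_prob` and all of its parameters are hypothetical.

```python
import math
import random


def estimate_inclusion_prob(n, k, x_c, rho, trials=2000, seed=0):
    """Estimate the probability that {i : X_i > x_c} contains the k best objects.

    Illustrative assumption: (X_i, Y_i) are i.i.d. standard bivariate
    normal with correlation rho. "Best" means the k largest Y values.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(trials):
        pairs = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            # Build Y so that corr(X, Y) = rho.
            y = rho * x + scale * rng.gauss(0.0, 1.0)
            pairs.append((y, x))
        # Sort by Y in descending order; the first k pairs are the k best.
        pairs.sort(reverse=True)
        # Success if every one of the k best has X above the cutoff.
        if all(x > x_c for _, x in pairs[:k]):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Scan a few cutoffs; a lower x_c selects more objects.
    for x_c in (-2.0, -1.0, 0.0, 1.0):
        p = estimate_inclusion_prob(n=20, k=2, x_c=x_c, rho=0.8)
        print(f"x_c = {x_c:+.1f}  estimated probability = {p:.3f}")
```

Scanning cutoffs in this way gives a rough sense of how small $x_c$ must be for the estimated probability to reach a target level $P^*$ under the assumed distribution. The same seed is reused for every cutoff, so the estimates are computed on identical samples and are non-increasing in $x_c$.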