Exhaustive Search Algorithms to Find α(0,1) Designs with High Harmonic Mean Canonical Efficiency Factors

Resolvable incomplete block designs have been of interest in many fields for decades. Considerable research has been done on how to construct highly efficient resolvable incomplete block designs with respect to various efficiency criteria. Introduced by E.R Williams in 1975, α designs with zeros or ones in the off-diagonal of the concurrence matrix (denoted as α(0, 1) designs) comprise one class of resolvable incomplete block designs with v varieties, b blocks, r replicates and block size of k. It can be challenging to find an α(0, 1) design with high efficiency, and when v and b become large, an exhaustive search is required. This paper proposes algorithms for constructing α(0, 1) designs for r = 2, r = 3 and r = 4 with highest possible harmonic mean canonical efficiency factors. The algorithms are based on the ideas of the combinatorial and factorial number systems. The performance of the algorithms for large v (v ≥ 100) is evaluated, and harmonic mean canonical efficiency factors are compared to theoretical upper bounds. A simulation study was carried out to compare randomized complete block designs and α(0, 1) designs with different efficiency factors.


Introduction
A block design is resolvable if the blocks can be partitioned into replicates such that each replicate is essentially one complete block with one experimental unit for each treatment. Let D(v, b, r, k) denote the class of resolvable incomplete block designs with v treatments, b blocks of size k and r replicates. By definition, the class D (v, b, r, k) is nonempty only when v k = b r = s where s is an integer representing the number of blocks per replication. A typical problem is to find a design in class D (v, b, r, k) that has the highest possible efficiency. One efficiency criterion is the harmonic mean canonical efficiency factor (see Raghavarao (1971) and section 2 below). John (1966) pointed out that reducing the range of the off-diagonal elements of the concurrence matrix (defined in Section 2) improves the harmonic mean canonical efficiency factor. The class of α(0, 1) designs is of interest because such designs have only zeros or ones in the off-diagonal elements. Williams (1975) presented the specific forms of the generating array for α(0, 1) designs with k = s and k = s − 1 for r = 2, 3, 4. However, for r ≤ k < s − 1 , there are no specific forms, and α(0, 1) designs are numerous when the number of varieties v becomes large. Therefore, an exhaustive search is needed to find an α(0, 1) design with highest possible efficiency among all the α(0, 1) designs given (v, b, r, k). Williams (1975) stated for large value of v (v ≥ 100) and certain values of r, s and k, there are too many different generating arrays (see section 2 below) for an exhaustive search to be carried out. In this paper we will show that with the combinatorial and factorial number systems, it is possible to order all generating arrays and corresponding α(0, 1) designs sequentially. This sequential ordering permits an exhaustive search so that, given (v, b, r, k), we are able to identify the most efficient α(0, 1) design eventually.

Background Review
The incidence matrix N of a design in D (v, b, r, k) is a v × b matrix with entry n ij being the number of experimental units assigned to variety i in block j. The concurrence matrix of a design is defined as the product N N ′ where N ′ is the transpose of the incidence matrix N. The harmonic mean canonical efficiency factor (denoted asĒ) of a (v, b, r, k) design is the harmonic mean of the non-zero eigenvalues of matrix C, i.e.,Ē = v−1 rk N N ′ , θ i , i = 1, 2, ..., v − 1 are the non-zero eigenvalues of C (Raghavarao, 1971).
The design D d with varieties and blocks interchanged is called the dual of a design D. The harmonic mean canonical efficiency factorĒ d for the dual design D d can be also defined in terms rk N ′ N . Because C and C d have the same non-unit eigenvalues (Williams, 1975) whereĒ d is the harmonic mean canonical efficiency factor of the dual design and γ j , j = 1, 2, ..., b − 1 are the eigenvalues of C d .
The theoretical upper bound for the harmonic mean canonical efficiency factor of a resolvable (v−1)(r−1)+r(s−1) (Patterson and Williams, 1975). Given the class D (v, b, r, k) where v k = b r = s, a design constructed as follows is called an α design: 1. Let a generating array α be a k × r array whose entries can take values from 0, 1, ..., s − 1.
For each of the columns in α, we generate another s − 1 columns by cyclically adding 1 to the elements in that column and reducing modulus s. This results in a k × rs array α ⋆ .
The resulting array is a resolvable design with r replicates; i.e., each of the r columns of α generates a complete replicate.
The dual generating array α ′ of a generating array α is a r × k array whose (m, l)th element is defined by a ′ ml =−a lm (l = 1, ..., k; m = 1, ..., r) where a ′ ml is the (m, l)th element in array α ′ , a lm is the (l, m)th element in array α, and− represents the operation of subtracting from s and performing modulo s (Williams, 1975).
The dual generating array α ′ of the generating array α will be Two generating arrays with same parameters are equivalent if the α designs produced have the same efficiency factor. Every generating array α is equivalent to a special array called the reduced array for α, with all elements in the first row and first column equal to zero (Williams, 1975).
Isomorphic designs are designs with the same reduced array α. Isomorphic operations on a generating array α include permutation of rows and columns. (Patterson and Williams, 1976).

Introduction of combinatorial and factorial number systems
Later on in the algorithms in section 4, we will show that we need a way to decode a natural number in (0, 1, 2, ..., C(N, k) − 1) to a specific combination that is in the C(N, k) combinations and decode a natural number in (0, 1, 2, ..., P (N, k) − 1) to a specific permutation that is in the The combinatorial number system of degree k gives a one-to-one correspondence between natural numbers (0, 1, 2, ..., C(N, k) − 1) and C(N, k) combinations where C(N, k) = N ! (N −k)!k! . Specifically, we can represent a natural number n ∈ {0, 1, 2, ..., C(N, k) To find (C k , C k−1 , ..., C 2 , C 1 ), take C k as the maximum number satisfying then take C k−1 as the maximum number satisfying Hence, we have a one-to-one correspondence between n and (C k , C k−1 , ..., C 2 , C 1 ) which is one of the combinations. One of the proofs for the correspondence was shown by Abu et al. (2016). The factorial number system converts natural numbers 0, 1, 2, ..., n! − 1 to factorial representations through which we can obtain permutations of 0, 1, 2, ..., n − 1.
We adopt zero-based numbering, and for a given natural number k (0 ≤ k < n!), suppose we are interested in finding the kth permutation of the n numbers labelled as 0, 1, ..., n − 1.
First, we represent k as (F n , F n−1 , ..., F 2, F 1 ) such that k = F n (n − 1)! + F n−1 (n − 2)! + ... + F 2 1! + F 1 0! where the maximum number F i can take is i − 1, and the minimum is 0 for i = 1, 2, ..., n. (F n , F n−1 , ..., F 2, F 1 ) is called the factorial representation of k. We argue that there are one-to-one correspondences between the natural numbers from 0 to n! − 1 and The one-to-one correspondence can be shown as follows: 1. We have n! permutations in total, and we also have n! representations from (F n , F n−1 , ..., F 2, F 1 ) due to the fact that F n can take n different values, F n−1 can take n − 1 different values and so forth.
Assume the equation holds for n = k, i.e., Hence, it also holds for n = k + 1.

Construction of α(0, 1)-design
An α design is called an α(0, 1) design if the off-diagonal elements of the concurrence matrix N N ′ consist of only zeros and ones. Williams (1975) showed that a generating array α produces an α(0, 1) design if and only  Williams (1975) showed that for an α(0, 1) design to exist, k ≤ s is a necessary condition and presented the specific forms for α(0, 1) designs for k = s and k = s − 1 for r = 2, 3, 4.
However, the α(0, 1) design with highest possible efficiency for r ≤ k < s − 1 need an exhaustive search, and we propose algorithms to conduct the exhaustive search.
In section 2, we haveĒ = . Therefore it is easier to get the harmonic mean canonical efficiency factor through the dual design because b ≤ v, and calculating the eigenvalues of a b × b matrix C d requires less effort than doing so for a v × v matrix C.
By the isomorphy property, we have C(s − 1, k − 1) = (s−1)! (k−1)!(s−k)! choices in total because permutation of columns is an isomorphic operation. Based on the combinatorial number system, we construct a combination-generating function COM BIN ADIC(N, k, i) that generates the ith combination case in C(N, k) combinations without the knowledge of previous i−1 combinations.

21:
Calculate the Efficiency factor asĒ = ,l (l = 2, ..., k) because the elements in the first column of reduced array are 0.

21:
Construct the dual design incidence matrix N ′

Simulation study
Consider simulating from the spherical covariance model (Stroup, 2002) where β represents the treatment effect, γ represents the replicate effect and error vector e ∼ M V N (0, R). The ijth row-column element of R, denoted as r ij is where d is the Euclidean distance between the ith and jth observation (the observations were recorded columnwise), and l is the range that is the Euclidean distance between two observations below which spatial correlation is nonzero.
Suppose we have 216 treatments and 2 replicates, and we are interested in estimating the differences between the effect of each treatment and the overall average treatment effect, i.e., Suppose we have a 9 × 48 field available (432 plots in total) and there are resources to assign two plots to each treatment. We have three competing designs as shown in Figure 5.1 where the treatment numbers are zero-based. The first design is a randomized complete block design in which we randomly assign the 216 treatments to the 2 large complete blocks. The second design is an α(0, 1) design with v = 216, b = 48, r = 2, k = 9 where each column is an incomplete block.
(2) Obtain three sets of simulated response Y C , Y S and Y RCBD corresponding to the α(0, 1) design with incomplete blocks arranged as columns, the α(0, 1) design with incomplete blocks arranged as squares and the randomized complete block design because we need to match the generated treatment effects from (1) to the three designs respectively.
(3) Fit three models corresponding to the three competing designs and calculate the RM SE = √ 1 216 ) 2 for each model. (i) Model for randomized complete block design: 2, ..., 216; j = 1, 2 where β i is the i th treatment effect, γ j is the j th complete block effect, and e ij is the error term with e ij iid ∼ N (0, σ 2 ).
(iii) Model for the α(0, 1) design with square shaped incomplete blocks: The same model as the second but the value e ij may be different because the locations of some treatments are different.
(4) To further compare the design with RCBD randomization and square shaped incomplete blocks to the α(0, 1) design with the square shaped incomplete blocks, fit the model for the design with RCBD randomization and square shaped incomplete blocks: , and e ij is the error term with e ij iid ∼ N (0, σ 2 ).
(5) Calculate five log ratios : where RM SE C is from the model fitted with the α(0, 1) design with column shaped incomplete blocks, RM SE S is from the model fitted with the α(0, 1) design with square shaped incomplete blocks, RM SE RCBD is from the model fitted with the randomized complete block design and RM SE RCBD S is from the model fitted the design with RCBD randomization and square shaped incomplete blocks. (a) Randomized complete block design The columns of Table 3 shows the proportions of positive of the log ratios where the α(0, 1) design has the highest mean canonical efficiency factor (0.8015); i.e., the search is exhaustive. The following graph shows the distribution of the efficiency factor values from the design with RCBD randomization and the square shaped incomplete blocks with 10000 iterations. We notice that most efficiency factor values are between 0.78 and 0.79 which are close to but lower than 0.8015, the efficiency factor value for the α(0, 1) design with highest efficiency.
The following boxplot shows the log ratios from 10000 iterations and α(0, 1) design has the highest mean canonical efficiency factor (0.8015).
As we can see from Table 3 and the boxplot, the α(0, 1) design with square shaped blocks has smaller RM SE than the α(0, 1) design with column shaped blocks as well as the randomized complete block designs about 90 percent of the time. The α(0, 1) design with column shaped blocks has larger RM SE than the randomized complete block designs more than 50 percent of the time.
The design with RCBD randomization and square shaped incomplete blocks has larger RM SE than the α(0, 1) design with square shaped blocks 55 percent of the time because the efficiency factors of the designs with RCBD randomization and square shaped incomplete blocks are slightly smaller than 0.8015. The α(0, 1) design with column shaped blocks has larger RM SE than the design with RCBD randomization and square shaped incomplete blocks more than 80 percent of the time.
To examine whether the α(0, 1) design with square blocks still have a larger proportion of positive in the log ratio compared to the randomized complete block design if we consider the spatial spherical covariance structure in the error term, we fit another two models as follows: (i) Spatial model for the randomized complete block design: Y ij = β i + τ j + e ij , i = 1, 2, ..., 216; j = 1, 2 where β i is the i th treatment effect, τ j is the j th complete block effect, and e is the error vector with e ∼ M V N (0, R). R is the spherical covariance structure defined as above.
(ii) Spatial model for the α(0, 1) design with square shaped incomplete blocks: Y ij = β i + γ j + τ k + e ij , i = 1, 2, ..., 216; j = 1, 2; k = 1, 2, ..., 48 where β i is the i th treatment effect, γ j is the j th complete block effect, τ k is the k th incomplete block effect with τ k iid ∼ N (0, σ 2 b ), and e is the error vector with e ∼ M V N (0, R). The result of comparison between the α(0, 1) design with square shaped blocks and the randomized complete block design after accounting for the spherical spatial structure is shown above. RM SE S ′ is from the model fitted with the α(0, 1) design with square shaped incomplete blocks, RM SE RCBD ′ is from the model fitted with the randomized complete block design.
It indicates that the α(0, 1) design with square shaped incomplete blocks still has smaller RM SE than the randomized complete block design around 70 percent of the time.

Conclusion
The combinatorial number system gives the correspondence between natural numbers and combinations, and factorial number system gives the correspondence between natural numbers and permutations. Based on the ideas of the combinatorial and factorial number systems, we start with the construction of generating array and then conduct an exhaustive search for α(0, 1) designs with r ≤ k < s − 1 and r = 2, 3, 4. Finally, we are able to find the α(0, 1) design with the highest possible efficiency. The simulation study showed that the α(0, 1) design with high efficiency factor performed better in estimating the contrast between the effect of each treatment and the overall average effect than the randomized complete block design no matter we consider the spatial structure or not when fitting the model, and adding the square shaped incomplete blocks to the randomized complete block design will reduce the RM SE of the contrast estimate.