Numerical solution for the minimum norm solution to the first kind integral equation with a special kernel and efficient implementations of the Cholesky factorization algorithm on the vector and parallel supercomputers
Part I. Let $K: L_2[a,b] \to L_2[c,d]$ be a bounded linear operator defined by $(Kf)(x) = \int_a^b k(x,y) f(y)\,dy$, where $k \in L_2([c,d] \times [a,b])$ and $f \in L_2[a,b]$. Define $k_x$ by $k_x(y) = k(x,y)$. Assume $K$ has the properties that (a) $k_x \in L_2[a,b]$ for all $x \in [c,d]$ and (b) $Kf = 0$ a.e. implies $(Kf)(x) = 0$ for all $x \in [c,d]$. Then it is shown that the minimum norm solution $f_0$ to the Fredholm integral equation of the first kind, $Kf = g$, is the $L_2$-norm limit of linear combinations of the $k_x$'s. Next, it is shown how to choose constants $c_1, c_2, \ldots, c_n$ to minimize $\| f_0 - \sum_{j=1}^{n} c_j k_{x_j} \|_2$ for $n$ fixed points $x_1, x_2, \ldots, x_n$ without knowing what $f_0$ is. Perturbation results and some characteristics of this approximate solution $f_n = \sum_{j=1}^{n} c_j k_{x_j}$ for $f_0$ are presented. This paper also contains a numerical method for choosing $n$ points $x_1, x_2, \ldots, x_n$ at which $\| f_0 - \sum_{j=1}^{n} c_j k_{x_j} \|_2$ is minimized for only a fixed number $n$. Lastly, numerical results for different types of examples are provided to evaluate this numerical method.

Part II. First, a blocked Cholesky factorization algorithm using non-standard level-2 BLAS (Basic Linear Algebra Subprograms) and three blocked Cholesky algorithms using standard level-2 and level-3 BLAS are developed on the Hitachi Data Systems (HDS) AS/XL V60, and their performance is compared to that of the existing unblocked algorithm. The blocked algorithm using non-standard level-2 BLAS performs best of all the algorithms considered on the HDS computer, but the non-standard level-2 BLAS were optimized for, and performed well on, only the HDS computer.
For this reason, a blocked algorithm that uses standard BLAS and gives near-optimal performance on all of the HDS AS/EX V60, the IBM 3090E, and the Cray-2, Cray X-MP, and Cray Y-MP is found, and its performance is compared to the vendor-supplied Cholesky routine (when available) on each computer. Since the IBM ESSL vector library does not have an optimized DSYRK, DSYRK was optimized for the IBM 3090E before all algorithms were tested. Next, five parallel Cholesky factorization algorithms, each of which uses standard BLAS, are presented. The parallel performance of these algorithms is measured on each of the Cray-2, Cray X-MP/48, Cray Y-MP/832, and the IBM 3090-600E and 3090-600J. For the IBM 3090 computers, the parallel performance of these algorithms is also compared with a vendor-optimized Cholesky factorization from ESSL.
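
To make the idea of a blocked Cholesky factorization concrete, the following is a minimal sketch of one common variant (a right-looking blocked algorithm computing a lower-triangular $L$ with $A = LL^T$). It is not the algorithm developed in this work, whose details the abstract does not give. The function names `factor_diagonal_block` and `blocked_cholesky` and the `block_size` parameter are hypothetical, and plain Python loops stand in for the BLAS calls: the panel solve plays roughly the role of a triangular solve such as DTRSM, and the trailing update plays roughly the role of the symmetric rank-k update DSYRK.

```python
import math


def factor_diagonal_block(a, lo, hi):
    """Unblocked Cholesky of the diagonal block a[lo:hi][lo:hi], in place.

    Assumes earlier block columns have already been subtracted from this
    block, so only columns lo..j-1 contribute to the sums.
    """
    for j in range(lo, hi):
        d = a[j][j] - sum(a[j][k] ** 2 for k in range(lo, j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(d)
        for i in range(j + 1, hi):
            s = sum(a[i][k] * a[j][k] for k in range(lo, j))
            a[i][j] = (a[i][j] - s) / a[j][j]


def blocked_cholesky(matrix, block_size=2):
    """Return lower-triangular L with matrix = L * L^T (illustrative sketch)."""
    n = len(matrix)
    a = [[float(v) for v in row] for row in matrix]  # work on a copy
    for lo in range(0, n, block_size):
        hi = min(lo + block_size, n)

        # Step 1: factor the diagonal block A11 = L11 * L11^T.
        factor_diagonal_block(a, lo, hi)

        # Step 2: panel solve L21 * L11^T = A21 (the role of a triangular solve).
        for i in range(hi, n):
            for j in range(lo, hi):
                s = sum(a[i][k] * a[j][k] for k in range(lo, j))
                a[i][j] = (a[i][j] - s) / a[j][j]

        # Step 3: trailing update A22 -= L21 * L21^T, lower triangle only
        # (the role of a symmetric rank-k update).
        for i in range(hi, n):
            for j in range(hi, i + 1):
                a[i][j] -= sum(a[i][k] * a[j][k] for k in range(lo, hi))

    # Clear the strict upper triangle, which still holds input values.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a


if __name__ == "__main__":
    A = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
    for row in blocked_cholesky(A, block_size=2):
        print(row)
```

With `block_size=1` the sketch reduces to an unblocked column-by-column factorization. Larger blocks move most of the arithmetic into steps 2 and 3, which is the part that optimized level-3 BLAS routines are designed to execute efficiently on vector and parallel machines.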