Performance analysis and acceleration of nuclear physics application on high-performance computing platforms using GPGPUs and topology-aware mapping techniques

dc.contributor.advisor Pieter Maris
dc.contributor.advisor Joseph Zambreno
dc.contributor.author Oryspayev, Dossay
dc.contributor.department Electrical and Computer Engineering
dc.date 2018-09-12T04:24:01.000
dc.date.accessioned 2020-06-30T03:11:41Z
dc.date.available 2020-06-30T03:11:41Z
dc.date.copyright Fri Jan 01 00:00:00 UTC 2016
dc.date.embargo 2018-07-21
dc.date.issued 2016-01-01
dc.description.abstract <p>The number of nodes in the current generation of high-performance computing (HPC) platforms is increasing at a steady rate, and the nodes of these platforms support multi-core and many-core hardware designs. As the number of cores per node increases, whether CPU-based or accelerator-based, applications need to make use of all those cores. Thus, scientific applications have to use the accelerators as much as possible. Furthermore, as the number of nodes increases, the communication time between nodes is likely to increase, which necessitates application-specific, network topology-aware mapping techniques for efficient utilization of these platforms. Such mapping techniques help to distribute the computational tasks so that the communication patterns make optimal use of the underlying network hardware. In addition, one needs to construct network models in order to study the benefits of a specific network mapping. This research focuses mainly on the Many Fermion Dynamics nuclear (MFDn) application developed at Iowa State University, a computational tool for low-energy nuclear physics. MFDn uses the Lanczos algorithm (LA), an algorithm for the diagonalization of sparse matrices that is widely used in scientific parallel computing. We present techniques applied to this application that enhance its performance through the use of general-purpose graphics processing units (GPGPUs). Additionally, we compare the performance of our sparse matrix-vector multiplication (SpMVM), the main computationally intensive kernel in the LA, with other efficient approaches presented in the literature. We compare the total HPC platform resources needed by different SpMVM implementations, present and analyze the implementation of a method for overlapping communication and computation, and extend a model for the analysis of network topology presented in the literature. Finally, we present network topology-aware mapping techniques, focused on the LA stage, for IBM Blue Gene/Q (BG/Q) supercomputers. These techniques enhance performance compared to the default mapping, and we validate the results of our tests using the network model.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/16524/
dc.identifier.articleid 7531
dc.identifier.contextkey 12807295
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/16524
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/30707
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/16524/Oryspayev_iastate_0097E_15933.pdf|||Fri Jan 14 21:01:46 UTC 2022
dc.subject.disciplines Computer Engineering
dc.title Performance analysis and acceleration of nuclear physics application on high-performance computing platforms using GPGPUs and topology-aware mapping techniques
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.discipline Computer Engineering
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name: Oryspayev_iastate_0097E_15933.pdf
Size: 3.89 MB
Format: Adobe Portable Document Format