Benchmarking: more aspects of high performance computing

Date
2004-01-01
Authors
Ravindrudu, Rahul
Abstract

The aim of this project was to encapsulate the needs of computational science applications. Performance metrics can be used to decide the appropriate resource for a given application. Unfortunately, most benchmarks available today measure only one or two of the major subsystems of any given scalable computer. In this paper we present the modifications made to High Performance Linpack (HPL) using OpenMP. HPL is a software package that solves a (random) dense linear system in double-precision arithmetic on distributed-memory computers. The benchmark suite exercises the computational and main-memory performance of a distributed-memory system. As used today, HPL measures the peak computational performance of a given supercomputer as a function of its distributed-memory bandwidth and latency. Most applications that run on high performance computing (HPC) systems do much more than stress the distributed-memory and computational properties of the machine. The end result of this work will be a benchmark that exploits the nature of SMPs (e.g., thread-based work) and the associated secondary storage subsystem of a computer, in addition to the traditional HPL offerings. By tuning the input parameters of HPL, the benchmark can be made more representative of real applications. The first extension to HPL adds an out-of-core capability, which introduces a new component to the benchmark suite. This lets us observe the performance of the system when disk I/O operations are involved. The out-of-core modifications also enable better observation and tuning of a hierarchical memory system. The second modification to the HPL benchmark suite is the parallelization of the Basic Linear Algebra Subprograms (BLAS), in particular the Level 3 BLAS algorithms in HPL, using OpenMP. The timings and performance of the two modifications are plotted and analyzed over a range of input parameters. The parameters that influence these modifications are those dealing with memory access and allocation, and interprocess communication.
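
As a rough illustration only, the two extensions described in the abstract can be sketched in miniature. This is not the thesis code, which modifies HPL itself and uses OpenMP. The sketch assumes a matrix product C = A * B (the core Level 3 BLAS operation) in which A lives in a file on disk and is streamed into memory one row panel at a time, and in which each panel's rows are divided among worker threads, much as a thread-based loop would divide them. All function names, the file layout, and the parameters panel_rows and workers are hypothetical, and Python threads merely stand in for OpenMP threads to show the partitioning, not the speedup.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_matrix(path, rows):
    """Store a dense matrix row by row as little-endian doubles on disk."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{len(row)}d", *row))


def read_panel(path, first_row, n_rows, n_cols):
    """Read n_rows consecutive rows, starting at first_row, into memory."""
    row_bytes = 8 * n_cols  # each double occupies 8 bytes
    with open(path, "rb") as f:
        f.seek(first_row * row_bytes)
        data = f.read(n_rows * row_bytes)
    flat = struct.unpack(f"<{n_rows * n_cols}d", data)
    return [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]


def multiply_rows(a_rows, b):
    """One worker's share of the update: the given rows of A times all of B."""
    n_cols = len(b[0])
    return [
        [sum(a_ik * b[k][j] for k, a_ik in enumerate(a_row)) for j in range(n_cols)]
        for a_row in a_rows
    ]


def out_of_core_matmul(a_path, m, k, b, panel_rows=2, workers=2):
    """Compute C = A * B where the m-by-k matrix A is stored in a_path.

    Only panel_rows rows of A are held in memory at any time (the
    out-of-core part); each panel is split into contiguous row chunks
    that are handed to a pool of threads (the thread-based part).
    """
    c = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for first in range(0, m, panel_rows):
            n = min(panel_rows, m - first)
            panel = read_panel(a_path, first, n, k)  # disk I/O happens here
            chunk = (n + workers - 1) // workers     # rows per worker, rounded up
            pieces = [panel[i:i + chunk] for i in range(0, n, chunk)]
            # map() returns results in submission order, so rows stay ordered
            for part in pool.map(multiply_rows, pieces, [b] * len(pieces)):
                c.extend(part)
    return c


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        a_file = os.path.join(tmp, "a.bin")
        write_matrix(a_file, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        print(out_of_core_matmul(a_file, 3, 2, [[7.0, 8.0], [9.0, 10.0]]))
```

In this toy setting, panel_rows plays the role of a memory-allocation parameter (how much of the operand is resident at once) and workers the role of a thread count; timing the loop while varying them is a small-scale analogue of the parameter studies the abstract mentions.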

Type
thesis
Copyright
2004