Recovery from Fail-Stop Failures in Parallel Fortran Applications
Date
2019-02-14
Authors
Weeks, Nathan
Luecke, Glenn
Maris, Pieter
Vary, James
Department
Computer Science
Physics and Astronomy
Mathematics
Abstract
The Fortran 2018 standard defines syntax and semantics that allow a parallel application to recover from failed images (processes) during execution. This poster presents work to extend the GFortran compiler front end and the OpenCoarrays library to support fault-tolerant teams of images, enabling the use of collective routines after an image failure.
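
As an illustration of the recovery pattern the abstract describes, the following is a minimal sketch using only Fortran 2018 standard features (STAT_FAILED_IMAGE, FORM TEAM, CHANGE TEAM, and the CO_SUM collective). It is not taken from the poster: the program name and the variables contribution, total, and survivors are hypothetical, and whether a failure is actually detected, and whether these features are available at all, depends on the compiler and runtime in use.

```fortran
program recover_sketch
  ! Hypothetical example: retry a collective on a team of surviving images.
  use, intrinsic :: iso_fortran_env, only: team_type, stat_failed_image
  implicit none
  type(team_type) :: survivors   ! assumed name for the recovery team
  integer :: contribution, total, stat

  contribution = this_image()    ! each image contributes its initial index
  total = contribution
  call co_sum(total, stat=stat)  ! stat= lets the program continue on failure

  if (stat == stat_failed_image) then
    ! With stat= present, the new team is formed from the active images only.
    form team (1, survivors, stat=stat)
    change team (survivors, stat=stat)
      total = contribution       ! total is undefined after the failed collective
      call co_sum(total)
      if (this_image() == 1) print *, 'sum over surviving images =', total
    end team (stat=stat)
  else if (this_image() == 1) then
    print *, 'sum over all images =', total
  end if
end program recover_sketch
```

The stat= specifiers are what make recovery possible here: without them, an image that encounters a failed image in a collective or team construct initiates error termination instead of continuing.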