Random Convolutional Coding for Robust and Straggler Resilient Distributed Matrix Computation

Date
2019-01-01
Authors
Das, Anindya
Ramamoorthy, Aditya
Vaswani, Namrata
Organizational Units
Mathematics
Abstract

Distributed matrix computations (matrix-vector and matrix-matrix multiplications) are at the heart of several tasks within the machine learning pipeline. However, distributed clusters are well recognized to suffer from the problem of stragglers (slow or failed nodes). Prior work in this area has presented straggler mitigation strategies based on polynomial evaluation/interpolation. Such approaches, however, suffer from numerical problems (blow-up of round-off errors) owing to the high condition numbers of the corresponding Vandermonde matrices. In this work, we introduce a novel solution approach that relies on embedding distributed matrix computations into the structure of a convolutional code. This simple innovation allows us to develop a provably numerically robust and efficient (fast) solution for distributed matrix-vector and matrix-matrix multiplication.
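
The numerical issue mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the choice of equispaced real evaluation points on [-1, 1], the Gaussian comparison matrix, and the function names `vandermonde_cond` and `gaussian_cond` are all assumptions made for illustration. The sketch only shows how quickly the condition number of a real Vandermonde matrix grows with its size, which is what amplifies round-off error when a polynomial-based scheme decodes.

```python
import numpy as np


def vandermonde_cond(n: int) -> float:
    """2-norm condition number of an n x n real Vandermonde matrix.

    Assumption: n equispaced evaluation points on [-1, 1]. The paper's
    baselines may use different points; this choice is illustrative only.
    """
    points = np.linspace(-1.0, 1.0, n)
    # Row i is [1, x_i, x_i**2, ..., x_i**(n-1)].
    V = np.vander(points, increasing=True)
    return float(np.linalg.cond(V))


def gaussian_cond(n: int, seed: int = 0) -> float:
    """2-norm condition number of an n x n standard Gaussian matrix.

    Assumption: included only as a point of comparison; it is not the
    convolutional-code construction proposed by the authors.
    """
    rng = np.random.default_rng(seed)
    return float(np.linalg.cond(rng.standard_normal((n, n))))


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(f"n={n:2d}  Vandermonde: {vandermonde_cond(n):.3e}  "
              f"Gaussian: {gaussian_cond(n):.3e}")
```

Running the script prints one line per matrix size. Under these assumptions the Vandermonde column should grow by several orders of magnitude between n = 5 and n = 20.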

Description
This is a pre-print of the article Das, Anindya B., Aditya Ramamoorthy, and Namrata Vaswani. "Random Convolutional Coding for Robust and Straggler Resilient Distributed Matrix Computation." arXiv preprint arXiv:1907.08064 (2019). Posted with permission.