Parallel Streaming Frequency-Based Aggregates

Date
2014-01-01
Authors
Tangwongsan, Kanat
Tirthapura, Srikanta
Journal Title
Journal ISSN
Volume Title
Publisher
Altmetrics
Authors
Research Projects
Organizational Units
Computer Science
Organizational Unit
Journal Issue
Series
Abstract

We present efficient parallel streaming algorithms for fundamental frequency-based aggregates in both the sliding window and the infinite window settings. In the sliding window setting, we give a parallel algorithm for maintaining a space-bounded block counter (SBBC). Using SBBC, we derive algorithms for basic counting, frequency estimation, and heavy hitters that perform no more work than their best sequential counterparts. In the infinite window setting, we present algorithms for frequency estimation, heavy hitters, and count-min sketch. For both the infinite window and sliding window settings, our parallel algorithms process a "minibatch" of items using linear work and polylog parallel depth. We also prove a lower bound showing that the work of the parallel algorithm is optimal in the case of heavy hitters and frequency estimation. To our knowledge, these are the first parallel algorithms for these problems that are provably work efficient and have low depth.

Description

This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Tangwongsan, Kanat, Srikanta Tirthapura, and Kun-Lung Wu. "Parallel streaming frequency-based aggregates." In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 236-245. ACM, 2014. DOI:10.1145/2612669.2612695.

Keywords
parallel streaming, stream processing, heavy hitter, basic counting
Citation
DOI