Space‐efficient tracking of persistent items in a massive data stream
dc.contributor.author | Tirthapura, Srikanta | |
dc.contributor.department | Computer Science | |
dc.contributor.department | Electrical and Computer Engineering | |
dc.date | 2018-04-29T11:15:34.000 | |
dc.date.accessioned | 2020-06-30T02:02:34Z | |
dc.date.available | 2020-06-30T02:02:34Z | |
dc.date.copyright | Tue Jan 01 00:00:00 UTC 2013 | |
dc.date.issued | 2014-01-01 | |
dc.description.abstract | Motivated by scenarios in network anomaly detection, we consider the problem of detecting persistent items in a data stream, which are items that occur ‘regularly’ in the stream. In contrast with heavy hitters, persistent items do not necessarily contribute significantly to the volume of a stream, and may escape detection by traditional volume‐based anomaly detectors. We first show that any online algorithm that tracks persistent items exactly must necessarily use a large workspace, and is therefore infeasible to run on a traffic monitoring node. In light of this lower bound, we introduce an approximate formulation of the problem and present a small‐space algorithm to approximately track persistent items over a large data stream. We experimented with three different datasets to see how the accuracy and memory footprint of the algorithm vary with the skewness of the dataset. Our algorithms performed best on the two of the three datasets that had the highest skewness of persistence and the lowest mean persistence. To our knowledge, this is the first systematic study of the problem of detecting persistent items in a data stream, and our work can help detect anomalies that are temporal, rather than volume‐based. | |
dc.description.comments | This is the peer-reviewed version of the following article: Lahiri, Bibudh, Srikanta Tirthapura, and Jaideep Chandrashekar. "Space‐efficient tracking of persistent items in a massive data stream." Statistical Analysis and Data Mining: The ASA Data Science Journal 7, no. 1 (2014): 70-92, which has been published in final form at DOI: 10.1002/sam.11214 (http://dx.doi.org/10.1002/sam.11214). This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving. | |
dc.format.mimetype | application/pdf | |
dc.identifier | archive/lib.dr.iastate.edu/ece_pubs/177/ | |
dc.identifier.articleid | 1177 | |
dc.identifier.contextkey | 12009715 | |
dc.identifier.s3bucket | isulib-bepress-aws-west | |
dc.identifier.submissionpath | ece_pubs/177 | |
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/21002 | |
dc.language.iso | en | |
dc.source.bitstream | archive/lib.dr.iastate.edu/ece_pubs/177/2014_Tirthapura_SpaceEfficient.pdf|||Fri Jan 14 21:27:40 UTC 2022 | |
dc.source.uri | 10.1002/sam.11214 | |
dc.subject.disciplines | Electrical and Computer Engineering | |
dc.subject.disciplines | Systems and Communications | |
dc.subject.keywords | Data streams | |
dc.subject.keywords | persistence | |
dc.subject.keywords | sketches | |
dc.subject.keywords | hash-based filters | |
dc.title | Space‐efficient tracking of persistent items in a massive data stream | |
dc.type | article | |
dc.type.genre | article | |
dspace.entity.type | Publication | |
relation.isAuthorOfPublication | b0235db2-0a72-4dd1-8d5f-08e5e2e2bf7d | |
relation.isOrgUnitOfPublication | f7be4eb9-d1d0-4081-859b-b15cee251456 | |
relation.isOrgUnitOfPublication | a75a044c-d11e-44cd-af4f-dab1d83339ff |
File
Original bundle
- Name: 2014_Tirthapura_SpaceEfficient.pdf
- Size: 532.95 KB
- Format: Adobe Portable Document Format