On accelerating ultra-large-scale mining

Date
2017-01-01
Authors
Upadhyaya, Ganesha
Abstract

Ultra-large-scale mining has been shown to be useful for a number of software engineering tasks, e.g., mining specifications and defect prediction. We propose a new research direction for accelerating ultra-large-scale mining that goes beyond parallelization. Our key idea is to analyze the interaction pattern between the mining task and the artifact, and to cluster artifacts so that running the mining task on one candidate artifact from each cluster is sufficient to produce results for the other artifacts in the same cluster. Our artifact clustering criteria go beyond syntactic, semantic, and functional similarity to mining-task-specific similarity, in which the interaction pattern between the mining task and the artifact is the basis for clustering. Our preliminary evaluation demonstrates that our technique significantly reduces the overall mining time.
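
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the artifact representation (a tuple of statement kinds), the function names, and the toy mining task are all hypothetical and chosen only for illustration. The sketch assumes the task's result depends solely on the parts of an artifact it actually interacts with, so two artifacts with the same interaction signature must yield the same result.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical representation: an artifact is a sequence of statement kinds.
Artifact = Tuple[str, ...]


def interaction_signature(artifact: Artifact, relevant: FrozenSet[str]) -> Artifact:
    """Keep only the statement kinds the mining task actually inspects."""
    return tuple(kind for kind in artifact if kind in relevant)


def mine_with_clustering(
    artifacts: Dict[str, Artifact],
    relevant: FrozenSet[str],
    task: Callable[[Artifact], int],
) -> Tuple[Dict[str, int], int]:
    """Run `task` once per cluster and reuse the result for every member.

    Returns the per-artifact results and the number of task executions.
    """
    # Group artifact names by their task-specific interaction signature.
    clusters: Dict[Artifact, List[str]] = defaultdict(list)
    for name, artifact in artifacts.items():
        clusters[interaction_signature(artifact, relevant)].append(name)

    results: Dict[str, int] = {}
    runs = 0
    for members in clusters.values():
        # Any member can serve as the candidate; take the first one.
        outcome = task(artifacts[members[0]])
        runs += 1
        for name in members:
            results[name] = outcome
    return results, runs


def unclosed_opens(artifact: Artifact) -> int:
    """Toy mining task (assumed): count 'open' statements lacking a 'close'."""
    return artifact.count("open") - artifact.count("close")


ARTIFACTS: Dict[str, Artifact] = {
    "a": ("assign", "open", "call", "close"),
    "b": ("open", "loop", "close", "return"),
    "c": ("open", "assign"),
    "d": ("assign", "return"),
}

RESULTS, RUNS = mine_with_clustering(
    ARTIFACTS, frozenset({"open", "close"}), unclosed_opens
)
```

Here artifacts "a" and "b" differ syntactically but share the signature ("open", "close") with respect to this task, so the task runs three times for four artifacts. The saving in practice would depend on how cheaply the signature can be computed relative to the task itself, which this sketch does not model.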

Type
article
Comments

This is a manuscript of a proceeding published as Upadhyaya, Ganesha, and Hridesh Rajan. "On accelerating ultra-large-scale mining." In Proceedings of the 39th International Conference on Software Engineering: New Ideas and Emerging Results Track, pp. 39-42. IEEE Press, 2017. doi: 10.1109/ICSE-NIER.2017.11. Posted with permission.

Copyright
2017