Analysis and acceleration of data mining algorithms on high performance reconfigurable computing platforms

dc.contributor.advisor Joseph Zambreno
dc.contributor.author Sun, Song
dc.contributor.department Electrical and Computer Engineering
dc.date 2018-08-11T05:22:47.000
dc.date.accessioned 2020-06-30T02:27:55Z
dc.date.available 2020-06-30T02:27:55Z
dc.date.copyright Sat Jan 01 00:00:00 UTC 2011
dc.date.embargo 2013-06-05
dc.date.issued 2011-01-01
dc.description.abstract <p>With the continued development of computation and communication technologies, we are overwhelmed with electronic data. Ubiquitous data in governments, commercial enterprises, universities and various organizations records our decisions, transactions and thoughts. The data collection rate is increasing tremendously, with no end in sight. On one hand, as the volume of data explodes, the gap between human understanding of the data and the knowledge hidden in the data widens. The algorithms and techniques collectively known as data mining have emerged to bridge this gap. Data mining algorithms are usually data-compute intensive. On the other hand, overall computing system performance is not increasing at an equal rate. Consequently, there is a strong need to design special computing systems to accelerate data mining applications. FPGA-based High Performance Reconfigurable Computing (HPRC) systems make it possible to design an optimized hardware architecture for a given problem. The increased gate count, arithmetic capability, and other features of modern FPGAs now allow researchers to implement highly complex reconfigurable computational architectures. In contrast with ASICs, FPGAs have the advantages of low power, low nonrecurring engineering costs, high design flexibility and the ability to update functionality after shipping. In this thesis, we first design architectures for data-intensive and data-compute intensive applications, respectively. Then we present a general HPRC framework for data mining applications:</p> <p>Frequent Pattern Mining (FPM) is a data-compute intensive application that finds commonly occurring itemsets in databases. We use a systolic tree architecture in FPGA hardware to mimic the internal memory layout of the FP-growth algorithm while achieving higher throughput. 
The experimental results demonstrate that the proposed hardware architecture is faster than the software approach.</p> <p>Sparse Matrix-Vector Multiplication (SMVM) is a data-intensive application that serves as an important computational kernel in many applications. We present a scalable and efficient FPGA-based SMVM architecture which can handle arbitrary matrix sizes without preprocessing or zero padding and can be dynamically expanded based on the available I/O bandwidth. The experimental results using a commercial FPGA-based acceleration system demonstrate that our reconfigurable SMVM engine is more efficient than the existing state of the art, with speedups over a highly optimized software implementation of 2.5X to 6.5X, depending on the sparsity of the input benchmark.</p> <p>Text classification accelerated using SMVM is performed on the Convey HC-1 HPRC platform. The SMVM engines are deployed across multiple FPGA chips. Text documents are represented as large sparse matrices using the Vector Space Model (VSM). The k-nearest neighbor algorithm uses SMVM to perform classification simultaneously on multiple FPGAs. Our experiment shows that classification on the Convey HC-1 is several times faster than on a traditional computing architecture.</p> <p>The MapReduce Reconfigurable Framework for Data Mining Applications is a pipelined, high-performance framework for FPGA design based on the MapReduce model. Our goal is to lessen the FPGA programmer's burden while minimizing performance degradation. The designer need only focus on the design of the mapper and reducer modules. We redesigned the SMVM architecture using the MapReduce framework. The manually written VHDL code is only 15 percent of that required by the customized architecture.</p>
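As a rough software illustration of how k-nearest neighbor classification can be phrased as a sparse matrix-vector multiplication over a Vector Space Model matrix, consider the minimal sketch below. It is not the dissertation's FPGA design. The compressed sparse row (CSR) layout, the unnormalized dot-product similarity, the function names (`csr_from_dense`, `smvm`, `knn_classify`) and the toy term counts are all assumptions made for this example.

```python
from collections import Counter


def csr_from_dense(rows):
    """Convert a dense list-of-lists matrix to CSR arrays (assumed storage format)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def smvm(values, col_idx, row_ptr, x):
    """Compute y = A * x, touching only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def knn_classify(doc_matrix, labels, query, k):
    """Score every document against the query with one SMVM, then vote among the top k."""
    scores = smvm(*doc_matrix, query)
    # Highest dot product first; similarity is unnormalized (an assumption of this sketch)
    nearest = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: rows are documents, columns are term counts
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
]
labels = ["hardware", "hardware", "mining", "mining"]
csr = csr_from_dense(docs)

print(knn_classify(csr, labels, [1, 1, 0, 0], k=2))
```

Each row of the matrix is independent of the others in `smvm`, which is the property that lets the work be split across several engines or chips.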
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/10360/
dc.identifier.articleid 1421
dc.identifier.contextkey 2798796
dc.identifier.doi https://doi.org/10.31274/etd-180810-795
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/10360
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/24573
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/10360/Sun_iastate_0097E_12338.pdf|||Fri Jan 14 18:19:22 UTC 2022
dc.subject.disciplines Electrical and Computer Engineering
dc.title Analysis and acceleration of data mining algorithms on high performance reconfigurable computing platforms
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name:
Sun_iastate_0097E_12338.pdf
Size:
1.89 MB
Format:
Adobe Portable Document Format