Identify algorithms from code

dc.contributor.advisor Wei Le
dc.contributor.author Leslie, Stroh
dc.contributor.department Department of Computer Science
dc.date 2020-02-12T22:57:28.000
dc.date.accessioned 2020-06-30T03:20:30Z
dc.date.available 2020-06-30T03:20:30Z
dc.date.copyright Sun Dec 01 00:00:00 UTC 2019
dc.date.embargo 2021-12-03
dc.date.issued 2019-01-01
dc.description.abstract <p>Choosing an algorithm can depend on a variety of factors, such as runtime, space, and problem requirements. Many algorithms already have tested implementations in open source code. Reusing or interchanging algorithms can help save development time and improve the performance of applications.</p> <p>Existing code search techniques often rely heavily on the natural language components of the code. Simple techniques, such as grep, are sensitive to naming choices and conventions in code. In particular, grep does not precisely locate implementations: it outputs single matching lines, does not rank its results, and is subject to a lot of noise.</p> <p>We develop a technique to search for algorithms in code using existing pseudocode as a query. We leverage the structural, mathematical, and natural language components of pseudocode to find its corresponding implementation in code. This approach defines a simple language to represent pseudocode with atoms that capture different features of the algorithm. We then use these features to search code using a bounding box and extract the code snippet that contains the functionality of the pseudocode.</p> <p>We collected 19 repositories in both C and Java and searched for 27 different algorithms. Using our technique, we found over 60 algorithm implementations in roughly 1.8 million lines of code. We also compare our tool against a search implementation built on Apache Solr, a popular enterprise search platform, and show that our approach can find more algorithms with high rank.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/17728/
dc.identifier.articleid 8735
dc.identifier.contextkey 16524976
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/17728
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/31911
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/17728/Leslie_iastate_0097M_18558.pdf|||Fri Jan 14 21:28:11 UTC 2022
dc.subject.disciplines Computer Sciences
dc.subject.keywords Algorithms
dc.subject.keywords Code Search
dc.title Identify algorithms from code
dc.type thesis en_US
dc.type.genre thesis en_US
dspace.entity.type Publication
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
thesis.degree.discipline Computer Science
thesis.degree.level thesis
thesis.degree.name Master of Science
File
Name: Leslie_iastate_0097M_18558.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format