Identify algorithms from code
dc.contributor.advisor | Wei Le | |
dc.contributor.author | Leslie, Stroh | |
dc.contributor.department | Department of Computer Science | |
dc.date | 2020-02-12T22:57:28.000 | |
dc.date.accessioned | 2020-06-30T03:20:30Z | |
dc.date.available | 2020-06-30T03:20:30Z | |
dc.date.copyright | Sun Dec 01 00:00:00 UTC 2019 | |
dc.date.embargo | 2021-12-03 | |
dc.date.issued | 2019-01-01 | |
dc.description.abstract | Choosing an algorithm can depend on a variety of factors such as runtime, space, and problem requirements. Many algorithms already have tested implementations in open-source code. Reusing or interchanging algorithms can help save development time and improve the performance of applications. Existing code search techniques often rely heavily on the natural language components of the code. Simple techniques, such as Grep, are sensitive to the naming choices and conventions in code. Grep in particular does not precisely find implementations, because it outputs single lines; it also does not rank its results and is subject to considerable noise. We develop a technique to search for algorithms in code using existing pseudocode as a query. We leverage the structural, mathematical, and natural language components of pseudocode to find its corresponding implementation in code. This approach defines a simple language to represent pseudocode with atoms that include different features of the algorithm. We then use these features to search code using a bounding box and extract the code snippet that contains the functionality of the pseudocode. We collected 19 different repositories in both C and Java and searched for 27 different algorithms. Using our technique, we found over 60 algorithm implementations in roughly 1.8 million lines of code. We also compare our tool against a search implementation built on Apache Solr, a popular enterprise search platform, and show that our approach can find more algorithms with high rank. | |
dc.format.mimetype | application/pdf | |
dc.identifier | archive/lib.dr.iastate.edu/etd/17728/ | |
dc.identifier.articleid | 8735 | |
dc.identifier.contextkey | 16524976 | |
dc.identifier.s3bucket | isulib-bepress-aws-west | |
dc.identifier.submissionpath | etd/17728 | |
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/31911 | |
dc.language.iso | en | |
dc.source.bitstream | archive/lib.dr.iastate.edu/etd/17728/Leslie_iastate_0097M_18558.pdf|||Fri Jan 14 21:28:11 UTC 2022 | |
dc.subject.disciplines | Computer Sciences | |
dc.subject.keywords | Algorithms | |
dc.subject.keywords | Code Search | |
dc.title | Identify algorithms from code | |
dc.type | thesis | en_US |
dc.type.genre | thesis | en_US |
dspace.entity.type | Publication | |
relation.isOrgUnitOfPublication | f7be4eb9-d1d0-4081-859b-b15cee251456 | |
thesis.degree.discipline | Computer Science | |
thesis.degree.level | thesis | |
thesis.degree.name | Master of Science |
File
- Name: Leslie_iastate_0097M_18558.pdf
- Size: 1.89 MB
- Format: Adobe Portable Document Format