An automated approach to program repair with semantic code search

dc.contributor.advisor Kathryn Stolee Ke, Yalin
dc.contributor.department Computer Science 2018-08-11T15:48:42.000 2020-06-30T02:59:22Z 2020-06-30T02:59:22Z Thu Jan 01 00:00:00 UTC 2015 2016-05-04 2015-01-01
dc.description.abstract <p>Every year software companies dedicate numerous developer hours to debugging and fixing defects. Automated program repair has the potential to greatly decrease the costs of debugging. Existing automated repair techniques, such as Genprog, TSPRepair, and AE, show great promise but are not able to repair all bugs. We propose a new automated program repair technique, SearchRepair, which is a complementary program repair technique. We take advantage of existing open source code to find potential fixes based on the assumption that there are correct implementations in open source project code for some defects. The key challenges lie in efficiently finding code semantically similar (but not identical) to defective code and then appropriately integrating that code into the buggy program. The technique we present, SearchRepair, addresses these challenges by (1) encoding a large database of human-written code fragments as SMT constraints on input-output behavior, (2) localizing a given defect to likely-buggy program fragments, (3) dynamically analyzing those buggy fragments to derive input-output pairs that describe likely buggy behavior and that can be encoded as SMT constraints, (4) using state-of-the-art constraint solvers to find fragments in the code database that satisfy those constraints, and (5) validating patches that repair the bug against program test suites. We evaluate our technique, SearchRepair, on a program repair benchmark set IntroClass, which provides 998 buggy programs written by novice students, two test suites for each program, and repair results for existing program repair technique, Genprog, TSPRepair and AE. The two test suites, of which one is written by a human and the other one is automatically generated by a computer, are used to determine if a program is buggy and to evaluate the quality of a repair. We use instructor test suite to refer the test suite that is written by a human. And we use KLEE test suite to refer the test suite that are generated by the computer. We consider a program as a potential fixable defect if it fails and passes at least one test case in a test suite. Note that extracting input-output behaviors for the semantic code search requires that at least one passed test case so some buggy programs are excluded from our evaluation. There are 778 defects in IntroClass based on the instructor test suite and 845 defects in IntroClass based on the KLEE test suite. We find that when using the instructor test suite, SearchRepair is able to successfully repair 150 of 778 defects, Gengprog is able to fix 287 defects, TSPRepair is able to fix 247 defects, AE is able to fix 159 defects. In total, these 4 techniques are able to fix 310 defects using the instructor test suite and 20 of the 310 defects can only be fixed by SearchRepair. We also find that when using the computer generated test suite, there are 58 unique defects that can only fixed by SearchRepair out of 339 total unique defects that can be fixed by the 4 techniques. These results suggest that SearchRepair is a complementary technique to existing program repair techniques.</p>
dc.format.mimetype application/pdf
dc.identifier archive/
dc.identifier.articleid 5820
dc.identifier.contextkey 8330828
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/14813
dc.language.iso en
dc.source.bitstream archive/|||Fri Jan 14 20:27:08 UTC 2022
dc.subject.disciplines Computer Sciences
dc.subject.keywords Computer Science
dc.subject.keywords automated
dc.subject.keywords program repair
dc.subject.keywords semantic code search
dc.title An automated approach to program repair with semantic code search
dc.type article
dc.type.genre thesis
dspace.entity.type Publication
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456 thesis Master of Science
Original bundle
Now showing 1 - 1 of 1
No Thumbnail Available
522.35 KB
Adobe Portable Document Format