An automated approach to program repair with semantic code search
Is Version Of
Every year software companies dedicate numerous developer hours to debugging and fixing defects. Automated program repair has the potential to greatly decrease the costs of debugging. Existing automated repair techniques, such as Genprog, TSPRepair, and AE, show great promise but are not able to repair all bugs. We propose a new automated program repair technique, SearchRepair, which is a complementary program repair technique. We take advantage of existing open source code to find potential fixes based on the assumption that there are correct implementations in open source project code for some defects. The key challenges lie in efficiently finding code semantically similar (but not identical) to defective code and then appropriately integrating that code into the buggy program. The technique we present, SearchRepair, addresses these challenges by (1) encoding a large database of human-written code fragments as SMT constraints on input-output behavior, (2) localizing a given defect to likely-buggy program fragments, (3) dynamically analyzing those buggy fragments to derive input-output pairs that describe likely buggy behavior and that can be encoded as SMT constraints, (4) using state-of-the-art constraint solvers to find fragments in the code database that satisfy those constraints, and (5) validating patches that repair the bug against program test suites. We evaluate our technique, SearchRepair, on a program repair benchmark set IntroClass, which provides 998 buggy programs written by novice students, two test suites for each program, and repair results for existing program repair technique, Genprog, TSPRepair and AE. The two test suites, of which one is written by a human and the other one is automatically generated by a computer, are used to determine if a program is buggy and to evaluate the quality of a repair. We use instructor test suite to refer the test suite that is written by a human. And we use KLEE test suite to refer the test suite that are generated by the computer. We consider a program as a potential fixable defect if it fails and passes at least one test case in a test suite. Note that extracting input-output behaviors for the semantic code search requires that at least one passed test case so some buggy programs are excluded from our evaluation. There are 778 defects in IntroClass based on the instructor test suite and 845 defects in IntroClass based on the KLEE test suite. We find that when using the instructor test suite, SearchRepair is able to successfully repair 150 of 778 defects, Gengprog is able to fix 287 defects, TSPRepair is able to fix 247 defects, AE is able to fix 159 defects. In total, these 4 techniques are able to fix 310 defects using the instructor test suite and 20 of the 310 defects can only be fixed by SearchRepair. We also find that when using the computer generated test suite, there are 58 unique defects that can only fixed by SearchRepair out of 339 total unique defects that can be fixed by the 4 techniques. These results suggest that SearchRepair is a complementary technique to existing program repair techniques.