Improving software quality with programming patterns

dc.contributor.advisor Tien N. Nguyen
dc.contributor.author Nguyen, Tung
dc.contributor.department Electrical and Computer Engineering 2018-08-11T08:12:00.000 2020-06-30T02:50:41Z 2020-06-30T02:50:41Z Tue Jan 01 00:00:00 UTC 2013 2015-07-30 2013-01-01
dc.description.abstract <p>Software systems and services are increasingly important, affecting and improving the work and lives of billions of people. However, software development is still human-intensive and error-prone. Established studies report that software failures cost the global economy $312 billion annually and that software vendors often spend 50-75% of the total development cost on finding and fixing bugs, i.e., subtle programming errors that cause software failures.</p> <p>People rarely develop software from scratch, but frequently reuse existing software artifacts. In this dissertation, we focus on programming patterns, i.e., frequently occurring code resulting from reuse, and explore their potential for improving software quality. Specifically, we develop techniques for recovering programming patterns and using them to find, fix, and prevent bugs more effectively.</p> <p>This dissertation has two main contributions. One is the Graph-based Object Usage Model (GROUM), a graph-based representation of source code. A GROUM abstracts a fragment of code as a graph representing its object usages. In a GROUM, nodes correspond to function calls and control structures, while edges capture the control and data relationships between them. Based on GROUM, we developed a graph mining technique that can recover programming patterns of API usage and use them to detect bugs. GROUM is also used to find similar bugs and recommend similar bug fixes.</p> <p>The other main contribution of this dissertation is SLAMC, a Statistical Semantic LAnguage Model for Source Code. SLAMC represents code as sequences of code elements of different roles, e.g., data types, variables, or functions, and annotates those elements with sememes, text-based annotations of their semantic information. 
SLAMC models the regularities over the sememe sequences using code-based factors such as local code context, global concerns, and pairwise associations, and thus implicitly captures programming idioms and patterns as sequences with high probabilities. Based on SLAMC, we developed a technique for recommending the most likely next code sequences, which could improve programming productivity and might reduce the odds of programming errors.</p> <p>Empirical evaluation shows that our approaches can detect meaningful programming patterns and anomalies that might cause bugs or maintenance issues, and thus could improve software quality. In addition, our models have been successfully used for several other problems, from library adaptation and code migration to bug fix generation. They also have several other potential applications, which we will explore in future work.</p>
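The idea of treating frequent code sequences as high-probability patterns and recommending the likely next element can be illustrated with a much simpler model than SLAMC. The sketch below is not the dissertation's method: it is a plain bigram count model over hypothetical API-call tokens, with no sememes, global concerns, or pairwise associations. The function names (train_bigram_counts, recommend_next) and the toy corpus are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sequences):
    """Count how often each token is immediately followed by each other token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def recommend_next(counts, prev, k=3):
    """Return up to k (token, probability) pairs for what follows prev, most likely first."""
    followers = counts.get(prev)
    if not followers:
        return []  # token never seen as a predecessor in the training data
    total = sum(followers.values())
    return [(tok, n / total) for tok, n in followers.most_common(k)]


# Hypothetical token sequences standing in for code mined from a corpus.
corpus = [
    ["FileReader.new", "BufferedReader.new",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new",
     "BufferedReader.readLine", "BufferedReader.readLine",
     "BufferedReader.close"],
    ["FileReader.new", "FileReader.read", "FileReader.close"],
]

counts = train_bigram_counts(corpus)
suggestions = recommend_next(counts, "FileReader.new")
for token, prob in suggestions:
    print(f"{token}: {prob:.2f}")
```

In this toy corpus, a call that follows "FileReader.new" in most sequences ranks first, which mirrors in a very reduced form how a statistical model surfaces common usage. A token with unusually low probability in its context could likewise be flagged as a possible anomaly, although the dissertation's own detection techniques are considerably richer than this.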
dc.format.mimetype application/pdf
dc.identifier archive/
dc.identifier.articleid 4583
dc.identifier.contextkey 5050416
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/13576
dc.language.iso en
dc.source.bitstream archive/|||Fri Jan 14 19:55:48 UTC 2022
dc.subject.disciplines Computer Engineering
dc.subject.keywords data mining
dc.subject.keywords graph
dc.subject.keywords machine learning
dc.subject.keywords programming pattern
dc.subject.keywords software quality
dc.subject.keywords statistical model
dc.title Improving software quality with programming patterns
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication a75a044c-d11e-44cd-af4f-dab1d83339ff dissertation Doctor of Philosophy