Usage and refactoring studies of python regular expressions

dc.contributor.advisor Kathryn T. Stolee
dc.contributor.author Chapman, Carl
dc.contributor.department Computer Science
dc.date 2018-08-11T10:53:43.000
dc.date.accessioned 2020-06-30T03:01:43Z
dc.date.available 2020-06-30T03:01:43Z
dc.date.copyright Fri Jan 01 00:00:00 UTC 2016
dc.date.embargo 2001-01-01
dc.date.issued 2016-01-01
dc.description.abstract Though regular expressions provide a powerful search technique that is built into every major language, is incorporated into a myriad of essential tools, and has been a fundamental aspect of Computer Science since the 1960s, no one has ever formally studied how they are used in practice, or how to apply refactoring principles to improve understandability and conformance to community standards. This thesis presents the original work of studying a sample of regexes taken from Python projects mined from GitHub, determining which features are used most often, defining categories that illuminate common use cases, and identifying areas of significance for language and tool designers. Furthermore, this thesis defines an equivalence class model used to explore comprehension of regexes, identifies the most common and most understandable representations of semantically identical regexes, and suggests several refactorings and preferred representations. Opportunities for future work include the novel and rich field of regex refactoring, semantic search of regexes, and further fundamental research into regex usage and understandability.
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/15139/
dc.identifier.articleid 6146
dc.identifier.contextkey 8928982
dc.identifier.doi https://doi.org/10.31274/etd-180810-4743
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/15139
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/29323
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/15139/Chapman_iastate_0097M_15725.pdf|||Fri Jan 14 20:36:27 UTC 2022
dc.subject.disciplines Computer Sciences
dc.subject.keywords Computer Science
dc.subject.keywords features
dc.subject.keywords Mining
dc.subject.keywords Python
dc.subject.keywords refactoring
dc.subject.keywords regex
dc.subject.keywords regular expressions
dc.title Usage and refactoring studies of python regular expressions
dc.type article
dc.type.genre thesis
dspace.entity.type Publication
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
thesis.degree.discipline Computer Science
thesis.degree.level thesis
thesis.degree.name Master of Science
File
Name: Chapman_iastate_0097M_15725.pdf
Size: 1.37 MB
Format: Adobe Portable Document Format