Semantics-based approach for generating partial views from linked life-cycle highway project data

dc.contributor.advisor H. David Jeong
dc.contributor.author Le, Tuyen
dc.contributor.department Civil, Construction, and Environmental Engineering
dc.date 2018-08-11T07:56:04.000
dc.date.accessioned 2020-06-30T03:04:48Z
dc.date.available 2020-06-30T03:04:48Z
dc.date.copyright Sun Jan 01 00:00:00 UTC 2017
dc.date.embargo 2001-01-01
dc.date.issued 2017-01-01
dc.description.abstract <p>The purpose of this dissertation is to develop methods that can assist data integration and extraction from heterogeneous sources generated throughout the life-cycle of a highway project. In the era of computerized technologies, project data are largely available in digital format. Due to the fragmented nature of the civil infrastructure sector, digital data are created and managed separately by different project actors in proprietary data warehouses. The differences in data structure and semantics greatly hinder the exchange and full reuse of digital project data. To address those issues, this dissertation carries out the following three individual studies.</p> <p>The first study aims to develop a framework for interconnecting heterogeneous life-cycle project data into a unified and linked data space. This is an ontology-based framework that consists of two phases: (1) translating proprietary datasets into homogeneous RDF data graphs; and (2) connecting separate data networks to each other. Three domain ontologies for the design, construction, and asset condition survey phases are developed to support data transformation. A merged ontology that integrates the domain ontologies is constructed to provide guidance on how to connect data nodes from domain graphs.</p> <p>The second study deals with the terminology inconsistency between data sources. An automated method is developed that employs Natural Language Processing (NLP) and machine learning techniques to support constructing a domain-specific lexicon from design manuals. The method uses pattern rules to extract technical terms from texts and learns their representation vectors using a neural-network-based word embedding approach. 
The study also includes the development of an integrated method of minimally supervised machine learning, clustering analysis, and word vectors for computing the term semantics and classifying the relations between terms in the target lexicon.</p> <p>In the last study, a data retrieval technique for extracting subsets of an XML civil data schema is designed and tested. The algorithm takes a keyword input from the end user and returns a ranked list of the most relevant XML branches. This study uses the highway-domain lexicon generated in the second study to analyze the semantics of the end user's keywords. A context-based similarity measure is introduced to evaluate the relevance between a given branch in the source schema and the user query.</p> <p>The methods and algorithms resulting from this research were tested using case studies and empirical experiments.</p> <p>The results indicate that the study successfully addresses the heterogeneity in the structure and terminology of data and enables fast extraction of sub-models of data. The study is expected to enhance the efficiency of reusing digital data generated throughout the project life-cycle, and to contribute to a successful transition from paper-based to digital project delivery for civil infrastructure projects.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/15559/
dc.identifier.articleid 6566
dc.identifier.contextkey 11057982
dc.identifier.doi https://doi.org/10.31274/etd-180810-5176
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/15559
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/29742
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/15559/Le_iastate_0097E_16588.pdf|||Fri Jan 14 20:42:59 UTC 2022
dc.subject.disciplines Civil Engineering
dc.subject.keywords Civil Information Model
dc.subject.keywords Civil Infrastructure
dc.subject.keywords Data Extraction
dc.subject.keywords Data Integration
dc.subject.keywords Natural Language Processing
dc.subject.keywords Semantic Search
dc.title Semantics-based approach for generating partial views from linked life-cycle highway project data
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
thesis.degree.discipline Civil Engineering
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name:
Le_iastate_0097E_16588.pdf
Size:
996.07 KB
Format:
Adobe Portable Document Format