Data integration for biological network databases: MetNetDB labeled graph model and graph matching algorithm

dc.contributor.advisor Eve S. Wurtele
dc.contributor.advisor Leslie Miller
dc.contributor.author Li, Jie
dc.contributor.department Genetics, Development and Cell Biology
dc.date 2018-08-11T17:31:19.000
dc.date.accessioned 2020-06-30T02:31:58Z
dc.date.available 2020-06-30T02:31:58Z
dc.date.copyright Tue Jan 01 00:00:00 UTC 2008
dc.date.embargo 2013-06-05
dc.date.issued 2008-01-01
dc.description.abstract <p>Understanding the cellular functions of genes requires investigating a variety of biological data, including experimental data, annotations from online databases and the literature, information about cellular interactions, and domain knowledge from biologists. These requirements demand a flexible and powerful biological data management system. MetNetDB is the biological database component of the MetNet platform (http://metnetdb.org/), a software platform for Arabidopsis systems biology. This work describes a labeled graph model that addresses the challenges associated with biological network databases, and discusses the implementation of this model in MetNetDB.</p> <p>MetNetDB integrates the most recent data from various sources, including biological networks, gene annotation, metabolite information, and protein localization data. The integration comprises four steps: data model transformation and integration; semantic mapping; data conversion and integration; and conflict resolution. MetNetDB is modeled as a labeled graph. The graph structure supports network data storage and the application of graph analysis algorithms. The node and edge labels have the same extension capability as an object data model. In addition, rules are used to guarantee the integrity of biological network data, and operations are defined for graph editing and comparison.</p> <p>To facilitate the integration of network data, which is often inaccurate or incomplete, a subgraph extraction algorithm is designed for MetNetDB. This algorithm allows subgraph querying based on user-specified biomolecules. Both exact matching and approximate matching with biomolecules in networks are supported. The similarity among biomolecules is inferred from expression patterns, gene ontology, chemical ontology, and protein-gene relationships. Combined with the implementation of Messmer's approximate subgraph isomorphism algorithm, MetNetDB supports exact and approximate graph matching.</p> <p>Based on the MetNetDB labeled graph model and the graph matching algorithms, the MetNetDB curator tool is built with several innovative features, including active biological rule checking during network curation, tracking of data change history, and a biologist-friendly visual graph query system.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/10921/
dc.identifier.articleid 1920
dc.identifier.contextkey 2807118
dc.identifier.doi https://doi.org/10.31274/etd-180810-4324
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/10921
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/25127
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/10921/Li_iastate_0097E_10058.pdf|||Fri Jan 14 18:31:08 UTC 2022
dc.subject.disciplines Cell and Developmental Biology
dc.subject.disciplines Genetics and Genomics
dc.subject.keywords Arabidopsis
dc.subject.keywords biological data integration
dc.subject.keywords biological network database
dc.subject.keywords graph query
dc.subject.keywords MetNetDB
dc.subject.keywords subgraph isomorphism
dc.title Data integration for biological network databases: MetNetDB labeled graph model and graph matching algorithm
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication 9e603b30-6443-4b8e-aff5-57de4a7e4cb2
thesis.degree.discipline Bioinformatics and Computational Biology
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Name: Li_iastate_0097E_10058.pdf
Size: 3.25 MB
Format: Adobe Portable Document Format