PathBinder: a sentence repository of biochemical interactions extracted from MEDLINE
dc.contributor.author | Ding, Jing | |
dc.contributor.department | Department of Electrical and Computer Engineering | |
dc.date | 2020-08-05T18:36:45.000 | |
dc.date.accessioned | 2021-02-26T08:41:45Z | |
dc.date.available | 2021-02-26T08:41:45Z | |
dc.date.copyright | Wed Jan 01 00:00:00 UTC 2003 | |
dc.date.issued | 2003-01-01 | |
dc.description.abstract | MEDLINE is a fast-growing online database of scientific literature covering the life sciences, medicine, and health care. It offers attractive opportunities for automatic information extraction, such as extracting networks of protein interactions, and for helping researchers who need to sift efficiently through the literature to find work relating to small sets of biochemicals of interest. PathBinder is a software system that extracts sentences containing potential biochemical interactions from the annual baseline distribution of the MEDLINE database. An interaction between two biochemicals is assumed if they co-occur in a single sentence. MEDLINE abstracts were split into single sentences, and each sentence was scanned against a dictionary containing more than 80,000 entries (>40,000 biochemicals and their aliases) for at least two different biochemicals. The dictionary was constructed automatically by extracting names and synonyms of protein and non-protein biochemicals from four databases. The extracted sentences are organized in a repository, about 11 GB in size, and are easily retrievable through a two-level index system based on two biochemical names. The performance of PathBinder in terms of information extraction metrics (e.g. precision and recall) was evaluated using a sample MEDLINE file. Sentence parsing had a precision of 99.6% and a recall of 99.5%. Biochemical labeling had a precision of 80.5% and a recall of 57.3%. | |
dc.format.mimetype | application/pdf | |
dc.identifier | archive/lib.dr.iastate.edu/rtd/19945/ | |
dc.identifier.articleid | 20944 | |
dc.identifier.contextkey | 18780054 | |
dc.identifier.doi | https://doi.org/10.31274/rtd-20200803-167 | |
dc.identifier.s3bucket | isulib-bepress-aws-west | |
dc.identifier.submissionpath | rtd/19945 | |
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/97312 | |
dc.language.iso | en | |
dc.source.bitstream | archive/lib.dr.iastate.edu/rtd/19945/Ding_ISU_2003_D56.pdf|||Fri Jan 14 22:01:19 UTC 2022 | |
dc.subject.keywords | Electrical and computer engineering | |
dc.subject.keywords | Computer engineering | |
dc.title | PathBinder: a sentence repository of biochemical interactions extracted from MEDLINE | |
dc.type | thesis | en_US |
dc.type.genre | thesis | en_US |
dspace.entity.type | Publication | |
relation.isOrgUnitOfPublication | a75a044c-d11e-44cd-af4f-dab1d83339ff | |
thesis.degree.discipline | Computer Engineering | |
thesis.degree.level | thesis | |
thesis.degree.name | Master of Science |
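The pipeline summarized in the abstract (sentence splitting, dictionary-based labeling, co-occurrence of at least two different biochemicals, and retrieval through a two-level index keyed on two names) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy `ALIASES` dictionary, the naive regex sentence splitter, and the function names `split_sentences`, `label_biochemicals`, `build_index`, and `lookup` are all assumptions introduced here for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical toy dictionary: lowercase alias -> canonical biochemical name.
# The real dictionary described in the abstract has more than 80,000 entries.
ALIASES = {
    "insulin": "insulin",
    "glucose": "glucose",
    "dextrose": "glucose",
    "atp": "ATP",
    "adenosine triphosphate": "ATP",
}

# Try longer aliases first so multi-word names are matched as a whole.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def split_sentences(abstract):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def label_biochemicals(sentence):
    """Return the set of canonical names whose aliases occur in the sentence."""
    return {ALIASES[m.group(1).lower()] for m in _PATTERN.finditer(sentence)}


def build_index(abstracts):
    """Build index[name_a][name_b] -> sentences, with name_a sorted before name_b."""
    index = defaultdict(lambda: defaultdict(list))
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            # Aliases collapse to one canonical name, so a pair is formed
            # only when two *different* biochemicals co-occur.
            names = sorted(label_biochemicals(sentence))
            for a, b in combinations(names, 2):
                index[a][b].append(sentence)
    return index


def lookup(index, x, y):
    """Retrieve sentences for a pair of names, in either argument order."""
    a, b = sorted((x, y))
    return index.get(a, {}).get(b, [])
```

Sorting each pair before it is stored means a query for two names reaches the same bucket regardless of the order in which they are given, which is one simple way to realize an index "based on two biochemical names".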
File
Original bundle
- Name: Ding_ISU_2003_D56.pdf
- Size: 789.62 KB
- Format: Adobe Portable Document Format