Learning information extraction patterns

dc.contributor.author Chen, Fajun
dc.contributor.department Computer Science
dc.date 2020-11-22T06:42:05.000
dc.date.accessioned 2021-02-26T09:03:41Z
dc.date.available 2021-02-26T09:03:41Z
dc.date.copyright Sat Jan 01 00:00:00 UTC 2000
dc.date.issued 2000-01-01
dc.description.abstract The rapid growth of online text calls for systems that can extract relevant information. Many information extraction systems have been developed using the knowledge engineering approach, which is often time-consuming and laborious and yields systems with little portability. A more promising direction is to apply machine learning techniques to information extraction. A complete Information Extraction (IE) system, IEPlus, has been developed for exploring various design issues. IEPlus defines fine-grained semantic units and proposes a strategy for semantic resolution. It implements an enhancement for rule evaluation based on case frame matching. It also presents a rule firing strategy that prioritizes the most specific rule, measured by the number of terms matched. Experiments on the Rental Ads domain demonstrated the effectiveness of the IEPlus system. Owing to its object-oriented design, IEPlus is highly flexible and can be used to explore various issues in information extraction system design.
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/rtd/21116/
dc.identifier.articleid 22115
dc.identifier.contextkey 20252223
dc.identifier.doi https://doi.org/10.31274/rtd-20201118-80
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath rtd/21116
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/98483
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/rtd/21116/Chen_ISU_2000_C545.pdf|||Fri Jan 14 22:34:58 UTC 2022
dc.subject.keywords Computer science
dc.title Learning information extraction patterns
dc.type article
dc.type.genre thesis
dspace.entity.type Publication
relation.isOrgUnitOfPublication f7be4eb9-d1d0-4081-859b-b15cee251456
thesis.degree.discipline Computer Science
thesis.degree.level thesis
thesis.degree.name Master of Science
File
Original bundle
Name: Chen_ISU_2000_C545.pdf
Size: 1.17 MB
Format: Adobe Portable Document Format