Learning information extraction patterns

Date
2000-01-01
Authors
Chen, Fajun
Organizational Units
Computer Science
Department
Computer Science
Abstract

The rapid growth of online text calls for systems that can extract relevant information. Many information extraction systems have been developed using the knowledge engineering approach, which is often time-consuming, laborious, and not portable. A more promising direction is to apply machine learning techniques to information extraction. A complete Information Extraction (IE) system, IEPlus, has been developed to explore various design issues. IEPlus defines fine-grained semantic units and proposes a strategy for semantic resolution. It implements an enhancement for rule evaluation based on case frame matching. It also presents a rule firing strategy that prioritizes the most specific rule, measured by the number of terms matched. Experiments on the Rental Ads domain demonstrated the effectiveness of the IEPlus system. IEPlus is highly flexible owing to its object-oriented design and can be used to explore various issues in information extraction system design.
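
The rule firing strategy mentioned in the abstract, in which the most specific rule is preferred according to the number of terms matched, can be illustrated with a minimal sketch. This is not the IEPlus implementation, which the abstract does not detail. The `Rule` class, the `fire_most_specific` function, the whitespace tokenization, and the example rules are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical extraction rule: applies when every term occurs in the text."""
    name: str
    slot: str
    terms: Tuple[str, ...]


def fire_most_specific(rules: Sequence[Rule], text: str) -> Optional[Rule]:
    """Return the applicable rule that matches the most terms, or None.

    Assumption: a rule is applicable only if all of its terms appear in the
    text, so the number of terms matched equals the length of its term list.
    Ties are broken in favor of the rule listed first.
    """
    tokens = set(text.lower().split())
    applicable = [r for r in rules if all(t in tokens for t in r.terms)]
    if not applicable:
        return None
    return max(applicable, key=lambda r: len(r.terms))


# Illustrative rules for a rental-ad style sentence (invented for this sketch).
rules = [
    Rule("general", "bedrooms", ("bedroom",)),
    Rule("specific", "bedrooms", ("2", "bedroom", "apartment")),
]

chosen = fire_most_specific(rules, "Spacious 2 bedroom apartment near campus")
print(chosen.name if chosen else None)
```

Both rules apply to the example sentence, and the three-term rule is selected over the one-term rule because it matches more terms. A real system would likely match against case frames rather than a bag of tokens.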
