Finding and extracting topic-specific information from free-text sources is an important task for classifying and distinguishing the content of information systems. Condensing information in this way, so that non-relevant parts of the text can be ignored, also benefits the subsequent machine processing and evaluation of topic-specific documents. State-of-the-art approaches typically rely on well-trained modern Natural Language Processing (NLP) methods to solve such tasks. However, use cases arise in which no suitable training data sets are available to adequately prepare or fine-tune the NLP methods used. In this paper, we detail a model-driven approach that applies an XML data model to an application-specific scenario and combines different NLP methods into a dynamic, automated NLP pipeline. The goal of this pipeline is the automatic extraction of specific information (related to certain domains or topics) from text documents, allowing structured further processing of this information. Specifically, we consider a scenario in which such information has to be aligned to a given information model that defines, e.g., the terms relevant for further processing. The solution approaches described here address a scenario in which information clusters on a specific topic can be obtained from a given data set, even without domain-specific model training. The basis is a dynamic (i.e., using different NLP methods and models) and fully automatic (i.e., using different topics at the same time) pipeline architecture combined with an XML data model. The presented approach details and extends our earlier work and gives new qualitative and first quantitative results.
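To make the idea of an XML-driven, training-free pipeline more concrete, the following minimal Python sketch may help. It is not the authors' implementation: the XML element names (`informationModel`, `topic`, `term`), the example topics, and the simple keyword matcher are all assumptions chosen for illustration. The sketch only shows how an information model could define topics and terms, how interchangeable matching methods could be plugged in, and how several topics could be processed in one pass to yield topic-specific clusters.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical information model: element names and topics are
# illustrative assumptions, not taken from the paper.
MODEL_XML = """
<informationModel>
  <topic name="weather">
    <term>storm</term>
    <term>rain</term>
  </topic>
  <topic name="transport">
    <term>train</term>
    <term>road</term>
  </topic>
</informationModel>
"""

# A matcher decides whether a sentence is relevant for a list of terms.
Matcher = Callable[[str, List[str]], bool]


def load_topics(xml_text: str) -> Dict[str, List[str]]:
    """Read topic names and their lower-cased terms from the XML model."""
    root = ET.fromstring(xml_text.strip())
    return {
        topic.get("name"): [t.text.strip().lower() for t in topic.findall("term")]
        for topic in root.findall("topic")
    }


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_matcher(sentence: str, terms: List[str]) -> bool:
    """Training-free stand-in for an NLP method: exact whole-word match."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return any(term in words for term in terms)


def run_pipeline(
    text: str, topics: Dict[str, List[str]], matchers: List[Matcher]
) -> ET.Element:
    """Build one <cluster> per topic holding the sentences any matcher accepts."""
    result = ET.Element("extraction")
    sentences = split_sentences(text)
    for name, terms in topics.items():
        cluster = ET.SubElement(result, "cluster", topic=name)
        for sentence in sentences:
            if any(matcher(sentence, terms) for matcher in matchers):
                ET.SubElement(cluster, "sentence").text = sentence
    return result


document = "A storm hit the coast. The train was delayed. We had lunch."
extraction = run_pipeline(document, load_topics(MODEL_XML), [keyword_matcher])
print(ET.tostring(extraction, encoding="unicode"))
```

In this sketch the list of matchers stands in for the "dynamic" aspect (methods can be swapped or combined), and the loop over all topics in the model stands in for the "fully automatic" aspect. A real pipeline would presumably replace the keyword matcher with stronger pretrained NLP components.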