For unsupervised cross-domain named entity recognition (NER), texts from different domains have different features and contain a large number of domain-specific vocabularies, so some words specific to the target domain are rarely seen in the source domain or carry different meanings there. To address these problems, we are the first to propose embedding a hierarchical vector representation into the multi-cell compositional LSTM-CRF model: sentence vectors are added on top of the character-word vectors to form a character-word-sentence hierarchical representation. Because different words contribute differently to a sentence, the model constructs sentence vectors with a label attention mechanism, so that the sentence vectors can draw on more comprehensive information to infer the features of domain-specific vocabularies and reduce their interference with the model's understanding. The multi-cell compositional LSTM encodes the various entity types and uses the relationship between words and entities to transfer cross-domain knowledge from the word-sequence level to the entity-sequence level. Finally, a CRF layer refines the boundaries of the label sequence to produce the final result. In addition to the main NER task, the model uses a language modeling (LM) auxiliary task to learn the domain features of the target domain. Experimental results show that the F1 score of the proposed model improves considerably on different cross-domain datasets.
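As a minimal sketch of the label-attention idea described above (not the authors' implementation), the following PyTorch snippet builds a sentence vector from word-level hidden states using one learnable query per entity label and concatenates it with each character-word representation; all class and parameter names here are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LabelAttentionSentenceVector(nn.Module):
    """Hypothetical label-attention layer: one learnable query per entity label."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))
        self.scale = hidden_dim ** 0.5

    def forward(self, word_states: torch.Tensor) -> torch.Tensor:
        # word_states: (batch, seq_len, hidden_dim)
        # Attention score of every label query against every word state.
        scores = torch.einsum("ld,bsd->bls", self.label_queries, word_states) / self.scale
        weights = scores.softmax(dim=-1)                       # (batch, labels, seq_len)
        label_views = torch.einsum("bls,bsd->bld", weights, word_states)
        # Pool the per-label views into a single sentence vector.
        return label_views.mean(dim=1)                         # (batch, hidden_dim)


# Usage: append the sentence vector to every character-word representation
# to form the character-word-sentence hierarchical input (dimensions assumed).
batch, seq_len, hidden = 2, 10, 128
char_word = torch.randn(batch, seq_len, hidden)
sent_vec = LabelAttentionSentenceVector(hidden, num_labels=9)(char_word)
hierarchical = torch.cat(
    [char_word, sent_vec.unsqueeze(1).expand(-1, seq_len, -1)], dim=-1
)
print(hierarchical.shape)  # torch.Size([2, 10, 256])
```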
The Transformer has strong feature-extraction ability and has achieved good performance on various NLP tasks such as sentence classification, machine translation, and reading comprehension, but it does not perform as well on named entity recognition (NER). According to recent research, the Long Short-Term Memory (LSTM) network usually outperforms the Transformer on NER. LSTM is a variant of the Recurrent Neural Network (RNN); thanks to its natural chain structure, it learns the forward and backward dependencies between words well, which makes it well suited to processing text sequences. In this paper, a BiLSTM network is embedded into the Transformer encoder, and a new network structure, BiLSTM-IN-TRANS, is proposed, which combines the sequential feature-extraction capability of BiLSTM with the powerful global feature-extraction capability of the Transformer encoder. Experiments show that a model based on BiLSTM-IN-TRANS works better on NER than LSTM or the Transformer alone.
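One plausible way to realize such a hybrid block, sketched below under our own assumptions (the paper's exact BiLSTM-IN-TRANS wiring may differ), is to keep the multi-head self-attention sub-layer for global context and use a BiLSTM sub-layer in place of the position-wise feed-forward network for local sequential context; class names and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn


class BiLSTMInTransLayer(nn.Module):
    """Hypothetical encoder block: self-attention followed by a BiLSTM sub-layer."""

    def __init__(self, d_model: int, nhead: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        # Bidirectional LSTM whose concatenated output matches d_model.
        self.bilstm = nn.LSTM(d_model, d_model // 2, bidirectional=True, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global context via multi-head self-attention, with residual connection.
        attn_out, _ = self.self_attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Local sequential context via BiLSTM, also with residual connection.
        lstm_out, _ = self.bilstm(x)
        return self.norm2(x + lstm_out)


# Usage: stack such layers over token embeddings before an NER output layer.
layer = BiLSTMInTransLayer(d_model=256)
tokens = torch.randn(4, 20, 256)   # (batch, seq_len, d_model)
print(layer(tokens).shape)         # torch.Size([4, 20, 256])
```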