Cross-domain text classification has broad application prospects in data mining. Because transfer learning lets a target domain share and reuse semantic information from existing source domains, it is commonly used for cross-domain text processing. On this basis, we propose MTrA, a cross-domain text classification algorithm. MTrA builds on TrAdaBoost and takes into account the distribution difference between the source domain and the target domain: it uses the Maximum Mean Discrepancy (MMD) to set the initial weight parameters of the two domains. MTrA also adds a weight backfill factor that accounts for the classification accuracy on the source domain and balances the weight update of the source-domain data. In experiments on the 20 Newsgroups dataset, MTrA improves classification accuracy by 9.4% on average compared with the traditional TrAdaBoost algorithm, which demonstrates the effectiveness and advantages of the algorithm.
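To make the MMD quantity concrete, the following is a minimal sketch of a squared MMD estimate between source-domain and target-domain feature vectors. It assumes a linear kernel, under which the squared MMD reduces to the squared Euclidean distance between the two domain means. The kernel choice and the function names `mean_vector` and `linear_mmd_squared` are illustrative assumptions, not taken from MTrA, and the sketch does not show how MTrA maps the MMD value to initial weights.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(samples: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(x[j] for x in samples) / n for j in range(dim)]


def linear_mmd_squared(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Squared MMD under a linear kernel (an assumption of this sketch).

    With k(x, y) = <x, y>, the squared MMD equals the squared Euclidean
    distance between the source mean and the target mean.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Hypothetical toy feature vectors (e.g. term weights) for two domains.
source_docs = [[1.0, 0.0], [3.0, 0.0]]
target_docs = [[0.0, 0.0], [0.0, 2.0]]
mmd2 = linear_mmd_squared(source_docs, target_docs)
```

A larger value indicates a larger gap between the two domain distributions as seen through their means. A nonlinear kernel such as a Gaussian kernel would also capture higher-order differences, at the cost of pairwise kernel evaluations.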