Language style, including word choice and syntactic structure, is a pivotal factor in the formality of a text. Text formalization is a sub-problem of text style transfer that aims to transform everyday expressions into an academic style. SeqGAN bypasses the non-differentiability problem in backpropagation caused by the discrete nature of tokens, and it pioneered the application of GANs to text generation. GANs, however, have not yet been applied to formality style transfer, which makes the task worth exploring. We apply the Monte Carlo idea from SeqGAN to text formalization; that is, we use sampling to estimate the state-action value. In academic writing, the choice of key words affects the quality of the entire sentence. To address this, we propose Masked SeqGAN. Its architecture is similar to that of SeqGAN, with one difference: after the complete sentence is generated, a <mask> tag is placed at the current position, and the discriminator scores the masked sentence and the original sentence separately. The difference between the two scores indicates the contribution of the word at that position to the entire sentence. Words with high contributions are treated as important words, and this contribution is used to update the policy. Experiments show that Masked SeqGAN outperforms previous GAN-based methods in both automatic and manual scoring.
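The masked scoring step can be illustrated with a minimal sketch. It is not the model's actual implementation: `toy_discriminator`, `FORMAL_WORDS`, and `mask_contributions` are hypothetical names introduced here, the discriminator is replaced by a trivial word-list scorer, and the sketch assumes that the mask tag replaces the token at the current position. How the contribution enters the policy update is not shown.

```python
from typing import Callable, List, Sequence

MASK = "<mask>"  # mask tag; assumed here to replace the token at the current position

# Hypothetical stand-in for a trained discriminator: it scores a sentence by the
# fraction of tokens found in a tiny "formal" lexicon. A real discriminator
# would be a learned classifier.
FORMAL_WORDS = {"demonstrate", "therefore", "results"}


def toy_discriminator(tokens: Sequence[str]) -> float:
    if not tokens:
        return 0.0
    return sum(t in FORMAL_WORDS for t in tokens) / len(tokens)


def mask_contributions(
    tokens: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> List[float]:
    """Return, for each position, score(original) - score(masked sentence)."""
    base = score(tokens)
    contributions = []
    for i in range(len(tokens)):
        masked = list(tokens)
        masked[i] = MASK  # mask only the current position
        contributions.append(base - score(masked))
    return contributions


sentence = ["the", "results", "demonstrate", "this"]
contribs = mask_contributions(sentence, toy_discriminator)
for token, c in zip(sentence, contribs):
    print(f"{token}: {c:.2f}")
```

In this toy setting, masking a word from the lexicon lowers the score, so that word receives a positive contribution, while masking a neutral word leaves the score unchanged.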