PGN-LM Model and Forcing-Seq2Seq Model: Multiple automatic models of title generation for natural text using Deep Learning

To Thanh Nhan, Nguyen Thi Hiep Thuan, Quan Thanh Tho

Abstract


In the current era, the amount of information on the Internet in general, and in the electronic press in particular, has grown rapidly and carries great value in all aspects of life. Many users have posted high-quality writing as casual blogs, notes, or reviews, and some of these posts are even selected by editors for publication in professional venues. However, the original posts often come without titles, which must then be added manually by editorial teams. With recent advances in AI techniques, especially deep learning, this task could be done automatically. Even though automatic title generation can be considered a specific case of text summarization, it poses some significantly different requirements: a title is generally short, yet it needs to capture the main content while maintaining the writing style of the original document. To meet these constraints, we introduce the PGN-LM Model, an architecture that evolved from the Pointer Generator Network; it can handle the Out-of-Vocabulary problem that traditional Seq2Seq models cannot, and it is combined with language modeling techniques. In addition, we introduce the Forcing-Seq2Seq Model, an enhanced Seq2Seq architecture in which classical TF-IDF scores are combined with a Named Entity Recognition method to identify the major keywords of the original text. To enforce the appearance of those keywords in the generated titles, a specific Teacher Forcing mechanism combined with the language model technique is employed. We have tested our approaches on real datasets and obtained promising initial results under both automatic metrics and human evaluation.
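As a rough illustration of the keyword-identification idea mentioned in the abstract, the sketch below ranks the words of a document by a smoothed TF-IDF score and multiplies the score of any word that belongs to a named entity. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how TF-IDF and Named Entity Recognition are combined, so the multiplicative `entity_boost`, the function names, and the toy corpus are all hypothetical, and the entity list is assumed to come from a separate NER tool.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(doc, corpus, entities=(), entity_boost=2.0, top_k=3):
    """Rank the words of `doc` by TF-IDF, boosting named-entity words.

    `entities` is assumed to be produced by an external NER step;
    `entity_boost` is a hypothetical weighting, not taken from the paper.
    """
    corpus_token_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus_token_sets)
    tokens = tokenize(doc)
    term_counts = Counter(tokens)
    entity_words = {w for e in entities for w in tokenize(e)}

    scores = {}
    for word, count in term_counts.items():
        # Document frequency: number of corpus documents containing the word.
        df = sum(1 for s in corpus_token_sets if word in s)
        # Smoothed inverse document frequency (avoids division by zero).
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        score = (count / len(tokens)) * idf
        if word in entity_words:
            score *= entity_boost
        scores[word] = score

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Toy corpus (made up for illustration only).
corpus = [
    "the match ended in a draw in hanoi",
    "the city council approved the budget",
    "hanoi hosts the football final tonight, hanoi fans celebrate",
]
keywords = tfidf_keywords(corpus[2], corpus, entities=["Hanoi"])
print(keywords)
```

In a title-generation setting, the resulting keyword list would be the set of words whose appearance in the output is encouraged during decoding; how that enforcement is done is described in the full paper rather than in this abstract.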






DOI: http://dx.doi.org/10.21553/rev-jec.285

Copyright (c) 2022 REV Journal on Electronics and Communications


Copyright © 2011-2022
Radio and Electronics Association of Vietnam
All rights reserved