Authors
Tomohito Ouchi*, Masayoshi Tabuse
Graduate School of Life and Environmental Sciences, Kyoto Prefectural University,
1-5 Shimogamohangi-cho, Sakyo-ku, Kyoto 606-8522, Japan
*Corresponding author. Email: [email protected]
Corresponding Author
Tomohito Ouchi
Received 19 October 2020, Accepted 1 May 2021, Available Online 23 July
2021.
DOI
https://doi.org/10.2991/jrnal.k.210713.003
Keywords
Automatic summarization; data augmentation; pointer–generator model; extractive
summarization
Abstract
In this research, we propose a data augmentation method that uses a topic
model for the Pointer–Generator model. The method adds important sentences to
an article to form an extended article. Furthermore, we compare the proposed
method with data augmentation methods using Easy Data Augmentation (EDA),
LexRank and Luhn. EDA consists of synonym replacement, random insertion,
random swap, and random deletion. LexRank is based on Google's search method
(PageRank), and Luhn defines sentence features and ranks sentences by them.
We examine which method is most suitable for data augmentation and confirm
that the most accurate model is the one trained with the topic-model-based
data augmentation method.
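To make the four EDA operations named above concrete, the following is a
minimal sketch in Python. It is not the authors' implementation: the function
names, the toy SYNONYMS table, and the parameters n and p are assumptions made
for illustration only. A real setup would draw synonyms from a thesaurus
rather than a hand-written dictionary.

```python
import random

# Hypothetical toy synonym table (an assumption for this sketch only).
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "joyful"]}


def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in SYNONYMS."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    """Insert a synonym of a randomly chosen known word at a random position, n times."""
    out = list(words)
    known = [w for w in out if w in SYNONYMS]
    for _ in range(n):
        if not known:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(words, n, rng):
    """Swap the words at two distinct random positions, n times."""
    out = list(words)
    for _ in range(n):
        if len(out) < 2:
            break
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]
```

Each function takes a list of tokens and a random.Random instance, so a fixed
seed gives reproducible augmented sentences. For example,
random_swap("the quick dog is happy".split(), 1, random.Random(0)) returns the
same five words with two positions exchanged.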
Copyright
© 2021 The Authors. Published by ALife Robotics Corp. Ltd.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license
(http://creativecommons.org/licenses/by-nc/4.0/).