Authors
Jun Kimura, Yasunari Yoshitomi, Masayoshi Tabuse
Corresponding Author
Jun Kimura
Available Online 1 December 2015.
DOI
https://doi.org/10.2991/jrnal.2015.2.3.10
Keywords
Clustering, Document classification, Extraction of representative document,
Frequency of nouns.
Abstract
We developed a method for classifying Japanese documents and ranking
representative documents using the characteristics of noun frequencies.
A representative document is defined as the document whose feature
vector is closest to the center of gravity of its class in the feature
vector space, among all documents belonging to that class. Representative
documents are ranked in descending order of the number of documents
belonging to their class.
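
The definition above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the documents have already been classified and converted to numeric feature vectors (for example, noun-frequency counts), and it assumes Euclidean distance, which the abstract does not specify. The function name rank_representatives and its arguments are hypothetical.

```python
import numpy as np

def rank_representatives(vectors, labels):
    """Return (class label, representative document index, class size),
    ordered from the largest class to the smallest.

    Assumptions (not stated in the abstract): feature vectors are given,
    class labels are given, and distance is Euclidean.
    """
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    results = []
    for label in np.unique(labels):
        # Indices of the documents that belong to this class.
        members = np.flatnonzero(labels == label)
        # Center of gravity of the class in feature vector space.
        centroid = vectors[members].mean(axis=0)
        # The representative is the member closest to the center of gravity.
        distances = np.linalg.norm(vectors[members] - centroid, axis=1)
        representative = int(members[np.argmin(distances)])
        results.append((label.item(), representative, int(members.size)))
    # Rank representatives in descending order of class size.
    results.sort(key=lambda r: r[2], reverse=True)
    return results
```

Each returned tuple names a class, the index of its representative document, and the class size, so the first entry corresponds to the top-ranked representative. If two documents are equally close to the center of gravity, this sketch keeps the one with the lower index.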
Copyright
© 2013, the Authors. Published by ALife Robotics Corp. Ltd
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).