Using bipartite graphs projected onto two dimensions for text classification

ELENI, ROZAKI and STEPHEN, REDMOND (2017) Using bipartite graphs projected onto two dimensions for text classification. In: Fifth International Conference on Advances in Computing, Communication and Information Technology - CCIT 2017, 02-03 September, 2017, Zurich, Switzerland.

20170922_041739.pdf - Published Version

Official URL: https://www.seekdl.org/conferences/paper/details/9...

Abstract

In our Big Data world, the amount of text being gathered is ever expanding. For many years, data curators have sought ways to group these documents and identify common topics. As the size of the problem increases, solutions that scale are needed. The purpose of this work is to present a novel text classifier that can be used for text-mining and interactive information access. The model that is demonstrated can be used to extract hierarchical relations between topics, as well as to conduct unsupervised clustering of documents and keywords. The approach taken with this model is the use of graph-of-words key term extraction and a dimensional projection of the bipartite graph of documents and key terms. This projection makes it possible for terms to be co-clustered efficiently in relation to their documents, and for documents to be co-clustered in relation to their terms. Furthermore, the key term extraction process that is outlined can be scaled to a large corpus using a distributed processing system such as Apache Spark, and the resultant model can be visually interacted with by users.
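
The first two steps named in the abstract, graph-of-words key term extraction and the bipartite graph of documents and key terms, can be illustrated with a minimal sketch. It is not the authors' implementation. The sliding-window size, the degree-based ranking rule, the function names, and the toy documents are all assumptions made for illustration, since the abstract does not specify them. The two-dimensional projection and the Apache Spark scaling are not sketched here, because the abstract does not say how they are computed.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph over one document.

    Each token is linked to the tokens that follow it inside a sliding
    window (window size is an assumed parameter).
    """
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def key_terms(tokens, top_k=3, window=3):
    """Rank terms by their degree in the graph-of-words.

    Degree ranking is an assumed, simple scoring rule; ties are broken
    alphabetically so the result is deterministic.
    """
    graph = graph_of_words(tokens, window)
    ranked = sorted(graph, key=lambda t: (-len(graph[t]), t))
    return ranked[:top_k]


def bipartite_edges(docs, top_k=3, window=3):
    """Return (doc_id, term) edges linking each document to its key terms."""
    return [
        (doc_id, term)
        for doc_id, tokens in docs.items()
        for term in key_terms(tokens, top_k, window)
    ]


# Hypothetical toy corpus, already tokenised and stripped of stop words.
docs = {
    "d1": "spark scales graph processing spark scales text mining".split(),
    "d2": "graph clustering groups text documents graph structure".split(),
}
edges = bipartite_edges(docs)
print(edges)
```

Each edge joins a document node to a key term node, so documents that share key terms become neighbours at distance two in the bipartite graph, which is the structure the abstract's projection would then place in two dimensions.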

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: text mining, classification, clustering, bipartite graph, Apache Spark
Depositing User: Mr. John Steve
Date Deposited: 11 Mar 2019 08:31
Last Modified: 11 Mar 2019 08:31
URI: http://publications.theired.org/id/eprint/391
