
Text document topical recursive clustering and automatic labeling of a hierarchy of document clusters

Full Text: pakdd13-2.pdf (PDF)

The overwhelming number of text documents available today highlights the need for information organization and discovery. Organizing documents into a hierarchy of topics and subtopics makes it easier for users to browse them. This paper borrows community mining from social network analysis to generate a hierarchy of topically coherent document clusters, and it focuses on giving those clusters descriptive labels. We propose using the betweenness centrality measure in networks of co-occurring terms to label the clusters. We also incorporate keyphrase extraction and automatic titling into cluster labeling. The results show that the labeling method that uses KEA to extract keyphrases from the documents generates the best labels overall compared with the other methods and baselines.
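The idea of labeling a cluster by betweenness centrality in a term co-occurrence network can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how the network is built, so the sketch assumes that two terms co-occur when they appear in the same document, that edges are unweighted, and that the top-k terms by betweenness serve as the label. The function names `cooccurrence_graph`, `betweenness`, and `label_cluster` are hypothetical, and the centrality is computed with Brandes' algorithm for unweighted graphs.

```python
from collections import defaultdict, deque
from itertools import combinations


def cooccurrence_graph(docs):
    """Undirected graph: terms are nodes; an edge links terms sharing a document.

    Assumption: whitespace tokenization and document-level co-occurrence.
    """
    graph = defaultdict(set)
    for doc in docs:
        terms = set(doc.lower().split())
        for term in terms:
            graph[term]  # make sure single-term documents still add a node
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {v: c / 2 for v, c in score.items()}


def label_cluster(docs, k=2):
    """Return the k terms with the highest betweenness (ties broken alphabetically)."""
    scores = betweenness(cooccurrence_graph(docs))
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy cluster whose term network is the path graph - mining - text - labels.
print(label_cluster(["graph mining", "mining text", "text labels"]))
# ['mining', 'text']
```

In the toy cluster, "mining" and "text" bridge the other terms, so they receive the highest scores. A realistic pipeline would also need stop-word removal and some form of edge weighting or thresholding, which this sketch omits.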

Citation

J. Chen, O. Zaiane. "Text document topical recursive clustering and automatic labeling of a hierarchy of document clusters". Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp. 197-208, April 2013.

Keywords: Text Mining and Web Mining, Cluster Labeling, Document Clustering
Category: In Conference
Web Links: Springer

BibTeX

@incollection{Chen+Zaiane:PAKDD13,
  author = {Jiyang Chen and Osmar R. Zaiane},
  title = {Text document topical recursive clustering and automatic labeling of
    a hierarchy of document clusters},
  pages = {197--208},
  booktitle = {Pacific-Asia Conference on Knowledge Discovery and Data Mining
    (PAKDD)},
  year = 2013,
}

Last Updated: January 13, 2020
Submitted by Sabina P
