Web community identification from random walks
- Jiayuan Huang, University of Waterloo
- Tingshao Zhu
- Dale Schuurmans, AICML
We propose a technique for identifying latent Web communities based solely on the hyperlink structure of the WWW, via random walks. Although the topology of the Directed Web Graph encodes important information about the content of individual Web pages, it also reveals useful meta-level information about user communities. Random walk models are capable of propagating local link information throughout the Web Graph, which can be used to reveal hidden global relationships between different regions of the graph. Variations of these random walk models are shown to be effective at identifying latent Web communities and revealing link topology. To efficiently extract these communities from the stationary distribution defined by a random walk, we exploit a computationally efficient form of directed spectral clustering. The performance of our approach is evaluated in real Web applications, where the method is shown to effectively identify latent Web communities based on link topology only.
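The pipeline the abstract describes (random walk, stationary distribution, directed spectral clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a teleporting random walk in the style of PageRank and a two-way cut based on the symmetrized operator commonly used for directed graphs. The function names, the `teleport` value of 0.15, and the toy 8-page graph are all hypothetical choices made for illustration.

```python
import numpy as np

def transition_matrix(adj, teleport=0.15):
    """Row-stochastic matrix of a teleporting random walk (assumed model)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Follow out-links uniformly; pages with no out-links jump uniformly.
    walk = np.where(out_deg > 0, adj / np.maximum(out_deg, 1.0), 1.0 / n)
    # Teleportation makes the chain irreducible, so the stationary
    # distribution is unique and strictly positive.
    return (1.0 - teleport) * walk + teleport / n

def stationary_distribution(P, tol=1e-12, max_iter=10000):
    """Power iteration for pi satisfying pi = pi P."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

def two_way_partition(adj, teleport=0.15):
    """Split the nodes of a directed graph into two groups by sign."""
    P = transition_matrix(adj, teleport)
    s = np.sqrt(stationary_distribution(P))
    # Theta = (Pi^{1/2} P Pi^{-1/2} + Pi^{-1/2} P^T Pi^{1/2}) / 2
    M = (s[:, None] * P) / s[None, :]
    theta = 0.5 * (M + M.T)
    # eigh returns eigenvalues in ascending order; the largest is 1 with
    # eigenvector sqrt(pi), and the second largest carries the cut.
    _, vecs = np.linalg.eigh(theta)
    return (vecs[:, -2] / s) >= 0

# Toy graph (hypothetical): two groups of four fully interlinked pages,
# joined only by the links 3 -> 4 and 7 -> 0.
adj = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = 1.0
adj[7, 0] = 1.0

labels = two_way_partition(adj)
print(labels)  # one boolean per page; which group is True is arbitrary
```

On this toy graph the sign of the second eigenvector separates pages 0 to 3 from pages 4 to 7. More than two communities would need either recursive splitting or clustering on several eigenvectors, which the sketch does not attempt.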
Citation
J. Huang, T. Zhu, D. Schuurmans. "Web community identification from random walks". European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Berlin, Germany, January 2006.
Keywords: machine learning
Category: In Conference
BibTeX
@incollection{Huang+al:PKDD06,
  author = {Jiayuan Huang and Tingshao Zhu and Dale Schuurmans},
  title = {Web community identification from random walks},
  booktitle = {European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)},
  year = 2006,
}
Last Updated: June 06, 2007
Submitted by Nelson Loyola