Session Boundary Detection for Association Rule Learning Using n-Gram Language Models
- Xiangji Huang, School of Computer Science, University of Waterloo
- Fuchun Peng, Department of Computer Science, University of Massachusetts at Amherst
- Aijun An, Department of Computer Science, York University
- Dale Schuurmans, AICML
- Nick Cercone, School of Computer Science, University of Waterloo
We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data. The identified sessions are then used for association rule learning. Unlike the traditional ad hoc timeout method, which uses fixed time thresholds for session identification, our method uses an information theoretic approach that provides a natural technique for performing dynamic session identification. The effectiveness of our approach is evaluated with respect to 4 different interestingness measures. We find that we obtain a significant improvement in each interestingness measure, ranging from a 26.6% to 39% improvement on average over the best results obtained with standard timeout methods.
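The abstract does not spell out the boundary criterion, so the following is only an illustrative sketch of how an n-gram model could drive dynamic session splitting. It assumes a bigram model with add-one smoothing and a fixed surprisal threshold; the function names (`train_bigram`, `surprisal`, `split_sessions`) and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count bigram contexts and bigrams over training sequences of event labels."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            contexts[prev] += 1          # how often `prev` is followed by anything
            bigrams[(prev, cur)] += 1    # how often `prev` is followed by `cur`
    return contexts, bigrams, vocab

def surprisal(model, prev, cur):
    """Return -log2 P(cur | prev) in bits, with add-one smoothing (an assumption)."""
    contexts, bigrams, vocab = model
    p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
    return -math.log2(p)

def split_sessions(model, events, threshold):
    """Start a new session whenever a transition is more surprising than `threshold` bits."""
    if not events:
        return []
    sessions = [[events[0]]]
    for prev, cur in zip(events, events[1:]):
        if surprisal(model, prev, cur) > threshold:
            sessions.append([cur])       # unlikely transition: treat as a boundary
        else:
            sessions[-1].append(cur)     # likely transition: same session continues
    return sessions

# Toy usage with made-up event labels (not Livelink data).
model = train_bigram([["login", "browse", "open", "logout"]] * 3)
log = ["login", "browse", "open", "logout", "login", "browse", "open", "logout"]
print(split_sessions(model, log, threshold=1.5))
```

In this toy run the only transition never seen in training is `logout` followed by `login`, so it is the only place a boundary is placed. Unlike a fixed timeout, the split depends on how predictable the next event is, not on the time elapsed between events.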
Citation
X. Huang, F. Peng, A. An, D. Schuurmans, N. Cercone. "Session Boundary Detection for Association Rule Learning Using n-Gram Language Models". Canadian Conference on Artificial Intelligence (CAI), Halifax, Nova Scotia, Canada, January 2003.
Keywords: web usage mining, language modeling, evaluation, machine learning
Category: In Conference
BibTeX
@incollection{Huang+al:CAI03,
  author = {Xiangji Huang and Fuchun Peng and Aijun An and Dale Schuurmans and Nick Cercone},
  title = {Session Boundary Detection for Association Rule Learning Using n-Gram Language Models},
  booktitle = {Canadian Conference on Artificial Intelligence (CAI)},
  year = 2003,
}
Last Updated: June 01, 2007
Submitted by Staurt H. Johnson