
Language independent authorship attribution using character level language models

Full Text: eacl03.ps.ps PS

We present a method for computer-assisted authorship attribution based on character-level n-gram language models. Our approach is based on simple information theoretic principles, and achieves improved performance across a variety of languages without requiring extensive pre-processing or feature selection. To demonstrate the effectiveness and language independence of our approach, we present experimental results on Greek, English, and Chinese data. We show that our approach achieves state of the art performance in each of these cases. In particular, we obtain a 20% accuracy improvement over the best published results for a Greek data set, while using a far simpler technique than previous investigations.
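
The general idea of attributing text with character-level n-gram models can be illustrated with a minimal sketch. This is not the authors' implementation: the class name CharNGramModel, the function attribute, the alphabet_size parameter, the use of add-one smoothing, and the choice of cross-entropy as the scoring criterion are all assumptions made for illustration. One model is trained per candidate author, and an unseen text is assigned to the author whose model gives it the lowest average number of bits per character.

```python
import math
from collections import Counter


class CharNGramModel:
    """Character n-gram model with add-one smoothing (an illustrative assumption)."""

    def __init__(self, text, n=3, alphabet_size=256):
        self.n = n
        # A fixed, shared alphabet size keeps scores comparable across models.
        self.alphabet_size = alphabet_size
        padded = " " * (n - 1) + text
        positions = range(len(padded) - n + 1)
        # Counts of each n-gram and of each (n-1)-character context.
        self.ngrams = Counter(padded[i:i + n] for i in positions)
        self.contexts = Counter(padded[i:i + n - 1] for i in positions)

    def cross_entropy(self, text):
        """Average bits per character the model needs to encode `text`."""
        if not text:
            raise ValueError("text must be non-empty")
        padded = " " * (self.n - 1) + text
        total_bits = 0.0
        count = 0
        for i in range(len(padded) - self.n + 1):
            gram = padded[i:i + self.n]
            # Add-one smoothed conditional probability of the last character.
            prob = (self.ngrams[gram] + 1) / (
                self.contexts[gram[:-1]] + self.alphabet_size
            )
            total_bits -= math.log2(prob)
            count += 1
        return total_bits / count


def attribute(models, text):
    """Return the author whose model yields the lowest cross-entropy on `text`."""
    return min(models, key=lambda author: models[author].cross_entropy(text))


# Toy training data, invented purely for this example.
models = {
    "author_a": CharNGramModel("the quick brown fox jumps over the lazy dog " * 5),
    "author_b": CharNGramModel("zzz yyy xxx www zzz yyy xxx " * 5),
}
print(attribute(models, "the lazy fox"))
```

Because the model operates on raw characters, the same code applies unchanged to any language that can be represented as a Python string, with no tokenization or feature selection step. A realistic system would need far more training text per author and a more careful smoothing scheme than add-one.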

Citation

F. Peng, D. Schuurmans, S. Wang. "Language independent authorship attribution using character level language models". EACL, April 2003.

Keywords: machine learning
Category: In Conference

BibTeX

@incollection{Peng+al:EACL03,
  author = {Fuchun Peng and Dale Schuurmans and Shaojun Wang},
  title = {Language independent authorship attribution using character level
    language models},
  booktitle = {EACL},
  year = 2003,
}

Last Updated: June 01, 2007
Submitted by Staurt H. Johnson
