
Improved estimation for unsupervised part-of-speech tagging

Qin Iris Wang, Department of Computing Science, University of Alberta
Dale Schuurmans, AICML

We demonstrate that a simple hidden Markov model can achieve state-of-the-art performance in unsupervised part-of-speech tagging by improving aspects of standard Baum-Welch (EM) estimation. One improvement uses word similarities to smooth the lexical tag->word probability estimates, which avoids over-fitting the lexical model. Another improvement constrains the model to preserve a specified marginal distribution over the hidden tags, which avoids over-fitting the tag->tag transition model. Although using more contextual information than an HMM remains desirable, improving basic estimation still leads to significant improvements and remains a prerequisite for training more complex models.
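The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not give formulas, so the interpolation weight `lam`, the similarity matrix `sim`, and the use of iterative proportional fitting to match the tag marginal are all assumptions chosen for illustration. The functions `smooth_emissions` and `fit_tag_marginal` are hypothetical names. Both take expected counts of the kind an EM E-step would produce.

```python
import numpy as np


def smooth_emissions(counts, sim, lam=0.5):
    """Smooth tag->word estimates with word similarities (illustrative only).

    counts: (n_tags, n_words) expected tag/word counts from an E-step.
    sim:    (n_words, n_words) nonnegative word similarities (assumed given).
    lam:    assumed interpolation weight between raw and smoothed counts.
    """
    # Row-normalise so each word spreads one unit of mass over similar words.
    sim = sim / sim.sum(axis=1, keepdims=True)
    mixed = (1.0 - lam) * counts + lam * (counts @ sim)
    # Normalise each tag's row into a distribution over words.
    return mixed / mixed.sum(axis=1, keepdims=True)


def fit_tag_marginal(joint, target, iters=200):
    """Rescale tag-pair counts so the tag marginal matches `target`.

    Uses iterative proportional fitting, a stand-in technique assumed here.
    joint:  (n_tags, n_tags) strictly positive expected tag-pair counts.
    target: (n_tags,) desired marginal distribution over tags.
    Returns a row-stochastic tag->tag transition matrix.
    """
    j = joint / joint.sum()
    for _ in range(iters):
        j = j * (target / j.sum(axis=1))[:, None]  # match row marginals
        j = j * (target / j.sum(axis=0))[None, :]  # match column marginals
    return j / j.sum(axis=1, keepdims=True)


# Toy usage: 2 tags, 3 words; word 1 is unseen with tag 0 but similar to word 0.
counts = np.array([[4.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
sim = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]])
emissions = smooth_emissions(counts, sim)

joint = np.array([[5.0, 1.0], [2.0, 2.0]])
target = np.array([0.5, 0.5])
transitions = fit_tag_marginal(joint, target)
```

In the toy example, the unseen word receives nonzero probability under tag 0 because it is similar to a word that was observed with that tag, and the fitted transition matrix leaves the specified tag distribution unchanged after one step.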

Citation

Q. Wang, D. Schuurmans. "Improved estimation for unsupervised part-of-speech tagging". IEEE, January 2005.

Keywords:	machine learning
Category:	In Conference

BibTeX

@incollection{Wang+Schuurmans:IEEE05,
  author = {Qin Iris Wang and Dale Schuurmans},
  title = {Improved estimation for unsupervised part-of-speech tagging},
  booktitle = {},
  year = 2005,
}

Last Updated: March 13, 2007
Submitted by AICML Admin Assistant
