Latent Maximum Entropy Approach for Semantic N-Gram Language Modeling
- Shaojun Wang, Department of Computing Science
- Dale Schuurmans, AICML
- Fuchun Peng, Department of Computer Science, University of Massachusetts at Amherst
In this paper, we describe a unified probabilistic framework for statistical language modeling, the latent maximum entropy principle, which can effectively incorporate various aspects of natural language, such as local word interaction, syntactic structure and semantic document information. Unlike previous work on maximum entropy methods for language modeling, which only allow explicit features to be modeled, our framework also allows relationships over hidden features to be captured, resulting in a more expressive language model. We describe efficient algorithms for marginalization, inference and normalization in our extended models. We then present promising experimental results for our approach on the Wall Street Journal corpus.
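To illustrate the latent maximum entropy idea described in the abstract, here is a minimal sketch on a toy discrete problem: a log-linear model over an observed "word" and a hidden "topic", fit by an EM-style fixed point in which hidden-feature expectations are imputed from the model's own posterior. The 2-word/2-topic setup, the indicator features, and the step size are illustrative assumptions, not the paper's actual features, data, or algorithm.

```python
# Toy sketch of a latent maximum entropy fit (illustrative assumptions
# throughout; not the paper's actual model or training procedure).
import math

X = [0, 1]  # observable "words"
Z = [0, 1]  # hidden "topics"

def features(x, z):
    # one indicator feature per (word, topic) pair
    return [1.0 if (x, z) == (xi, zi) else 0.0 for xi in X for zi in Z]

def joint(lam):
    # p(x, z) proportional to exp(lam . f(x, z)), normalized over all pairs
    w = {(x, z): math.exp(sum(l * f for l, f in zip(lam, features(x, z))))
         for x in X for z in Z}
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}

def marginal(p):
    return {x: sum(p[(x, z)] for z in Z) for x in X}

# empirical distribution over the observable variable only (made up)
emp = {0: 0.7, 1: 0.3}

lam = [0.0] * 4
for _ in range(200):  # EM-style fixed-point iteration
    p = joint(lam)
    m = marginal(p)
    # E-step: impute hidden-feature expectations with the posterior p(z|x)
    target = [0.0] * 4
    for x in X:
        for z in Z:
            post = p[(x, z)] / m[x]
            fv = features(x, z)
            for i in range(4):
                target[i] += emp[x] * post * fv[i]
    # M-step: gradient step moving model expectations toward the targets
    model = [sum(p[(x, z)] * features(x, z)[i] for x in X for z in Z)
             for i in range(4)]
    lam = [l + 1.0 * (t - e) for l, t, e in zip(lam, target, model)]

fitted = marginal(joint(lam))
print(fitted)  # marginal over words approaches the empirical 0.7 / 0.3
```

At the fixed point, the model's observable marginal matches the empirical distribution while the hidden-topic expectations are self-consistent with the model's posterior, which is the constraint structure the latent maximum entropy principle imposes.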
Citation
S. Wang, D. Schuurmans, F. Peng. "Latent Maximum Entropy Approach for Semantic N-Gram Language Modeling". International Workshop on Artificial Intelligence and Statistics (AISTATS), January 2003.
Keywords: language modeling, machine learning
Category: In Conference
BibTeX
@incollection{Wang+al:AISTATS03,
  author    = {Shaojun Wang and Dale Schuurmans and Fuchun Peng},
  title     = {Latent Maximum Entropy Approach for Semantic N-Gram Language Modeling},
  booktitle = {International Workshop on Artificial Intelligence and Statistics (AISTATS)},
  year      = 2003,
}
Last Updated: June 01, 2007
Submitted by Stuart H. Johnson