
Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model

Full Text: ICGI-06.pdf

In this paper, we present a directed Markov random field model that integrates trigram models, structural language models (SLM), and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The SLM is essentially a generalization of shift-reduce probabilistic push-down automata and is thus more complex and powerful than probabilistic context-free grammars (PCFGs). The added context sensitivity due to the trigram and PLSA components, together with the violation of tree structure in the topology of the underlying random field model, makes the inference and parameter estimation problems seemingly intractable. However, an analysis of the behavior of the lexical and semantic enhanced structural language model leads to a generalized inside-outside algorithm and thus to rigorous exact EM-type re-estimation of the composite language model parameters.

Citation

S. Wang, S. Wang, L. Cheng, R. Greiner, D. Schuurmans. "Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model". International Colloquium on Grammatical Inference (ICGI), Chofu, Tokyo, Japan, pp. 97-111, September 2006.

Keywords: language modeling, structural language model, trigram, machine learning, probabilistic latent semantic analysis
Category: In Conference

BibTeX

@incollection{Wang+al:ICGI06,
  author = {Shaojun Wang and Shaomin Wang and Li Cheng and Russ Greiner and
    Dale Schuurmans},
  title = {Stochastic Analysis of Lexical and Semantic Enhanced Structural
    Language Model},
  pages = {97--111},
  booktitle = {International Colloquium on Grammatical Inference (ICGI)},
  year = 2006,
}

Last Updated: October 13, 2013
Submitted by Russ Greiner
