Native Language Identification Using Probabilistic Graphical Models

Garrett Nicolai
Md Asadul Islam
Russ Greiner, Dept of Computing Science; PI of AICML

Native Language Identification (NLI) is the task of identifying the native language of the author of a text written in a second language. Support Vector Machines (SVMs) and Maximum Entropy learners are the most common methods used to solve this problem, but we consider it from the point of view of probabilistic graphical models. We hypothesize that graphical models are well-suited to this task, as they can capture feature inter-dependencies that cannot be exploited by SVMs. Using progressively more connected graphical models, we show that these models outperform SVMs on reduced feature sets. Furthermore, on full feature sets, even naïve Bayes improves on SVMs, raising accuracy from 82.06% to 83.41% on a 5-language classification task.
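
As a rough illustration of the simplest graphical model named in the abstract, the sketch below implements a multinomial naïve Bayes classifier with add-one smoothing in plain Python. It is not the authors' implementation: this record does not describe the paper's feature sets or model structures, so the token features, the labels "A" and "B", the toy training data, and the function names `train_naive_bayes` and `predict` are all hypothetical. Naïve Bayes treats features as conditionally independent given the native-language label; more connected graphical models relax that assumption by adding dependencies between features.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (tokens of a second-language text, native-language label).
# The labels "A" and "B" and the tokens are illustrative, not from the paper.
TOY_DOCS = [
    (["i", "am", "agree", "with", "this"], "A"),
    (["i", "am", "agree", "that"], "A"),
    (["informations", "are", "useful"], "B"),
    (["the", "informations", "are", "many"], "B"),
]


def train_naive_bayes(docs):
    """Count documents per label and token occurrences per label."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        feature_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, feature_counts, vocab


def predict(model, tokens):
    """Return the label maximizing log P(label) + sum of log P(token | label)."""
    label_counts, feature_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_tokens = sum(feature_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            smoothed = (feature_counts[label][tok] + 1) / (total_tokens + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_DOCS)
print(predict(model, ["am", "agree"]))
print(predict(model, ["informations", "are"]))
```

Each token contributes its own independent term to the score, which is exactly the independence assumption that a more connected model would replace with conditional terms over groups of features.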

Citation

G. Nicolai, M. Islam, R. Greiner. "Native Language Identification Using Probabilistic Graphical Models". International Conference on Electrical Information and Communication Technology, pp n/a, February 2014.

Keywords:	PGM, NLU, machine learning
Category:	In Conference

BibTeX

@incollection{Nicolai+al:EICT14,
  author = {Garrett Nicolai and Md Asadul Islam and Russ Greiner},
  title = {Native Language Identification Using Probabilistic Graphical Models},
  pages = {n/a},
  booktitle = {International Conference on Electrical Information and Communication Technology},
  year = 2014,
}

Last Updated: February 12, 2020
Submitted by Sabina P
