
Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition

Full Text: W13-3510.pdf

In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSMs for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.
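The following is a minimal sketch, not the paper's implementation, of the general idea described in the abstract: word vectors are built from topic-based statistics (word-document counts) and type-based statistics (word-dependency counts), the two representations are concatenated into a combined VSM, and an adjective-noun phrase is composed by vector addition as one common baseline. All counts, dependency contexts, and the PPMI weighting and additive composition shown here are illustrative assumptions.

```python
# Hypothetical sketch of a combined topic+type VSM with additive composition.
import numpy as np

# Toy vocabulary with made-up co-occurrence counts.
vocab = ["red", "car", "fast"]

# Topic-based statistics: rows are words, columns are documents.
doc_counts = np.array([
    [3, 0, 1],   # "red"
    [2, 4, 0],   # "car"
    [0, 1, 5],   # "fast"
], dtype=float)

# Type-based statistics: rows are words, columns are dependency contexts
# (e.g., "amod:car", "nsubj:drive"); values are again purely illustrative.
dep_counts = np.array([
    [4, 1],      # "red"
    [0, 3],      # "car"
    [2, 2],      # "fast"
], dtype=float)

def ppmi(counts):
    """Positive pointwise mutual information weighting of a count matrix."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((counts * total) / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)

# Combined VSM: concatenate the topic- and type-based representations.
combined = np.hstack([ppmi(doc_counts), ppmi(dep_counts)])
word_vec = dict(zip(vocab, combined))

# Compose the adjective-noun phrase "red car" by addition (a simple baseline,
# not necessarily the composition method evaluated in the paper).
phrase_vec = word_vec["red"] + word_vec["car"]
print(phrase_vec)
```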

Citation

A. Fyshe, P. Talukdar, B. Murphy, T. Mitchell. "Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition". International Conference on Computational Natural Language Learning, Sofia, Bulgaria, pp 84–93, August 2013.

Keywords:  
Category: In Conference
Web Links: ACL

BibTeX

@inproceedings{Fyshe+al:CoNLL13,
  author    = {Alona Fyshe and Partha Talukdar and Brian Murphy and Tom M. Mitchell},
  title     = {Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition},
  booktitle = {International Conference on Computational Natural Language Learning},
  pages     = {84--93},
  year      = {2013},
}

Last Updated: June 22, 2020
Submitted by Sabina P
