
Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation

Full Text: N19-1411.pdf

The variational autoencoder (VAE) imposes a probabilistic distribution (typically Gaussian) on the latent space and penalizes the Kullback-Leibler (KL) divergence between the posterior and the prior. In NLP, VAEs are extremely difficult to train due to the problem of the KL term collapsing to zero. One has to apply heuristics such as KL weight annealing and word dropout in a carefully engineered manner to successfully train a VAE for text. In this paper, we propose to use the Wasserstein autoencoder (WAE) for probabilistic sentence generation, where the encoder can be either stochastic or deterministic. We show theoretically and empirically that, in the original WAE, the stochastically encoded Gaussian distribution tends to become a Dirac-delta function, and we propose a variant of WAE that encourages the stochasticity of the encoder. Experimental results show that the latent space learned by the WAE exhibits properties of continuity and smoothness as in VAEs, while simultaneously achieving much higher BLEU scores for sentence reconstruction.
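As a rough sketch of the contrast described in the abstract, assuming standard notation (an encoder q_phi, a decoder, and a prior p(z); these symbols are illustrative and not taken from the paper): the VAE penalizes the KL divergence between each per-sentence posterior and the prior, whereas the WAE penalizes a divergence between the aggregated posterior and the prior.

% Minimal sketch of the two objectives; notation is assumed, not the paper's.
% VAE: reconstruction term plus a per-example KL penalty.
\[
  \mathcal{L}_{\mathrm{VAE}}
    = \mathbb{E}_{x}\Big[ \mathbb{E}_{q_\phi(z \mid x)}\big[ -\log p_\theta(x \mid z) \big]
      + \mathrm{KL}\big( q_\phi(z \mid x) \,\|\, p(z) \big) \Big]
\]
% WAE: reconstruction cost c plus a penalty D (e.g., MMD) between the
% aggregated posterior q_phi(z) = E_x[ q_phi(z | x) ] and the prior p(z).
\[
  \mathcal{L}_{\mathrm{WAE}}
    = \mathbb{E}_{x}\, \mathbb{E}_{q_\phi(z \mid x)}\big[ c\big( x, g_\theta(z) \big) \big]
    + \lambda \, D\big( q_\phi(z),\, p(z) \big)
\]

Because the WAE penalty constrains only the aggregated posterior, nothing in this objective keeps each individual q_phi(z | x) from shrinking toward zero variance, which is consistent with the paper's observation that the stochastic encoder tends toward a Dirac-delta function and with its proposed remedy of explicitly encouraging encoder stochasticity.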

Citation

H. Bahuleyan, L. Mou, H. Zhou, O. Vechtomova. "Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation". NAACL Annual Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, USA, pp. 4068–4076, June 2019.

Keywords:  
Category: In Conference
Web Links: DOI, ACL

BibTeX

@inproceedings{Bahuleyan+al:NAACL19,
  author    = {Hareesh Bahuleyan and Lili Mou and Hao Zhou and Olga Vechtomova},
  title     = {Stochastic Wasserstein Autoencoder for Probabilistic Sentence
    Generation},
  booktitle = {NAACL Annual Conference of the North American Chapter of the
    Association for Computational Linguistics},
  pages     = {4068--4076},
  year      = {2019},
}

